Preference Learning in Recommender Systems

Marco de Gemmis, Leo Iaquinta, Pasquale Lops, Cataldo Musto, Fedelucio Narducci, and Giovanni Semeraro

Department of Computer Science, University of Bari “Aldo Moro”, Italy
{degemmis,iaquinta,lops,musto,narducci,semeraro}@di.uniba.it

Abstract. As shown by the continuous growth of the number of web sites which embody recommender systems as a way of personalizing the experience of users with their content, recommender systems represent one of the most popular applications of principles and techniques coming from Information Filtering (IF). As IF techniques usually perform a progressive removal of non-relevant content according to the information stored in a user profile, recommendation algorithms process information about user interests - acquired in an explicit (e.g., letting users express their opinion about items) or implicit (e.g., studying some behavioral features) way - and exploit these data to generate a list of recommended items. Although each type of filtering method has its own weaknesses and strengths, preference handling is one of the core issues in the design of every recommender system: since these systems aim to guide users in a personalized way to interesting or useful objects in a large space of possible options, it is important for them to accurately capture and model user preferences. The paper provides a general overview of the approaches to learning preference models in the context of recommender systems.

1 Introduction

How many times have you searched for something on the Web without being able to find what you were looking for? The existence of a large quantity of information, in combination with the dynamic and heterogeneous nature of the Web, makes retrieval a hard task for the average user, who is usually overwhelmed by the abundant amount of information.

In this context (usually referred to as the Information Overload problem), the role of user modeling and personalized information access is becoming crucial: although it is too soon to deeply understand the long-term effects of this surplus of information on our habits and daily life, it is clear that users need personalized support in sifting through large amounts of available information according to their interests and preferences. Information Filtering systems, like Recommender Systems, rely on this idea: they adapt their behavior to individual users by learning their tastes during the interaction, in order to construct a profile that can be later exploited to select relevant items. Nowadays these systems represent the main solution to the information overload problem, because they are able to gather and exploit
heterogeneous information about users, emerging as one of the most useful tools to achieve more intelligent information access. In the workflow of a typical recommendation process, learning user preferences is a primary step: capturing and modeling user interests effectively is key to personalization. Gathering user characteristics, acquired through an explicit (e.g., directly asking the user) or implicit process (e.g., observing the user behavior), can produce a user model to be exploited to enable adaptivity mechanisms during the interaction with an information system.

The problem of recommending items has been studied extensively, and two main paradigms have emerged. Content-based recommendation systems try to recommend items similar to those a given user has liked in the past, whereas systems designed according to the collaborative recommendation paradigm identify users whose preferences are similar to those of the given user and recommend items they have liked. Other noteworthy paradigms are also found in the literature: demographic recommenders, whose aim is to categorize the user starting from personal attributes and to make recommendations based on demographic classes; knowledge-based systems, which exploit knowledge about how a particular item meets a particular user need; and, finally, hybrid systems, which combine different recommendation techniques, trying to exploit their advantages while reducing their drawbacks.

Each of the above paradigms has particular methods to elicit user interests and preferences: most of them are related to the Machine Learning area (probabilistic models, Bayesian or neural networks, decision trees, association rules), but there are also some other techniques (so-called heuristics) which learn user profiles by exploiting preferences expressed by similar users (usually referred to as “neighbours”) or by processing textual contents describing the items liked.
The paper provides a general overview of the approaches to learning preference models in the context of recommender systems and is organized as follows. Section 2 introduces general concepts and terminology about recommender systems. Preference learning issues in the area of recommender systems are presented in Section 3, where we also introduce the feedback gathering problem and some machine learning techniques used to acquire and infer user preferences. Conclusions are drawn in the last section.

2 Basics of Recommender Systems

Nowadays it is very important for people to be supported in their decisions, due to the exponential increase of available information. Every day we get advice from other people: “Hey, check out this Web site”, “I saw this book, you will like it”, “That restaurant is very good!”. When making a choice in the absence of decisive first-hand knowledge, choosing as other like-minded people have chosen in the past may be a good strategy. Recommender systems have the same role as human recommendations: they present information that they perceive to be useful and worth trying out. These systems are used in several application domains to support users in making decisions, to help them in managing the exponential increase of information and, in general, to provide a more intelligent form of information access.

The creation and management of personalized recommendations require mainly three distinct and important components: a user profile, an algorithm to update the profile given usage/input information, and an adaptive tool that exploits the profile in order to provide personalization. First, the system needs to be able to store relevant information about users that will be used to infer their preferences and needs. Such information is stored in an individual user profile. Second, if the system has to adapt to the user over time, some mechanism is needed to keep the profile up-to-date. This could happen through explicit data input, implicit recording of user behavior as she interacts with the system, or a combination of the two. Third, the system needs some way to exploit the current profile data in making recommendations to the user. The types of information stored in the profile will depend on the goals of the system and the algorithms it employs in order to provide recommendations. Different approaches to recommendation will require different pieces of information about the user, thus the profile structure will differ from system to system.

In this section we provide an overview of the main recommendation approaches and their benefits and weaknesses.

2.1 Collaborative Recommender Systems

In Collaborative Filtering (CF) systems, recommendations are based on the evaluations of users who share similar interests. The idea behind these systems is that a set of users who liked the same items in the past probably share the same preferences. Thus, picking a user from this set, we can suggest to her all the unseen items which other users with similar tastes have liked in the past. Opinions on items can be expressed as explicit user ratings on some scale ranging from bad to good, or as implicit ratings given by logging user actions.
As an example of the latter, viewing or skipping items could be interpreted as positive and negative ratings, respectively. CF systems analyze the opinions of other users on items, thus they provide a liking degree based not on the nature of the item, but on human judgment.

The main advantage of collaborative methods is that items in different product categories can be recommended. Movies, images, art and text items are all represented by opinions of users and thus they can be recommended by the same system. In CF, a user profile simply consists of the data the user has specified. These data are compared to those of other users to find overlaps in interests among users. For example, the nearest neighbor approach, used in some collaborative recommender systems [20], represents the preferences by the items rated (or purchased) by the user. The profile is represented by the user-item matrix [22], where each cell (u, i) holds the rating given by user u to item i. The recommender algorithm performs three tasks: it finds similar users, creates the nearest neighbors set for each user, and infers the liking degree for an unseen item based on the behavior of the nearest neighbors.
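The three tasks of the nearest neighbor approach can be illustrated with a minimal sketch. It is not the algorithm of [20] or [22]: the toy ratings, the choice of Pearson correlation as the similarity measure, the restriction to positively correlated neighbors, and the similarity-weighted average used for prediction are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy user-item matrix: ratings[u][i] is the rating of user u on item i.
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
    "dave":  {"i1": 5, "i2": 3, "i3": 5, "i4": 5},
}


def pearson(a, b):
    """Task 1: similarity of two users, computed over their co-rated items."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def nearest_neighbors(user, k=2):
    """Task 2: the k most similar users, keeping only positive correlations."""
    sims = [(pearson(ratings[user], ratings[v]), v) for v in ratings if v != user]
    return sorted((s, v) for s, v in sims if s > 0)[::-1][:k]


def predict(user, item, k=2):
    """Task 3: similarity-weighted average of the neighbors' ratings on the item."""
    voters = [(s, v) for s, v in nearest_neighbors(user, k) if item in ratings[v]]
    if not voters:
        return None  # no neighbor has rated the item
    return sum(s * ratings[v][item] for s, v in voters) / sum(s for s, _ in voters)


print(nearest_neighbors("alice"))
print(predict("alice", "i4"))
```

In this toy data the predicted rating for the unseen item i4 falls between the ratings given by the two selected neighbors, since it is a weighted average of them. Real systems typically also normalize for each user's rating bias, which this sketch omits.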
Terveen and Hill [38] claim that three essentials are needed to support CF: many people must participate (increasing the likelihood that any one person will find other users with similar preferences), there must be an easy way to represent the user interests in the system, and the algorithms must be able to match people with similar interests. These three elements are not easy to achieve, and they give rise to the main shortcomings of CF systems. The main limitations of collaborative systems are the following [4, 18]:

– New user problem - In order to make accurate recommendations, the system must first learn the preferences of the user from her ratings.
– New item problem (early rater) - Until new items are rated by a substantial number of users, the recommender system is not able to recommend them.
– Sparsity problem - The number of ratings obtained is usually very small compared to the number of ratings to be predicted, and the success of the collaborative recommender system depends on the availability of a critical mass of users. One way to overcome the problem of rating sparsity is to use user profile information when calculating user similarity. That is, two users could be considered similar not only if they similarly rated the same items, but also if they belong to the same demographic segment. For example, Pazzani uses gender, age, area code, education, and employment information of users in the restaurant recommendation application [25].
– Grey sheep problem (unusual user) - In a small or even medium community of users, there are individuals who would not benefit from pure CF systems because their opinions do not consistently agree or disagree with any group of people. These individuals will rarely, if ever, receive accurate predictions, even after the initial start-up phase for the user and the system [11]. The majority of users fall into the class of the so-called “white sheep”, those who have high correlation with many other users and for whom it will therefore, in theory, be easy to find recommendations. The opposite type of people are the “black sheep”, those who correlate with few or no other users. This makes it very difficult to make recommendations for them. On the positive side, for statistical reasons, as the number of users of a system increases, the chance of finding other people with similar tastes increases and so better recommendations can be provided.
– Scalability problem - CF systems require data from a large number of users before being effective, as well as a large amount of data from each user. Therefore, the computational resources required to find users with similar tastes become a critical issue.
– Lack of transparency problem - Collaborative systems today are black boxes, computerized oracles which give advice but cannot be questioned. A user is given no indicators to consult in order to decide when to trust a recommendation and when to doubt one. These problems have prevented acceptance of collaborative systems in all but low-risk content domains, since they are untrustworthy for high-risk content domains.
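The remedy for sparsity mentioned in the list above, namely taking demographic information into account when computing user similarity, might look like the following sketch. It is not the method of [25]: the field names, the rating-agreement measure, the 1-5 rating scale and the blending weight `alpha` are hypothetical choices made only to show the idea.

```python
# Hypothetical demographic fields, loosely following the kinds of attributes listed for [25].
DEMOGRAPHIC_FIELDS = ("gender", "age_group", "area_code", "education", "employment")


def demographic_similarity(p, q, fields=DEMOGRAPHIC_FIELDS):
    """Fraction of the fields known for both users on which they agree."""
    known = [f for f in fields if f in p and f in q]
    if not known:
        return 0.0
    return sum(p[f] == q[f] for f in known) / len(known)


def rating_similarity(a, b, scale=(1, 5)):
    """Agreement on co-rated items in [0, 1]; None when the users share no item."""
    common = set(a) & set(b)
    if not common:
        return None
    mean_abs_diff = sum(abs(a[i] - b[i]) for i in common) / len(common)
    return 1.0 - mean_abs_diff / (scale[1] - scale[0])


def hybrid_similarity(ratings_a, ratings_b, demo_a, demo_b, alpha=0.7):
    """Blend both signals; rely on demographics alone when ratings do not overlap."""
    demo = demographic_similarity(demo_a, demo_b)
    rate = rating_similarity(ratings_a, ratings_b)
    if rate is None:
        return demo
    return alpha * rate + (1 - alpha) * demo


demo_u = {"gender": "f", "age_group": "25-34", "education": "msc"}
demo_v = {"gender": "f", "age_group": "35-44", "education": "msc"}
# No co-rated items: a pure rating-based measure would have nothing to compare.
print(hybrid_similarity({"i1": 5}, {"i2": 4}, demo_u, demo_v))
```

The point of the fallback branch is that two users with no co-rated items still obtain a non-trivial similarity, which is exactly the situation that sparse rating matrices produce.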
2.2 Content-based Recommender Systems

Unlike CF systems, where the opinions of other users are a key element for learning user preferences and finding items to suggest, in content-based (CB) recommenders the ratings expressed by a single user have no role in recommendations provided to other users. The core of this approach is the processing of the contents describing the items to be recommended. The items can be very different depending on the number and type of attributes used to describe them. Each item can be described by the same small number of attributes with a known set of values, but this is not appropriate for items, such as Web pages, news or documents, described by means of unstructured text. In this case there are no attributes with well-defined values and the use of document modeling techniques with roots in Information Retrieval [30, 3] and Information Filtering [5] research is desirable.

A method to represent unstructured data is the Vector Space Model (VSM). The VSM [34] is a spatial representation of text documents. In this model, each document is represented by a vector in an n-dimensional space, where each dimension corresponds to a term from the overall vocabulary of a given document collection. Formally, every document is represented as a vector of term weights, where each weight indicates the degree of association between the document and the term. The CB approach can be applied only in the domains where we can provide some textual metadata describing the items.

A CB recommender learns a profile of the user interests based on some features of the objects the user rated. Afterwards the system exploits the user profile to suggest relevant items by matching the profile representation against that of the items to be recommended. The result of this matching is a binary or continuous relevance judgment, the latter case resulting in a ranked list of potentially interesting items.
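As a rough illustration of profile learning and matching in the VSM, the sketch below ranks items by their similarity to a user profile. Several choices are assumptions rather than anything prescribed by the paper: raw term frequency as the term weight, whitespace tokenization, a profile built by summing the vectors of liked items, and cosine similarity as the matching function (one common option).

```python
from collections import Counter
from math import sqrt


def tf_vector(text):
    """Sparse term-weight vector of a document, using raw term frequency."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def build_profile(liked_texts):
    """Prototype vector of the user: sum of the vectors of the items she liked."""
    profile = Counter()
    for text in liked_texts:
        profile.update(tf_vector(text))
    return profile


def rank(profile, items):
    """Continuous relevance judgment: items sorted by decreasing similarity."""
    scored = [(cosine(profile, tf_vector(text)), name) for name, text in items.items()]
    return sorted(scored, reverse=True)


# Hypothetical item descriptions.
profile = build_profile(["space opera with alien empires", "alien invasion thriller"])
candidates = {
    "a": "alien empires at war",
    "b": "romantic comedy in paris",
}
for score, name in rank(profile, candidates):
    print(name, round(score, 3))
```

Item "a" shares terms with the profile and is ranked first, while item "b" shares none and scores zero. A production system would normally replace raw frequencies with a weighting scheme such as TF-IDF and apply proper tokenization and stop-word removal.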
If data are represented by the VSM, the matching might be realized by computing the cosine similarity between the prototype vector and the item vectors. Many systems ask users for feedback on the recommended items so that the matching can be performed according to the relevance feedback.

The CB paradigm has several advantages when compared to the CF one:

– User independence - CB recommenders exploit solely ratings provided by the active user to build her own profile.
– Transparency - Explanations of recommendations can be provided by listing content features or descriptions that caused an item to be recommended.
– New item - CB recommenders are capable of recommending items not yet rated by any user.

On the other hand, CB systems have several shortcomings:

– Limited content analysis - CB techniques are limited by the features that are associated either automatically or manually with the items. No CB system can provide good suggestions if the content does not contain enough information to distinguish items the user likes from items the user does not like. Some representations capture only certain aspects of the content, but there are many others that would influence a user’s experience. For instance,
there often is not enough information in the word frequency to model the user interests in jokes or poems, for which techniques from affective computing would be more appropriate. Again, for Web pages, feature extraction by using techniques for text representation completely ignores aesthetic qualities and multimedia information.
– Over-specialization - CB recommenders have no inherent method for finding something unexpected. The system recommends only items scoring highly against the user profile, i.e. items similar to those already rated. This drawback is also called the serendipity problem.
– New user - Enough ratings have to be collected before a CB system can really understand user preferences and provide accurate recommendations. Therefore, when few ratings are available, such as for a new user, the system is not able to provide reliable recommendations.

2.3 Other Approaches

– Demographic Recommender Systems
These systems aim to categorize the user starting from personal attributes and to make recommendations based on demographic classes. Grundy [32], for example, recommends books by gathering personal information through an interactive dialogue and matching the user's responses against a library of manually assembled user stereotypes. Pazzani [25] uses machine learning techniques to obtain a classifier based on demographic data. The representation of demographic information in a user model can vary greatly. The Grundy system uses hand-crafted attributes with numeric confidence values, while Pazzani extracts features from users’ home pages. The benefit of a demographic approach is that it may not require a history of user ratings of the type needed by collaborative and content-based techniques. However, to the best of our knowledge, there are not many recommender systems using demographic data because this form of information is difficult to collect: until some years ago, indeed, users were reluctant to share a large amount of personal information with a system. Nowadays, with the exponential growth of social networks and the continuous expansion of Web 2.0 platforms, the situation has changed towards a more open perspective, with users more willing to share information. Despite this, demographic approaches are still less successful than others.
– Knowledge-based Recommender Systems
These systems use a knowledge-based (KB) approach to generate recommendations. All recommendation techniques make some kind of inference. KB approaches are distinguished in that they have functional knowledge: they have knowledge about how a particular item meets a particular user need, and can therefore reason about the relationship between a need and a possible recommendation [9]. The user profile can be any knowledge structure that supports this inference. In the simplest case, as in Google, it may simply be the query that the user has formulated. In others, it may be a more detailed representation of the user needs [39].
there often is not enough information in the word frequency to model the user interests in jokes or poems, while techniques for affective computing would be more appropriate. Again, for Web pages, feature extraction by using techniques for text representation completely ignores aesthetic qualities and multimedia information.
– Over-specialization - CB recommenders have no inherent method for finding something unexpected. The system recommends only items scoring highly against the user profile, i.e. items similar to those already rated. This drawback is also called the serendipity problem.
– New user - Enough ratings have to be collected before a CB system can really understand user preferences and provide accurate recommendations. Therefore, when few ratings are available, such as for a new user, the system would not be able to provide reliable recommendations.

2.3 Other Approaches

– Demographic Recommender Systems
These systems aim to categorize the user starting from personal attributes, making recommendations based on demographic classes. Grundy [32], for example, recommends books by gathering personal information through an interactive dialogue and matching users’ responses against a library of manually assembled user stereotypes. Pazzani [25] uses machine learning techniques to obtain a classifier based on demographic data. The representation of demographic information in a user model can vary greatly. The Grundy system uses hand-crafted attributes with numeric confidence values, while Pazzani extracts features from users’ home pages. The benefit of a demographic approach is that it may not require a history of user ratings of the type needed by collaborative and content-based techniques. However, to the best of our knowledge, there are not many recommender systems using demographic data because this form of information is difficult to collect: until some years ago, indeed, users were reluctant to share a large amount of personal information with a system. Nowadays, with the exponential growth of social networks and the continuous expansion of Web 2.0 platforms, the situation has changed towards a more open perspective, with users more willing to share information. Despite this, demographic approaches still enjoy less success than others.
– Knowledge-based Recommender Systems
These systems use a knowledge-based (KB) approach to generate recommendations. All recommendation techniques make some kind of inference. KB approaches are distinguished in that they have functional knowledge: they have knowledge about how a particular item meets a particular user need, and can therefore reason about the relationship between a need and a possible recommendation [9]. The user profile can be any knowledge structure that supports this inference. In the simplest case, as in Google, it may simply be the query that the user has formulated. In others, it may be a more detailed representation of the user needs [39].
A particular kind of KB system implements case-based reasoning (CBR). Such a recommender solves a new problem by looking up a similar, previously solved one. In [21], four main steps of a CBR recommender are identified: retrieve, reuse, adaptation, and retain. The first step looks in the knowledge base for a case similar to the new problem; the retrieved solution is then reused (with some adaptation, if necessary). Finally, the new adapted case is stored in the case library. In this system there is no user preference elicitation, because the main task of the recommendation algorithm is to retrieve the case most similar to the problem to solve. KB systems do not have a ramp-up problem (the “early rater” problem and the “sparse ratings” problem), since their recommendations do not depend on a base of user ratings. Therefore, the KB approach is complementary to the others [8].
– Hybrid Recommender Systems
They combine two or more recommender algorithms (the most frequent approach is to combine CF and CB) in order to emphasize their strengths and to level out their corresponding weaknesses. Robin Burke proposed a very analytical classification of hybrid systems [9], listing a number of hybridization methods to combine pairs of recommender algorithms.
• Weighted - The score (or votes) of a recommended item is computed from the results of all of the available recommendation techniques present in the system. The simplest combined hybrid would be a linear combination of recommendation scores.
• Switching - A switching hybrid uses some criterion to switch between recommendation techniques. Switching hybrids introduce additional complexity into the recommendation process, since the switching criteria must be determined, and this introduces another level of parameterization.
• Mixed - Recommendations from several different recommenders are presented at the same time. This may be possible where it is practical to make a large number of recommendations simultaneously.
• Feature Combination - Features from different recommendation sources are thrown together into a single recommendation algorithm. For example, CF and CB techniques might be merged by treating collaborative information as simply additional feature data associated with each example and using content-based techniques over this augmented data set.
• Cascade - The cascade hybrid involves a staged process because one recommender refines the recommendations given by another one.
• Feature Augmentation - Output from one technique is used as an input feature to another. This means that one technique is employed to produce a rating or classification of an item, and that information is then incorporated into the processing of the next recommendation technique.
• Meta-level - The model learned by one recommender is used as input to another. This differs from feature augmentation: in an augmentation hybrid, we use a learned model to generate features for input to a second algorithm; in a meta-level hybrid, the entire model becomes the input.
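As an illustration of the first two hybridization methods, the following minimal Python sketch shows a weighted hybrid (a normalized linear combination of scores) and a switching hybrid. It is not taken from Burke's work: the function names, the toy scorers `cf_score` and `cb_score`, and the assumption that every component returns a score in [0, 1] are hypothetical choices made only for this example.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps (user_id, item_id) to a relevance score, assumed to lie in [0, 1].
Scorer = Callable[[str, str], float]


def weighted_hybrid(components: Sequence[Tuple[Scorer, float]],
                    user: str, item: str) -> float:
    """Weighted hybrid: normalized linear combination of component scores."""
    total_weight = sum(weight for _, weight in components)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = sum(weight * scorer(user, item) for scorer, weight in components)
    return combined / total_weight


def switching_hybrid(primary: Scorer, fallback: Scorer,
                     use_primary: Callable[[str], bool],
                     user: str, item: str) -> float:
    """Switching hybrid: a criterion decides which recommender is consulted."""
    chosen = primary if use_primary(user) else fallback
    return chosen(user, item)


# Toy stand-ins for a collaborative and a content-based recommender (hypothetical).
def cf_score(user: str, item: str) -> float:
    return 0.8


def cb_score(user: str, item: str) -> float:
    return 0.4


if __name__ == "__main__":
    print(weighted_hybrid([(cf_score, 0.75), (cb_score, 0.25)], "u1", "i1"))
    # A new user without ratings could be routed to the content-based scorer.
    print(switching_hybrid(cf_score, cb_score, lambda u: False, "u1", "i1"))
```

The weights and the switching criterion are exactly the additional parameters mentioned above; in a real system they would have to be tuned or learned.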
3 Learning User Preferences in Recommender Systems

As stated by [7], a preference is an ordering relation between two or more items to characterize which, among a set of possible choices, is the one that best fits user tastes. Preferences are something able to guide our choices, discriminating items we like from those we don’t like (or like the least). In other terms, learning user preferences is a way to find the solution of a search (or optimization, in some cases) problem whose space of possible solutions is represented by the set of the items the user can enjoy (namely, in recommender systems, the set of items that can be recommended). Although the semantics of the concept of preference is pretty clear, acquiring user preferences and working with them is a more difficult task. Indeed, the complexity of the problem of preference learning is strictly related to the number of dimensions used to represent the set of possible choices. So, in order to generate a user profile, we need to gather user feedback to capture information about user preferences and model them using a specific representation. Next, this information can be processed (e.g. through Machine Learning-related approaches) in order to learn user profiles to be exploited in the recommendation process.

3.1 Feedback Gathering

Information filtering and information retrieval systems rely on relevance feedback (RF) to capture an appropriate snapshot of user information needs, allowing the user to directly express her notion of relevance with respect to individual documents [5]. RF has been employed in several classes of personalization systems. Driven by the need for better representation of information needs, RF was initially introduced to support basic query expansion [33]. However, its success in inferring the user’s notion of relevance on a per-document basis has led to a subsequent adoption by information filtering and recommendation systems.
RF approaches are based on a feedback gathering scheme, either explicit or implicit. In the former, object ratings on a predefined scale are provided explicitly by users, while implicit feedback gathering techniques infer object relevance in a transparent fashion, by monitoring user interaction with the system.

Explicit Ratings. The use of explicit ratings is common in everyday life, ranging from grading students’ work to assessing competing consumer goods (see Alton-Scheidl et al. [2] for a review). Although some forms of rating are made in free text form (e.g. book reviews), it is frequently the case that ratings are made on an agreed discrete scale (e.g. star ratings for restaurants, marks out of ten for films, etc.). Ratings made on these scales allow these judgments to be processed statistically to provide averages, ranges, distributions, etc. A central feature of explicit ratings is that the evaluator has to examine an item and assign it a value on the rating scale. This imposes a cognitive cost on the evaluator to assess the performance of an object [24]. Indeed, the act of rating alters the user behavior
from her normal interaction pattern and, consequently, even less noticeable explicit feedback approaches are considered expensive. Since the results may not become immediately apparent, users tend to skip the rating task [15].
Also, explicit RF techniques disregard user knowledge on the current topic. Users are often unclear about their search interests. They browse for more information to clarify their need and reformulate their query accordingly. The uncertainty in their search episodes increases the cognitive load during explicit RF, as users must decide on the relevance of an item with a lack of confidence.
Finally, the use of explicit ratings raises privacy issues that have to be resolved [16]. Irrespective of the underlying reason, users are not always comfortable in providing direct indications of their interests. Due to the obtrusive nature of explicit ratings, not many users are willing to provide them. Hence, the performance of profile capturing and recommendation algorithms of such systems degrades, due to the dearth of ratings. In CF systems based on explicit feedback gathering policies, the sparsity of RF judgments can often render such systems unusable, since there are few previous assessments to learn from.
Explicit RF can also rely on critiquing examples. For instance, SmartClient [27] allows users to plan travel arrangements. Users are required to critique examples of possible solutions, for instance, “the arrival time of this flight leg is too late.” The interaction is cyclical: (1) the system provides example solutions, (2) the user examines any of them and may state a critique on any aspect of it, (3) the critique becomes an additional preference in the model, and (4) the system refines the solution set. Ricci and Nguyen [31] propose a similar critiquing interaction to provide recommendations of restaurants in a mobile context.
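The four-step critiquing cycle can be sketched as follows. This is a minimal illustration and not the SmartClient implementation: the `CritiquingSession` class, the representation of a critique as a boolean constraint over an option, and the toy flight data are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Option = Dict[str, float]            # e.g. {"id": 1, "arrival_hour": 23, "price": 120}
Critique = Callable[[Option], bool]  # returns True if the option satisfies the critique


@dataclass
class CritiquingSession:
    """Hypothetical session object holding the options and the preference model."""
    options: List[Option]
    preferences: List[Critique] = field(default_factory=list)

    def examples(self, k: int = 3) -> List[Option]:
        # Steps (1) and (4): show up to k options consistent with all critiques so far.
        valid = [o for o in self.options if all(p(o) for p in self.preferences)]
        return valid[:k]

    def add_critique(self, critique: Critique) -> None:
        # Step (3): the critique becomes an additional preference in the model.
        self.preferences.append(critique)


if __name__ == "__main__":
    flights = [
        {"id": 1, "arrival_hour": 23, "price": 120},
        {"id": 2, "arrival_hour": 18, "price": 150},
        {"id": 3, "arrival_hour": 15, "price": 200},
    ]
    session = CritiquingSession(flights)
    print(session.examples())  # (1) the system provides example solutions
    # (2) the user states: "the arrival time of this flight leg is too late"
    session.add_critique(lambda o: o["arrival_hour"] < 20)
    print(session.examples())  # (4) the refined solution set
```

Treating each critique as a hard constraint is the simplest possible choice; a system could equally treat critiques as soft preferences that re-rank the options instead of filtering them.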
As discussed in Pu and Chen [26], the motivation for this methodology is that people usually cannot state preferences in advance but construct their preferences as they see the available options. However, because the critiques come from the user in response to the shown examples, the current solutions can hinder the user from refocusing the search in another direction (the anchoring effect). A complete preference model can be acquired only if the system is able to stimulate the user by showing diverse examples.

Implicit Ratings. Implicit RF gathering techniques are proposed as an unobtrusive alternative or supplement to explicit ratings in order to state an (indirect) assessment of the usefulness of any individual item. Such techniques passively monitor user interactions with the system in order to estimate user interests [23]. Click-throughs, time spent viewing a document and mouse gestures are among the possible sources of implicit feedback [17]. The main benefits of implicit feedback over explicit ratings are that it removes the cognitive cost of providing relevance judgments explicitly and that it can be gathered in large quantities and aggregated to infer item relevance. Since implicit judgments are derived transparently, they contain less indicative value than explicit ratings. Although the accuracy of implicit approaches has been questioned [24], recent studies have shown that they can be effectively adopted to state relevance feedback [40].
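To make the idea of aggregating implicit signals concrete, the sketch below accumulates click-through and viewing-time events into a per-item interest score. It does not reproduce any method from the cited studies: the event format, the signal weights and the cap on viewing time are arbitrary assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# An observed event: (item_id, signal_name, value). The format is hypothetical.
Event = Tuple[str, str, float]

# Illustrative weights, not taken from the literature.
CLICK_WEIGHT = 1.0
DWELL_WEIGHT_PER_SECOND = 0.02
MAX_DWELL_SECONDS = 120.0  # cap so that an idle, forgotten page does not dominate


def implicit_scores(events: Iterable[Event]) -> Dict[str, float]:
    """Aggregate passively observed events into one interest score per item."""
    scores: Dict[str, float] = defaultdict(float)
    for item_id, signal, value in events:
        if signal == "click":
            scores[item_id] += CLICK_WEIGHT * value
        elif signal == "dwell_seconds":
            scores[item_id] += DWELL_WEIGHT_PER_SECOND * min(value, MAX_DWELL_SECONDS)
        # Unknown signals are ignored in this sketch.
    return dict(scores)


if __name__ == "__main__":
    log = [
        ("a", "click", 1), ("a", "dwell_seconds", 50),
        ("b", "click", 1), ("a", "dwell_seconds", 500),
    ]
    print(implicit_scores(log))
```

Such scores are noisier than explicit ratings, which is why the sketch only sums many weak signals rather than interpreting any single event as a judgment.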
There are several types of feedback that can be implicitly captured. Nichols [24] presented a list of potential types of user behaviors that could be exploited as sources for implicit feedback. Kelly & Teevan [17] extended a classification of observable feedback behaviors according to two axes, Behavior Category¹ and Minimum Scope², to categorize actions that can be observed during user information seeking episodes. Their work has also focused on classifying existing scientific literature on implicit feedback according to Behavior Category and Minimum Scope. Unsurprisingly, many of the analyzed works concern examination with object scope, i.e. click-through or scrolling measures are largely investigated and exhibit a strong positive correlation with explicit ratings. Such data can be easily captured in real time at no considerable computational cost, while user behaviors that fall in the “Reference”, “Annotate” and “Create” categories require more precise control over individual services and applications and are thus hard to capture; their benefits for estimating user interests are not fully clear.

¹ The Behavior Category (Examine, Retain, Reference, Annotate and Create) refers to the underlying purpose of the observed behavior.
² Minimum Scope (Segment, Object and Class) refers to the smallest possible scope of the item being acted upon.

3.2 Modeling User Preferences

Feedback gathering techniques make it possible to collect information about user tastes and interests. However, before this information can be exploited as input to learn preference models, these data need to be modeled following a specific representation. Techniques for modeling information (usually referred to as items in recommender systems) can be split depending on the kind of data which will be stored in the user profile. If we have to handle unstructured data (the ones usually exploited by CB recommenders), it is necessary to process them through Information Retrieval-related techniques (such as stemming, lemmatization, indexing, and so on), which let us shift from a textual source to a structured one. Structured data, like generic ratings or well-defined attribute-value pairs (e.g. demographic data), can instead be represented through a matrix, as usually happens in CF systems. In both cases all the information provided by the user, regardless of its nature, can also be represented in a more complex way (semantic or neural networks, probabilistic models, etc.) so that it can be used as input for learning user profiles. In the next section we will survey several Machine Learning techniques for learning user profiles in different recommender systems.

3.3 Techniques for Learning User Profiles

According to [1], recommendation techniques can be grouped into two general classes: model-based and memory/heuristic-based. The same classification can be made for techniques for learning user profiles: offline learning techniques (used in model-based recommender systems) and online learning techniques (used in