Integrating user modeling approaches into a framework for recommender agents

Daniela Godoy, Silvia Schiaffino and Analía Amandi
Facultad de Ciencias Exactas, ISISTAN Research Institute, UNCPBA, Tandil, Argentina and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina

Abstract

Purpose – Recommender agents are used to make recommendations of interesting items in a wide variety of application domains, such as web page recommendation, music, e-commerce, movie recommendation, tourism, restaurant recommendation, among others. Despite the various and different domains in which recommender agents are used and the variety of approaches they use to represent user interests and make recommendations, there is some functionality that is common to all of them, such as user model management and recommendation of interesting items. This paper aims at generalizing these common behaviors into a framework that enables developers to reuse recommender agents' main characteristics in their own developments.

Design/methodology/approach – This work presents a framework for recommendation that provides the control structures, the data structures and a set of algorithms and metrics for different recommendation methods. The proposed framework acts as the base design for recommender agents or applications that want to add the already modeled and implemented capabilities to their own functionality. In contrast with other proposals, this framework is designed to enable the integration of diverse user models, such as demographic, content-based and item-based. In addition to the different implementations provided for these components, new algorithms and user model representations can be easily added to the proposed approach. Thus, personal agents originally designed to assist a single user can reuse the behavior implemented in the framework to expand their recommendation strategies.

Findings – The paper describes three different recommender agents built by materializing the proposed framework: a movie recommender agent, a tourism recommender agent, and a web page recommender agent. Each agent uses a different recommendation approach. PersonalSearcher, an agent originally designed to suggest interesting web pages to a user, was extended to collaboratively assist a group of users using content-based algorithms. MovieRecommender recommends interesting movies using an item-based approach and Traveller suggests holiday packages using demographic user models. Findings encountered during the development of these agents and their empirical evaluation are described here.

Originality/value – The advantages of the proposed framework are twofold. On the one hand, the functionality provided by the framework enables the development of recommender agents without the need for implementing its whole set of capabilities from scratch. The main processes and data structures of recommender agents are already implemented. On the other hand, already existing agents can be enhanced by incorporating the functionality provided by the recommendation framework in order to act collaboratively.

Keywords Modelling, Worldwide web, Internet

Paper type Research paper

The current issue and full text archive of this journal is available at www.emeraldinsight.com/1066-2243.htm

Received 10 February 2009
Revised 7 November 2009, 9 November 2009
Accepted 14 November 2009

Internet Research, Vol. 20 No. 1, 2010, pp. 29-54
© Emerald Group Publishing Limited 1066-2243
DOI 10.1108/10662241011020824

1. Introduction

Recommender agents are used to make recommendations of interesting items in a wide variety of application domains, such as web page recommendation (Lieberman et al., 2001), music (Yoshii et al., 2008; Yapriady and Uitdenbogerd, 2005), e-commerce (Schafer et al., 2000), movie recommendation (Miller et al., 2003), tourism (Srisuwan and Srivihok, 2008) and restaurant recommendation (Burke et al., 1996), among others. The user modeling approaches used by these agents differ not only because of domain-dependent characteristics, but also because of the recommendation strategy they adopt.

Although there have been some attempts to abstract the common behavior of recommender systems into general frameworks, these approaches fail to integrate multiple types of user models. For example, the framework Collaborative Filtering Engine[1] (CoFE) (Herlocker et al., 1999) provides algorithms for collaborative filtering. However, it only supports user preferences represented by a list of rated items. The framework is not designed to support collaboration based on content-based and demographic user models. Similarly, Taste[2] is a collaborative filtering engine for Java, which takes users' preferences for items ("tastes") and returns estimated preferences for other items. Taste supports both memory-based and item-based recommender systems, but it does not currently support model-based recommenders. Pazzani (1999) presents a theoretical framework to integrate the three types of user models rather than a software design of reusable components for recommender agents.

In this work we present a framework for recommendation that provides the control structures, the data structures and a set of algorithms and metrics for different recommendation methods. The proposed framework acts as the base design for recommender agents or applications that want to add the already modeled and implemented capabilities to their own functionality.
In contrast to other proposals, this framework is designed to enable the integration of diverse user models, such as demographic, content-based and item-based models. Thus, personal agents originally designed to assist a single user can reuse the behavior implemented in the framework to expand their recommendation strategies.

We describe three recommender agents that have been built and enhanced by reusing the functionality implemented in the framework as examples of its instantiation. Each agent uses a different recommendation approach. PersonalSearcher (Godoy and Amandi, 2000), an agent originally designed to suggest interesting Web pages to a user, was extended to collaboratively assist a group of users using content-based algorithms. MovieRecommender recommends interesting movies using an item-based approach and Traveller suggests holiday packages using demographic user models.

The article is organized as follows. Section 2 describes the different recommendation approaches we can find in the literature, mainly from the user modeling point of view. Section 3 presents our proposed framework, describing its main characteristics. Section 4 describes three recommender agents that were built using the proposed framework. Finally, section 6 presents our conclusions and future work.

2. User modeling in recommender agents

A variety of approaches have been used by agents to perform recommendations, including content-based, collaborative, demographic, knowledge-based and others (Montaner et al., 2003; Adomavicius and Tuzhilin, 2005). To improve performance,
these methods have sometimes been combined in hybrid recommenders (Yoshii et al., 2008). In spite of their common goal, these approaches differ in the way they represent user interests or preferences in user models. Figure 1 shows the integration of these user models into the framework for collaborative recommender agents.

Figure 1. Integration of diverse user models into the framework for collaborative recommendation

The content-based approach is based on the intuition that each user exhibits a particular behavior under a given set of circumstances, and that this behavior is repeated under similar circumstances (Zukerman and Albrecht, 2001). A content-based recommender learns a model of the user's interests based on the features present in items the user rated as interesting, either by implicit or explicit feedback. Thus, a user model contains those features that characterize a user's interests, enabling agents to categorize items for recommendation based on the features they exhibit. For example, text recommendation in agents like NewsDude (Billsus and Pazzani, 1999) or Letizia (Lieberman et al., 2001) uses the words appearing in documents as features. The user models derived by content-based recommenders depend on the learning methods employed. In existing agents, user models range from a simple set of words weighted according to their importance in describing the user's interests, or the output of a particular learning algorithm such as a decision tree or a probabilistic network, to more sophisticated models that keep track of both long-term and short-term interests.

In contrast with the content-based approach, in which the behavior of users is predicted from their own past behavior, collaborative filtering (CF) is based on the intuition that people within a particular group tend to behave alike under similar circumstances. In the collaborative filtering approach the behavior of a user is predicted from the behavior of other like-minded people (Zukerman and Albrecht, 2001).
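
Returning briefly to the simplest content-based user model mentioned above, a set of weighted words, the following Java fragment is a minimal sketch of how such a profile could be kept and used to score an item. The class and method names (TermProfile, learn, score) are hypothetical, and the use of cosine similarity is an assumption made for illustration; the source does not state which measure the framework uses.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical content-based user model: a bag of weighted terms. */
final class TermProfile {
    private final Map<String, Double> weights = new HashMap<>();

    /**
     * Updates the profile from one item. A positive feedback value reinforces
     * the item's terms, a negative one weakens them (illustrative rule only).
     */
    void learn(Map<String, Double> itemTerms, double feedback) {
        for (Map.Entry<String, Double> e : itemTerms.entrySet()) {
            weights.merge(e.getKey(), feedback * e.getValue(), Double::sum);
        }
    }

    /** Cosine similarity between profile and item; 0 if either vector is empty. */
    double score(Map<String, Double> itemTerms) {
        double dot = 0.0, normProfile = 0.0, normItem = 0.0;
        for (double w : weights.values()) {
            normProfile += w * w;
        }
        for (Map.Entry<String, Double> e : itemTerms.entrySet()) {
            normItem += e.getValue() * e.getValue();
            dot += e.getValue() * weights.getOrDefault(e.getKey(), 0.0);
        }
        if (normProfile == 0.0 || normItem == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normProfile) * Math.sqrt(normItem));
    }
}
```

An item whose terms coincide with the learned profile scores close to 1, while an item sharing no terms with it scores 0, so the score can be used to rank candidate items for a single user.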

In a collaborative filtering system, there is a database of $m$ users $U = \{u_1, u_2, \ldots, u_m\}$, $n$ items $I = \{i_1, i_2, \ldots, i_n\}$ and a mapping between user-item pairs. This mapping is represented as an $m \times n$ matrix $M$. In the pure CF approach the matrix $M$ usually represents ratings of items given either explicitly or implicitly by users, so the entry $M_{rj}$ represents the rating of user $u_r$ on item $i_j$. Thus, the preferences of users are explicitly stated by the matrix $M$, and a user model in this approach comprises a vector of item ratings, with the ratings being binary or real-valued. The aim of collaborative filtering for the active user $u_r$ is to predict the score for an item $i_j$ which has not yet been rated by $u_r$, in order to decide whether to recommend this item. By comparing the ratings of the active user to those of other users using some similarity measure, the system determines the users who are most similar to the active one, and makes predictions or recommendations based on items that similar users have previously rated highly.

It is possible to identify two major classes of collaborative filtering, memory-based and model-based (Sarwar et al., 2001). Memory-based collaborative filtering uses nearest-neighbor algorithms that determine a set of neighboring users who have rated items similarly, and combine the neighbors' preferences to obtain a prediction for the active user. Model-based collaborative recommenders do not use the user-item matrix directly to make recommendations; instead, they generalize a model of user ratings using some machine learning approach and use this model to make predictions. Memory-based prediction is the most popular technique in CF applications since it is more efficient for medium-size matrices; examples include Resnick et al. (1994), Shardanand and Maes (1995) and Terveen et al. (1997). However, if the user-item matrix is large the nearest-neighbor computation becomes expensive.
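
The memory-based scheme just described can be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the framework's actual code: the class name MemoryBasedCF is hypothetical, missing ratings are encoded as NaN, similarity is Pearson correlation, and the prediction is the active user's mean rating plus the similarity-weighted deviations of the other users from their own means, a common formulation in the memory-based CF literature. As a simplification, means are taken over all of a user's ratings rather than over co-rated items only, and every user who rated the item is treated as a neighbor.

```java
/** Hypothetical memory-based CF over an m x n ratings matrix; NaN marks "not rated". */
final class MemoryBasedCF {

    /** Mean of the ratings a user has actually given (0 if none). */
    static double mean(double[] ratings) {
        double sum = 0.0;
        int count = 0;
        for (double r : ratings) {
            if (!Double.isNaN(r)) {
                sum += r;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    /** Pearson correlation over co-rated items; 0 when it is undefined. */
    static double pearson(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b);
        double num = 0.0, denA = 0.0, denB = 0.0;
        for (int j = 0; j < a.length; j++) {
            if (Double.isNaN(a[j]) || Double.isNaN(b[j])) {
                continue; // skip items not rated by both users
            }
            double da = a[j] - meanA, db = b[j] - meanB;
            num += da * db;
            denA += da * da;
            denB += db * db;
        }
        return (denA == 0.0 || denB == 0.0) ? 0.0 : num / Math.sqrt(denA * denB);
    }

    /** Predicts M[r][j]: active user's mean plus weighted deviations of the other raters. */
    static double predict(double[][] m, int r, int j) {
        double base = mean(m[r]);
        double num = 0.0, den = 0.0;
        for (int u = 0; u < m.length; u++) {
            if (u == r || Double.isNaN(m[u][j])) {
                continue; // only other users who rated item j contribute
            }
            double sim = pearson(m[r], m[u]);
            num += sim * (m[u][j] - mean(m[u]));
            den += Math.abs(sim);
        }
        return den == 0.0 ? base : base + num / den;
    }
}
```

Because every prediction scans all other users, the cost grows with the number of rows of the matrix, which is the scalability concern raised in the text.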
For large user-item matrices, model-based recommenders such as those of Basu et al. (1998), Zhang and Iyengar (2002), Heckerman et al. (2000), Lee (2001) and Lin et al. (2002) are therefore a suitable alternative.

Demographic recommenders aim at categorizing users, based on their personal attributes, as belonging to stereotypical classes. Instead of applying learning techniques for acquiring user models, these agents are based on stereotype reasoning (Kobsa et al., 2001). In this case, a user model is a list of demographic features that represent a class of users. The representation of demographic information in a user model can vary greatly. For example, Pazzani (1999) extracts features from home pages to predict the preferences for certain restaurants, and Krulwich (1997) uses demographic groups from marketing research to suggest a range of products and services.

In knowledge-based approaches, recommendation is based on inferences about a user's needs and preferences, which are performed using functional knowledge: the system knows how a particular item meets a particular user need and can therefore reason about the relationship between a need and a possible recommendation (Burke, 2002). The user models in knowledge-based recommenders can also take many forms, since they can consist of any knowledge structure that supports inference. The restaurant recommender Entree (Burke et al., 1996) makes recommendations by finding restaurants in a new city similar to restaurants the user knows and likes, using knowledge of cuisines to infer similarity between restaurants. Ontology-based user profiling is also an example of knowledge-based recommendation. For example, Quickstep (Middleton et al., 2004) is a recommender system addressing the problem of recommending on-line research papers to researchers, which bases user interest models on an ontology of research paper topics.

In spite of these different types of user models, recommender agents still share similar behavior as regards collaboration. Most of these agents have to compare user models to find similar users for exchanging information about potentially interesting items and, at the same time, they can benefit from the knowledge that can be extracted from the implicit behavior of the whole community. The following section presents our proposal to abstract and model this common functionality into a framework.

3. Abstracting the common behavior of recommender agents

If we analyze different recommender agents, we can observe that despite the domain-dependent characteristics and user modeling approaches, these agents behave quite similarly. Therefore, the characteristics and behaviors that are common to most recommender agents can be abstracted to build a framework that facilitates the implementation of this kind of system. A software framework is a reusable design for a software system (or subsystem) that is expressed as a set of abstract classes and the way the framework components collaborate for a specific type of software (Johnson and Foote, 1988).

In this paper we present a framework for recommendation that abstracts a number of content-based, collaborative and social filtering methods commonly used by recommender agents. This design acts as the skeleton for recommender agents or applications that want to add the modeled behavior to their own functionality. For example, a personal agent originally designed to assist a single user can reuse the behavior implemented in the framework to start acting collaboratively in a community.

Figure 2 depicts a general view of this framework and its main components.

Figure 2. Layered view of the framework and component interactions

Our proposed framework provides the control structure, the data structures, and a set of
algorithms and metrics for different recommendation methods. The functionality provided by the framework is basically the following:
• Management of user models (layer 1).
• Memory-based recommendation (layer 2).
• Model-based recommendation (layer 3).
• Social analysis of virtual communities (layer 4).
• Recommendation engine.

The framework is implemented in Java, using a client-server architecture. We can see it as a service provider, that is, a server that provides recommender system capabilities to various remote clients, namely recommender agents. The client application or agent communicates with the server part of the framework to, for example, send user ratings and receive collaborative recommendations. The algorithms, techniques and data structures used to make recommendations are on the server side. The user model is on the client side, since it is usually application dependent, and it is sent to the recommendation engine on the server side (after some transformation, if necessary) so that it can be used to make suggestions.

The framework allows the creation of recommender agents from scratch as well as the integration of more complex recommender agents with the purpose of enriching their functionality. To this end, as shown in layer 1 in Figure 2, the framework supports a number of standard user models that can be used to create agents without further implementation effort. More specific, domain-dependent user models can be added by specializing the supported models and providing a means to assess their similarity both to the items to be recommended and to other models. In section 4 the integration of different types of user models is exemplified by three instantiations of the framework. The comparison of user models enables agents to add collaborative recommendation by finding a set of users that have similar characteristics or have a history of agreeing with the active user (that is, they rate items similarly).
Multiple algorithms and metrics are implemented in the framework for establishing the neighborhood of users and combining the preferences of neighbors for prediction. Thus, by specifying only the mechanism for comparing user models, already developed personal agents can take advantage of collaborative recommendations using some memory-based algorithm. This functionality is provided by layer 2.

In the next framework layer we can find the model-based collaborative filtering algorithms, which provide item recommendations by first extracting a model of users. The model inference is performed by machine learning algorithms such as clustering, Bayesian networks, or rule-based approaches. The algorithms in this layer can be used in combination with the memory-based algorithms mentioned above. For example, user clustering can be used to narrow the search for neighbors in a collaborative algorithm.

The recommendation engine is in charge of dealing with the recommendations generated by the different recommendation approaches or by a combination of them (e.g. content-based and collaborative recommendation). This engine enables the development of agents that proactively recommend interesting information to users, generate recommendations on demand, or both. In addition, the recommendation engine collects the feedback from users, which is used to update the user models, and records the activity of users in the system.
As shown in layer 4 in the figure, the knowledge about the activities of users registered by the recommendation engine can be analyzed from a social point of view. Thus, in the last layer of the framework it is possible to find algorithms and techniques for social data analysis such as those included in the Social Data Mining (Amento et al., 2003) and Social Network Analysis (Sabater and Sierra, 2002) areas. Furthermore, the data about user activities serve as a source for generating diverse visualizations to explore and interpret the behavior of the community. Each component of the proposed framework is detailed in the following subsections.

3.1 Management of user models

As we have said before, a user model is a representation of a user's interests, habits and preferences in a given domain. The representation formalism of a user model varies from one application to another. Our approach provides a number of standard representations for user models that try to capture the approaches most widely used by recommender agents. We consider three main categories of user models within recommender agents: content-based user models, item-based (or collaborative) user models, and demographic user models. Each type of user model has its own representation and requires a different method to compare it against other models or against items to recommend. The following sections describe the representations modeled in the framework.

3.1.1 Content-based user models. Content-based user models are built by observing the interaction of a user with an underlying application. Depending on the domain, different representations for the user model can be found. In our framework we consider three main representation formalisms for this kind of user model: a feature vector; a classifier denoting the relation between a set of features and a set of classes or categories; and a hierarchy of classifiers. In addition, new representations for user models can be easily added.
One of the most popular ways of representing items is to describe them through their main characteristics. This representation is known as a feature vector. For example, a scientific paper can be described by the authors, an abstract, the publication date, the journal or conference where it was published, and a set of keywords, among others. Web pages or text documents are represented as a set of relevant words, each having a frequency value. The user model is then a vector of relevant words representing the user's interests.

In some domains, the items are classified or categorized according to the attributes or features describing them by using a classifier inferred from a set of examples. Thus, our approach also provides a representation for those user models in which the user model holds the structure of a classifier that categorizes examples into a set of classes (e.g. a decision tree). In turn, the classifiers may form a hierarchy to distinguish hierarchical classes and categories. For example, the topics of interest of a user may be organized into a hierarchy in which different levels of abstraction in the user preferences can be modeled.

3.1.2 Item-based user models. The idea underlying collaborative filtering is to recommend items that were interesting to other users who are similar to the user the agent is assisting. The goal is to obtain the utility or rating a user would give to an item, given the ratings provided by similar users.
Thus, the user models in collaborative filtering do not model the contents of the items a user is interested in, namely documents, movies, or books, but the evaluation or rating the user has assigned to these items. Such a user model is therefore composed of a set of name-value pairs in which the name represents an item under consideration and the value a rating provided for the item.

3.1.3 Demographic user models. Demographic data about users can also be used to recommend potentially interesting items to them. Demographic information may include attributes such as sex, age, city, nationality, job and hobbies, among other features that may be relevant to the application domain. A demographic user model is generally obtained from the information explicitly given by the user through a user interface provided for that purpose.

Figure 3 shows the different user models proposed by our approach. A recommender agent that wants to define its own user model should implement a class inheriting from one of the classes shown in Figure 3 (HierarchicalUM, for example). Similarly, new algorithms to build the content-based user models can be defined. Our framework provides a set of well-known Machine Learning (decision trees, naïve Bayes, etc.) and Information Retrieval (Rocchio, tf-idf, etc.) algorithms that agent developers can use.

3.2 Memory-based recommendation

In order to make collaborative recommendations, a subset of users out of the whole population has to be chosen based on their similarity with the active user, and a weighted combination of their ratings is used to generate predictions. Neighborhood-based or user-based collaborative filtering is performed in three steps: weighting all users according to their similarity with the active user, selecting a subset of these users, and computing predictions based on the ratings given by this group of users. For each of these steps different algorithms and techniques are implemented in the framework.
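The three steps above can be illustrated with a minimal Java sketch. It assumes item-based user models stored as item-to-rating maps, and it picks one option per step (Pearson correlation, best-n-neighbors, weighted average). The class and method names (NeighborhoodSketch, pearson, bestN, predict) are hypothetical and are not taken from the framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of neighborhood-based collaborative filtering; not the framework's API. */
final class NeighborhoodSketch {

    /** Step 1: Pearson correlation computed over the items both users have rated. */
    static double pearson(Map<String, Double> a, Map<String, Double> b) {
        List<String> common = new ArrayList<>();
        for (String item : a.keySet()) {
            if (b.containsKey(item)) common.add(item);
        }
        if (common.size() < 2) return 0.0; // too little overlap to correlate
        double meanA = 0, meanB = 0;
        for (String item : common) {
            meanA += a.get(item);
            meanB += b.get(item);
        }
        meanA /= common.size();
        meanB /= common.size();
        double num = 0, varA = 0, varB = 0;
        for (String item : common) {
            double da = a.get(item) - meanA;
            double db = b.get(item) - meanB;
            num += da * db;
            varA += da * da;
            varB += db * db;
        }
        double den = Math.sqrt(varA * varB);
        return den == 0 ? 0.0 : num / den; // constant ratings carry no correlation signal
    }

    /** Step 2: keep the n users most correlated with the active user (best-n-neighbors). */
    static List<Map.Entry<String, Double>> bestN(
            String active, Map<String, Map<String, Double>> ratings, int n) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> e : ratings.entrySet()) {
            if (!e.getKey().equals(active)) {
                weights.put(e.getKey(), pearson(ratings.get(active), e.getValue()));
            }
        }
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(weights.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    /** Step 3: weighted average of the neighbors' ratings, using correlations as weights. */
    static double predict(String item, List<Map.Entry<String, Double>> neighbors,
                          Map<String, Map<String, Double>> ratings) {
        double num = 0, den = 0;
        for (Map.Entry<String, Double> nb : neighbors) {
            Double r = ratings.get(nb.getKey()).get(item);
            if (r == null || nb.getValue() <= 0) continue; // skip non-raters and non-positive weights
            num += nb.getValue() * r;
            den += nb.getValue();
        }
        return den == 0 ? Double.NaN : num / den; // NaN signals that no prediction is possible
    }
}
```

Ignoring neighbors with non-positive correlation is a simplification chosen for this sketch; other weighting or selection schemes can be substituted at each step without changing the overall structure.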
In the first step, memory-based algorithms utilize either a metric of comparison for content-based user models or the entire user-item matrix to estimate a neighborhood of users that resemble the active one. This process results in a user-user matrix of similarities in which a row represents a user and columns hold the distance/similarity with the remaining users.

Figure 3. User model representations provided by the framework
In the first case, content-based models represented as feature vectors are compared by comparing the values of the different attributes representing each user model. For vectors of normalized numerical attributes several common distance/similarity functions are provided by the framework, including the Euclidean distance, Manhattan distance, and cosine similarity. Thus, an agent representing the interests of a user by a single vector of keywords, such as Letizia, can be straightforwardly integrated in the framework by using the cosine similarity to compare user models and gain the ability to recommend the information discovered by other users. We can observe these components in Figure 4. More specific methods can be defined to compare more complex, specialized user models. For instance, a demographic user model can be compared to another demographic user model by using the similarity functions used for feature vectors or by defining a similarity measure, possibly combining numerical and nominal attributes weighted according to their importance.

In the second case, that is, in item-based models, neighbors are identified by comparing the ratings of all users with the ratings given by the active user to items. The most common metrics are implemented in the framework for this purpose, including the mean squared difference, the Pearson correlation coefficient and the Spearman rank correlation. Also, significance weighting (Herlocker et al., 1999) is implemented to add a certain level of trust to neighbor correlations. Further correlation measures can be defined by extending the framework at this point. The information available in the user-user similarity matrix allows the selection of the most similar users in order to use their opinions for prediction.
The selection of neighbors can be achieved by using correlation-thresholding (Shardanand and Maes, 1995), which selects all users whose correlation is above a certain absolute threshold, and best-n-neighbors, which selects the best n correlates for a given n (Herlocker et al., 1999).

Figure 4. Memory-based and model-based algorithms

Once a neighborhood of users is formed, different algorithms can be used to combine the preferences of neighbors to produce a prediction or top-N recommendations for the active user. The provided methods to combine the ratings of the neighbor users are the weighted average of the ratings used in Ringo
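As an illustration of the distance/similarity functions named above for feature-vector user models, the following Java sketch implements Euclidean distance, Manhattan distance and cosine similarity over plain numeric arrays. The class name VectorSimilarity and its method names are hypothetical; the sketch assumes both vectors are indexed over the same attributes and makes no claim about how the framework represents them.

```java
/** Hypothetical helper for comparing feature-vector user models; not the framework's API. */
final class VectorSimilarity {

    /** Euclidean distance: square root of the summed squared attribute differences. */
    static double euclidean(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Manhattan distance: sum of the absolute attribute differences. */
    static double manhattan(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Cosine similarity: dot product divided by the product of the vector norms. */
    static double cosine(double[] a, double[] b) {
        checkSameLength(a, b);
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0.0; // an empty profile is similar to nothing
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must cover the same attributes");
        }
    }
}
```

Note that the first two functions are distances (smaller means more alike) while cosine is a similarity (larger means more alike), so a neighborhood-formation step has to treat them in opposite directions.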
(Shardanand and Maes, 1995), which uses the correlations as weights, the deviation-from-mean approach of GroupLens (Resnick et al., 1994) and the regression technique, which uses an approximation of the ratings based on a regression model instead of the real ratings.

3.3 Model-based recommendation

Model-based recommendation consists of extracting a model from user preferences and using this model for prediction. The model building process can be performed by different machine learning algorithms. Clustering approaches work by identifying groups of users who appear to have similar preferences or ratings. Then, predictions for a candidate user are made by averaging the opinions of the other users in the cluster to which the user belongs.

Our proposed framework provides different clustering algorithms that recommender agents can use to group similar users, including k-Means, PAM (Kaufman and Rousseeuw, 1987), and the hierarchical agglomerative clustering algorithm (HAC). However, agent developers have to be aware that not all algorithms can be applied to every type of user model. For example, the k-Means algorithm cannot be directly applied to content-based user models, since it operates on vectors of numerical attributes. For this kind of model, agent developers have to provide a means to compute an average value of user models in order to calculate the cluster centroids. A variation of k-Means for nominal data is the k-Modes algorithm, which can be used for demographic user models. Figure 4 shows the class hierarchy for this part of the framework. Other clustering algorithms as well as specializations of the existing ones can be added by extending this group of classes.

In addition to clustering algorithms, a rule-based approach is included in the framework for model extraction. Usually, association rule discovery algorithms are used in e-commerce recommender systems to find associations between co-purchased items.
The system then generates item recommendations based on the strength of the association between items. The Apriori algorithm (Agrawal and Srikant, 1994), one of the most popular algorithms for mining association rules, is included in the framework. Thus, a recommendation model based on association rules corresponds to the set of association rules generated from the user preferences.

The model-based algorithms can be used in combination with memory-based algorithms. For example, the results of user clustering can reduce the number of users against whom the agent compares the active user for neighborhood formation. Enhancing the accuracy of finding neighbor users can result in more effective collaborative filtering.

3.4 Recommendation engine

Using the framework, agents can recommend items that are potentially interesting for their users and, in turn, receive suggestions made by other users. In addition, they can request a prediction of the expected interest value for a specific item. Users can also provide feedback on the suggestions and recommendations they receive. In order to support these tasks, our framework provides storage capabilities in persistent media for the following components: user models, the recommendations made by users, user similarity structures, and statistical data about users and their interactions.
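The strength of an association between co-purchased items is commonly expressed through support and confidence, the two measures on which Apriori-style rule mining relies. The Java sketch below computes both over a list of item sets. It is not an implementation of Apriori itself, and the names AssociationSketch, support and confidence, as well as the representation of user preferences as sets of item identifiers, are assumptions made for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of association-rule strength measures; not the framework's API. */
final class AssociationSketch {

    /** Support: fraction of baskets that contain every item of the given item set. */
    static double support(List<Set<String>> baskets, Set<String> itemset) {
        if (baskets.isEmpty()) return 0.0;
        long hits = baskets.stream().filter(b -> b.containsAll(itemset)).count();
        return (double) hits / baskets.size();
    }

    /** Confidence of the rule antecedent => consequent: support(A and B) / support(A). */
    static double confidence(List<Set<String>> baskets,
                             Set<String> antecedent, Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double supportA = support(baskets, antecedent);
        return supportA == 0 ? 0.0 : support(baskets, both) / supportA;
    }
}
```

A rule-based recommender would typically keep only the rules whose support and confidence exceed chosen thresholds and suggest the consequent items of the rules whose antecedents match the items the user already liked.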