Availableonlineatwww.sciencedirect.com ScienceDirect FUZZY sets and systems ELSEVIER Fuzzy Sets and Systems 160(2009)76-94 Representation, similarity measures and aggregation methods using uzzy sets for content-based recommender systems Azene Zenebea,*. Anthony F norco Systems Department, Bowie State University, 14000 Jericho Park Rd, Bowie, MD 20715-9465, USA iNformation Systems Department, University of Maryland at Baltimore County(UMBC). 1000 Hilltop Circle, MD 21250,USA Received 17 July 2007; received in revised form 8 March 2008: accepted 10 March 2008 Available online 26 march 2008 Abstract Representation of features of items and user feedback, and reasoning about their relationships are major problems in recommender systems. This is because item features and user feedback are subjective, imprecise and vague. The paper presents a fuzzy set theoretic method(FTM) for recommender systems that handles the non-stochastic uncertainty induced from subjectivity, vagueness and precision in the data, and the domain knowledge and the task under consideration. The research further advances the application of fuzzy modeling for content-based recommender systems initially presented by Ronald Yager. The paper defines a representation method, similarity measures and aggregation methods as well as empirically evaluates the methods performance through simulation using a benchmark movie data FTM consist of a representation method for items'features and user feedback using fuzzy sets, and a content-based algorithm based on various fuzzy set theoretic similarity measures (the fuzzy set extensions of the Jaccard index, cosine, proximity or correlation similarity measures ), and aggregation methods for computing recommendation confidence scores(the maximum-minimum or Weighted-sum fuzzy set theoretic aggregation methods). Compared to the baseline crisp set based method(CSM) presented, the empirical evaluation of the FTM using the movie data and simulation shows an improvement in precision without loss of recall. Moreover, the paper provides a guideline for recommender systems designers that will help in choosing from a combination of one of the fuzzy set theoretic aggregation methods and similarity measures. Published by Elsevier B.V. Keywords: Fuzzy sets; Fuzzy connectives and aggregation operators; Recommender systems; Learning: Empirical evaluation 1. Introduction Recommender systems are systems that provide users with an ordered list of items and information that help the to decide which items to consider or look at based on the individual user preferences [25]. Recommendation systems use background data such as historical data consisting of ratings from users before the recommendation begins, input data such as features of items or users 'ratings in order to initiate a recommendation and models and algorithms to combine the former two and generate a recommendation [8] There have been many advances in recommender systems research. An extensive review of the different approa used in recommender systems is presented in Burke [8]. Recently, Adomavicius and Tuzhilin [1] have identified Corresponding author. Tel: +1301 3641: fax: +1301 8603593 E-mail address: azenebe@bowiestate edu(A. Zenebe) 0165-0114S-see front matter Published by Elsevier B V. doi:10.1016js2008.03017
Fuzzy Sets and Systems 160 (2009) 76 – 94 www.elsevier.com/locate/fss Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems Azene Zenebea,∗, Anthony F. Norciob aManagement Information Systems Department, Bowie State University, 14000 Jericho Park Rd., Bowie, MD 20715-9465, USA bInformation Systems Department, University of Maryland at Baltimore County (UMBC), 1000 Hilltop Circle, MD 21250, USA Received 17 July 2007; received in revised form 8 March 2008; accepted 10 March 2008 Available online 26 March 2008 Abstract Representation of features of items and user feedback, and reasoning about their relationships are major problems in recommender systems. This is because item features and user feedback are subjective, imprecise and vague. The paper presents a fuzzy set theoretic method (FTM) for recommender systems that handles the non-stochastic uncertainty induced from subjectivity, vagueness and imprecision in the data, and the domain knowledge and the task under consideration. The research further advances the application of fuzzy modeling for content-based recommender systems initially presented by Ronald Yager. The paper defines a representation method, similarity measures and aggregation methods as well as empirically evaluates the methods’ performance through simulation using a benchmark movie data. FTM consist of a representation method for items’ features and user feedback using fuzzy sets, and a content-based algorithm based on various fuzzy set theoretic similarity measures (the fuzzy set extensions of the Jaccard index, cosine, proximity or correlation similarity measures), and aggregation methods for computing recommendation confidence scores (the maximum–minimum or Weighted-sum fuzzy set theoretic aggregation methods). Compared to the baseline crisp set based method (CSM) presented, the empirical evaluation of the FTM using the movie data and simulation shows an improvement in precision without loss of recall. Moreover, the paper provides a guideline for recommender systems designers that will help in choosing from a combination of one of the fuzzy set theoretic aggregation methods and similarity measures. Published by Elsevier B.V. Keywords: Fuzzy sets; Fuzzy connectives and aggregation operators; Recommender systems; Learning; Empirical evaluation 1. Introduction Recommender systems are systems that provide users with an ordered list of items and information that help them to decide which items to consider or look at based on the individual user preferences [25]. Recommendation systems use background data such as historical data consisting of ratings from users before the recommendation begins, input data such as features of items or users’ ratings in order to initiate a recommendation, and models and algorithms to combine the former two and generate a recommendation [8]. There have been many advances in recommender systems research. An extensive review of the different approaches used in recommender systems is presented in Burke [8]. Recently, Adomavicius and Tuzhilin [1] have identified ∗ Corresponding author. Tel.: +1 301 860 3641; fax: +1 301 860 3593. E-mail address: azenebe@bowiestate.edu (A. Zenebe). 0165-0114/$ - see front matter Published by Elsevier B.V. doi:10.1016/j.fss.2008.03.017
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 various areas of improvements for current recommender systems. They are: (i) better methods for representing user behavior and information about items; (ii) more advanced recommendation modeling methods; (iii) incorporation of contextual information into recommendation process; (iv)utilization of multi-criteria ratings; (v) development of less intrusive and more flexible recommendation methods; and(vi) development of recommender systems effective ness measures. This paper attempts to address the improvement need stated in(i)and (ii) using the fuzzy modelin L.I. The problem A content-based recommendation requires data on the behavior of users and features of items. Its performance depends on the data and how this data is used, i.e. represented and inferred. Representation of and reasoning about the behavior of users and features of items raised a number of challenging issues Features of items and users'be- havior are subjective, vague and imprecise. These, in turn, induce uncertainty on representation of and reasonin about the items'features, users'behavior, and their relationship. Such uncertainty is non-stochastic or non-random and is induced from subjectivity, vagueness and imprecision in the data, the domain knowledge and the task under consideration [20] In relation to items, the uncertainty is associated to the extent(e. g. low to high) in which the items have some features For instance, to what extent does a movie have drama content or is it highly drama related? In relation to the behavior such as interest, the uncertainty is associated to methods employed to measure and represent users'interest as precisely as possible. In relation to the recommendation task, uncertainty is associated to types of relationship that exist between first. user behavior and item features. second. among users in terms of their behavior. and third among items in terms of their features Such non-stochastic uncertainty is not studied well and modeled in previous recommender systems research such as [25, 12] 1. 2. The proposed method In fuzzy modeling membership functions in fuzzy set theory are deliberately designed to treat the vagueness and imprecision in the context of the application [22]. This capability is used to provide the framework to address the representation and inference challenges associated to non-stochastic uncertainty [20] in recommender systems. The study is motivated by the"reclusive methods" proposed by Yager [30]. Yager [30] discusses the potential of fuzzy modeling and also presents a methodology consisting of collection of justifications and heuristic rules for the recommendations based on fuzzy set and fuzzy logic. He assumes the availability of a representation scheme for each object that allows the development of an appropriate tool to calculate the similarity relationship over the set of objects He also assumes the availability of similarity index between two objects. However, he did not conduct an empirical tudy to support or refute the effectiveness of using fuzzy modeling. This research is an attempt to further develop and empirically evaluate fuzzy set theoretic method (FTM) for content-based recommender systems using the problem of movie recommendation as its domain. Particularly, the proposed approach uses features of items as background data and user's feedback such as ratings of items as input. The study defines a representational method, aggregation methods, and similarity measures for content-based ommender systems. It also develops algorithms and carries out an empirical assessment of the effect of the fuzzy set theoretic method(FTM) on the performance of a movie recommender system by comparing its results to the re- sults of the baseline crisp set based method(CSM). An empirical assessment of the effect of the various inference mechanisms--combinations of the aggregation methods and similarity measures is also performed. The study also considers analyzing the effect of the model size(also called training size-number of rated items initially needed from a user), and recommendation size(number of items recommended at one time to a user)on the performance of a movie recommender system. 13. Results. conclusion and contributions Using actual data on movies, the results of the simulation study provide empirical evidence that supports the ef fectiveness of FTM. The results show a modest increase in precision without loss of recall. It also requires a modest training size and recommendation size. Moreover, the results of analysis of variance show that there are significant
A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 77 various areas of improvements for current recommender systems. They are: (i) better methods for representing user behavior and information about items; (ii) more advanced recommendation modeling methods; (iii) incorporation of contextual information into recommendation process; (iv) utilization of multi-criteria ratings; (v) development of less intrusive and more flexible recommendation methods; and (vi) development of recommender systems effectiveness measures. This paper attempts to address the improvement need stated in (i) and (ii) using the fuzzy modeling technique. 1.1. The problem A content-based recommendation requires data on the behavior of users and features of items. Its performance depends on the data and how this data is used, i.e. represented and inferred. Representation of and reasoning about the behavior of users and features of items raised a number of challenging issues. Features of items and users’ behavior are subjective, vague and imprecise. These, in turn, induce uncertainty on representation of and reasoning about the items’ features, users’ behavior, and their relationship. Such uncertainty is non-stochastic or non-random and is induced from subjectivity, vagueness and imprecision in the data, the domain knowledge and the task under consideration [20]. In relation to items, the uncertainty is associated to the extent (e.g. low to high) in which the items have some features. For instance, to what extent does a movie have drama content or is it highly drama related? In relation to the users’ behavior such as interest, the uncertainty is associated to methods employed to measure and represent users’ interest as precisely as possible. In relation to the recommendation task, uncertainty is associated to types of relationship that exist between first, user behavior and item features, second, among users in terms of their behavior, and third, among items in terms of their features. Such non-stochastic uncertainty is not studied well and modeled in previous recommender systems research such as [25,12]. 1.2. The proposed method In fuzzy modeling membership functions in fuzzy set theory are deliberately designed to treat the vagueness and imprecision in the context of the application [22]. This capability is used to provide the framework to address the representation and inference challenges associated to non-stochastic uncertainty [20] in recommender systems. The study is motivated by the “reclusive methods” proposed by Yager [30]. Yager [30] discusses the potential of fuzzy modeling and also presents a methodology consisting of collection of justifications and heuristic rules for the recommendations based on fuzzy set and fuzzy logic. He assumes the availability of a representation scheme for each object that allows the development of an appropriate tool to calculate the similarity relationship over the set of objects. He also assumes the availability of similarity index between two objects. However, he did not conduct an empirical study to support or refute the effectiveness of using fuzzy modeling. This research is an attempt to further develop and empirically evaluate fuzzy set theoretic method (FTM) for content-based recommender systems using the problem of movie recommendation as its domain. Particularly, the proposed approach uses features of items as background data and user’s feedback such as ratings of items as input. The study defines a representational method, aggregation methods, and similarity measures for content-based recommender systems. It also develops algorithms and carries out an empirical assessment of the effect of the fuzzy set theoretic method (FTM) on the performance of a movie recommender system by comparing its results to the results of the baseline crisp set based method (CSM). An empirical assessment of the effect of the various inference mechanisms—combinations of the aggregation methods and similarity measures is also performed. The study also considers analyzing the effect of the model size (also called training size—number of rated items initially needed from a user), and recommendation size (number of items recommended at one time to a user) on the performance of a movie recommender system. 1.3. Results, conclusion and contributions Using actual data on movies, the results of the simulation study provide empirical evidence that supports the effectiveness of FTM. The results show a modest increase in precision without loss of recall. It also requires a modest training size and recommendation size. Moreover, the results of analysis of variance show that there are significant
differences among the different alternative combination of fuzzy set theoretic similarity measures and aggregation methods in their recommendation accuracy. The paper has the following main contributions Compared to using crisp set theory, the paper shows using fuzzy set theory slightly improves precision without loss of recall for content-based movies recommendation application 2. The paper provides a representation framework for features of items and users feedback using fuzzy sets as well as new algorithms for content-based item recommender systems 3. The paper presents a practical and detailed description of how to apply fuzzy set theory in a new domain as well how to conduct extensive experimental study to validate the theory of the fuzzy set theoretic aggregation methods and the fuzzy set theoretic similarity measures ombination of 4. The paper provides a guideline for recommender systems designers that will help them to choose a The remainder of the paper is organized as follows. Section 2 presents a review of related literature. Section 3 presents the representation method, inference methods, similarity measures and algorithms. Section 4 describes the dataset, evaluation settings, and evaluation metrics. Section 5 presents the results of the evaluation followed by the discussion in Section 6. Finally, our conclusions and future research directions are presented in Section 7. 2. Related literature 2./. Recommender systems There are various classifications of recommendation methods based on the sources of data and how these data are used for recommendation, Burke [8 has classified recommendation methods into: collaborative, content-based, demographic, utility-based, and knowledge-based. There are also various variants of hybrid methods that combine these methods. These hybrid methods are discussed in detail in [8]. Moreover, a recent survey of state-of-the-art recommender systems along with suggestions for improvements is found in [1] The two most widely used methods of recommendation are content-based and collaborative filtering(CF). In collab- orative filtering, an item is recommended to a user based on other similar users actions like interests, preferences and ratings[24]. Because of the availability of ratings data, CF is the most fully explored and several numbers of studies are eported Deshpande and Karypis [12] developed item-based Top-N recommendation algorithms that are collaborative ype and faster than traditional user-user collaborative algorithms with comparable recommendation hit-rate. More- over, results of evaluation of these CF algorithms for recommender systems using the MovieLens dataset are reported in terms of precision and Fl-measure in [21] In content-based recommendation, an item is recommended to a user mainly based on the characteristics of the em and the user's past actions like purchases, queries, and ratings. Moreover, in content-based recommendation, standard machine learning techniques such as clustering, Bayesian networks and induction learning are applied in forming attribute-based models [8]. Alspector et al. [2] used a set of seven movie features--category, MAAP rating, academy award, origin, length, director and Maltin rating, in addition to the rating. They showed that the pure CF method produces significantly better results(in terms of correlation measure between predicted and actual rating)than the ones obtained with the content -based method Basu et al. [6] applied inductive learning approaches that use Ripper for recommendation of movies. They showed that content-based approaches result in loss of precision with modest increase in recall; collaborative approaches improve precision with modest loss of recall; and hybrid approaches increase both precision and recall. Weng and George [27] have also reported similar result for precision. These studies indicated that the mere introduction of movie features alone does not improve precision. The present study attempts to show that a proper introduction of movie features does improve precision without loss of recall. 2.2. Fuzzy modeling Fuzzy set theory offers a rich spectrum of methods for the management of non-stochastic uncertainty. It is well suited to handle imprecise information, the un-sharpness of classes of objects or situations, and the gradualness of preference profiles [31]
78 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 differences among the different alternative combination of fuzzy set theoretic similarity measures and aggregation methods in their recommendation accuracy. The paper has the following main contributions: 1. Compared to using crisp set theory, the paper shows using fuzzy set theory slightly improves precision without loss of recall for content-based movies recommendation application. 2. The paper provides a representation framework for features of items and users feedback using fuzzy sets as well as new algorithms for content-based item recommender systems. 3. The paper presents a practical and detailed description of how to apply fuzzy set theory in a new domain as well as how to conduct extensive experimental study to validate the theory. 4. The paper provides a guideline for recommender systems designers that will help them to choose a combination of one of the fuzzy set theoretic aggregation methods and the fuzzy set theoretic similarity measures. The remainder of the paper is organized as follows. Section 2 presents a review of related literature. Section 3 presents the representation method, inference methods, similarity measures and algorithms. Section 4 describes the dataset, evaluation settings, and evaluation metrics. Section 5 presents the results of the evaluation followed by the discussion in Section 6. Finally, our conclusions and future research directions are presented in Section 7. 2. Related literature 2.1. Recommender systems There are various classifications of recommendation methods. Based on the sources of data and how these data are used for recommendation, Burke [8] has classified recommendation methods into: collaborative, content-based, demographic, utility-based, and knowledge-based. There are also various variants of hybrid methods that combine these methods. These hybrid methods are discussed in detail in [8]. Moreover, a recent survey of state-of-the-art recommender systems along with suggestions for improvements is found in [1]. The two most widely used methods of recommendation are content-based and collaborative filtering (CF). In collaborative filtering, an item is recommended to a user based on other similar users’ actions like interests, preferences and ratings [24]. Because of the availability of ratings data, CF is the most fully explored and several numbers of studies are reported. Deshpande and Karypis [12] developed item-based Top-N recommendation algorithms that are collaborative type and faster than traditional user–user collaborative algorithms with comparable recommendation hit-rate. Moreover, results of evaluation of these CF algorithms for recommender systems using the MovieLens dataset are reported in terms of precision and F1-measure in [21]. In content-based recommendation, an item is recommended to a user mainly based on the characteristics of the item and the user’s past actions like purchases, queries, and ratings. Moreover, in content-based recommendation, standard machine learning techniques such as clustering, Bayesian networks and induction learning are applied in forming attribute-based models [8]. Alspector et al. [2] used a set of seven movie features—category, MAAP rating, academy award, origin, length, director and Maltin rating, in addition to the rating. They showed that the pure CF method produces significantly better results (in terms of correlation measure between predicted and actual rating) than the ones obtained with the content-based method. Basu et al. [6] applied inductive learning approaches that use Ripper for recommendation of movies. They showed that content-based approaches result in loss of precision with modest increase in recall; collaborative approaches improve precision with modest loss of recall; and hybrid approaches increase both precision and recall. Weng and George [27] have also reported similar result for precision. These studies indicated that the mere introduction of movie features alone does not improve precision. The present study attempts to show that a proper introduction of movie features does improve precision without loss of recall. 2.2. Fuzzy modeling Fuzzy set theory offers a rich spectrum of methods for the management of non-stochastic uncertainty. It is well suited to handle imprecise information, the un-sharpness of classes of objects or situations, and the gradualness of preference profiles [31]
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 4. A fuzzy set A in X is characterized by its membership function, which is defined as [31]:PA(): xEX-[0, 1]. ere X is a domain space. Altematively, set A can be characterized by a set of pairs A =I(r, HA()),xE X). B.Ccording to the context in which X is used and the concept presented, the fuzzy membership function, HA(x),can A Ive different interpretations [7]. As a degree of similarity, it represents the proximity between different pieces of information. For example, the membership of movie x in the fuzzy set of"drama movies"can be estimated by the degree of similarity. As a degree of preference, it represents the intensity of preference in favor of x, or the feasibility satisfaction or liking with x based on certain criteria, like movie attributes such as content-intensity of actionL, dv of selecting x as a value of X. For instance a movie with rating of four out of five indicates the degree of a user and humor. These two interpretations are used in this research. The relationship between fuzzy set theory and probability has been the most controversial in uncertainty modeling mainly due to the possible confusion and differences between membership function and probability measure. It is lear now that they are two complementary theories. Dubois and Prade [16] presented some of the mathematical difference between possibility and the quantitative, explicit Kolmogoroff's probability Zimmerman [34] also indicated that comparing fuzzy set theory with probability theory is not easy because of the various aspects in which the comparison can be made, because of the many different ways of mathematically defining and operating fuzzy sets, and because of the various kinds of fuzziness that can be modeled. One difference between fuzziness and probability is probability theory measures the chance that a proposition is correct where as fuzzy set theory measures degree of correctness to which a proposition is correct [15] Possibility theory, introduced in 1978 by Zadeh [32], made it possible to deal with uncertainties of imprecise knowledge. It allows the quantification of uncertainty as a pair of numbers, possibility and necessity. In a proposition such as'X is A', where X is a variable and A is a fuzzy set, if all we know about the value of X is that X is a then this corresponds to a situation where information is incomplete. Then one can associate a possibility distribution to X(denoted by Tx) where the values of X are ordered according to their degree of plausibility or possibility. The possibility distribution function T(u) is defined to be numerically equal to the membership function HA(u)[32]. That is, a membership function has a possiblistic interpretation, which assumes the presence Forecasting, bidding and auctions, negotiation, targeting, recommender systems and profiling are pointed out as application areas that can benefit from soft computing paradigm and data mining [5, 29, 30]. Research efforts that address non-stochastic uncertainty using fuzzy modeling for recommendation systems have emerged recently, e.g [ 19,30]. These works use fuzzy sets to define linguistic categorizations of products in an e-commerce, and form overlapping clusters using fuzzy clustering FTM is case-based as""similar"cases are recalled, and based on them each possible decision is evaluated. "[17, p 609]. Dubois et al. [ 14] presented a fuzzy-set based counterpart to the case-based decision. As a case-based decision theory(CBDT)driven approach, FTM does have the following features[17]: each recommendation decision is evaluated over a different set of cases: to make recommendation decision there is no need to consider all possible states and the orresponding outcomes; the system is not assumed to know anything about the outside world, apart from past cases; and the system learns by including new cases and updated its similarity judgment and recommendation decisions continuously FTM differs from the previous recommendation systems in its representation and inference methods. Furthermore this study has done extensive empirical evaluation to identify the significance of the use of fuzzy representation and inference methods and other factors on the performance 3. The representation and inference methods The proposed fuzzy set theoretic content-based approach is based on a user's previous feedback, and features of the new items and features of the set of i for which the user has provided feedback. The rationale of this method is that users are more likely to have interest in items, like movies, that are similar to the items they have experienced and liked. The representation method, inference engine consisting of aggregation methods and similarity measures, and the algorithms are presented in this section 3.1. Item representation using fuzzy set A membership function in fuzzy set theory is deliberately designed to handle the vagueness and imprecision in the context of the application [22]. The type of function that is suitable depends on the application context, and in certain
A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 79 A fuzzy set A in X is characterized by its membership function, which is defined as [31]: A(x) : x ∈ X → [0, 1], where X is a domain space. Alternatively, set A can be characterized by a set of pairs:A = {(x, A(x)), x ∈ X}. According to the context in which X is used and the concept presented, the fuzzy membership function, A(x), can have different interpretations [7]. As a degree of similarity, it represents the proximity between different pieces of information. For example, the membership of movie x in the fuzzy set of “drama movies” can be estimated by the degree of similarity. As a degree of preference, it represents the intensity of preference in favor of x, or the feasibility of selecting x as a value of X. For instance, a movie with rating of four out of five indicates the degree of a user’s satisfaction or liking with x based on certain criteria, like movie attributes such as content-intensity of action, drama, and humor. These two interpretations are used in this research. The relationship between fuzzy set theory and probability has been the most controversial in uncertainty modeling mainly due to the possible confusion and differences between membership function and probability measure. It is clear now that they are two complementary theories. Dubois and Prade [16] presented some of the mathematical difference between possibility and the quantitative, explicit Kolmogoroff’s probability. Zimmerman [34] also indicated that comparing fuzzy set theory with probability theory is not easy because of the various aspects in which the comparison can be made, because of the many different ways of mathematically defining and operating fuzzy sets, and because of the various kinds of fuzziness that can be modeled. One difference between fuzziness and probability is probability theory measures the chance that a proposition is correct where as fuzzy set theory measures degree of correctness to which a proposition is correct [15]. Possibility theory, introduced in 1978 by Zadeh [32], made it possible to deal with uncertainties of imprecise knowledge. It allows the quantification of uncertainty as a pair of numbers, possibility and necessity. In a proposition such as ‘X is A’, where X is a variable and A is a fuzzy set, if all we know about the value of X is that X is A, then this corresponds to a situation where information is incomplete. Then one can associate a possibility distribution to X (denoted by x ) where the values of X are ordered according to their degree of plausibility or possibility. The possibility distribution function x (u) is defined to be numerically equal to the membership function A(u) [32]. That is, a membership function has a possiblistic interpretation, which assumes the presence. Forecasting, bidding and auctions, negotiation, targeting, recommender systems and profiling are pointed out as application areas that can benefit from soft computing paradigm and data mining [5,29,30]. Research efforts that address non-stochastic uncertainty using fuzzy modeling for recommendation systems have emerged recently, e.g. [19,30]. These works use fuzzy sets to define linguistic categorizations of products in an e-commerce, and form overlapping clusters using fuzzy clustering. FTM is case-based as “ “similar” cases are recalled, and based on them each possible decision is evaluated.” [17, p. 609]. Dubois et al. [14] presented a fuzzy-set based counterpart to the case-based decision. As a case-based decision theory (CBDT) driven approach, FTM does have the following features [17]: each recommendation decision is evaluated over a different set of cases; to make recommendation decision there is no need to consider all possible states and the corresponding outcomes; the system is not assumed to know anything about the outside world, apart from past cases; and the system learns by including new cases and updated its similarity judgment and recommendation decisions continuously. FTM differs from the previous recommendation systems in its representation and inference methods. Furthermore, this study has done extensive empirical evaluation to identify the significance of the use of fuzzy representation and inference methods and other factors on the performance. 3. The representation and inference methods The proposed fuzzy set theoretic content-based approach is based on a user’s previous feedback, and features of the new items and features of the set of items for which the user has provided feedback. The rationale of this method is that users are more likely to have interest in items, like movies, that are similar to the items they have experienced and liked. The representation method, inference engine consisting of aggregation methods and similarity measures, and the algorithms are presented in this section. 3.1. Item representation using fuzzy set A membership function in fuzzy set theory is deliberately designed to handle the vagueness and imprecision in the context of the application [22]. The type of function that is suitable depends on the application context, and in certain
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 cases the meaning captured by fuzzy sets is not too sensitive to the variations in the shape [22]. In practice, triangular, trapezoid, Gaussian, S-function, and exponential-like functions are the most commonly used membership functions Moreover, in practice, a suitable membership function,s shape is assumed a priori and its parameters are determined by domain experts or using machine learning techniques [22]. The former approach, i.e. domain analysis and expertise, is used in this research For an item described with multiple attributes, more than one attribute can be used for recommendation. Some attributes can also be multi-valued involving overlapping or non-mutually exclusive possible values. One should note that, movies are multi-genres and multi-actors 3]. The values of multi-valued attributes in an item can be represented more accurately within a fuzzy set framework than within a crisp set framework. Items of this type are considered in his research. Moreover, the representation scheme presented for a movie can be generalized and applied to any item with similar characteristics to a movie. A few examples are music, TV shows, restaurants and books Let an item lj(=l,..., M) be defined in the space of an attribute X=(x1, x2, x3,...,L then Ij can take multiple values such as x1, x2,..., and xL. The membership function of item; to value xk(k=l,., L), denoted by Axg(j), needs to be determined. Hence, a vector Xj=I(xk, Hx(lD)),k=l,., L is formed for Ij, where ux(Ij) can be interpreted as the degree of similarity of lj to a hypothetical (or prototype) pure xk type of the item; or as the degree of presence of value xk in item I In a movie marketing application, most movies are selected for pleasure and expenditure of time. Users choose movies they like and enjoy. Furthermore, users use subjective features of movies such as"funny", "romantic", and ary"(all are kinds of movie genres )to select movies more than objective features such as the director, theatre location and price, which are useful but are less important [10]. We use the movie as an item and movie genre as the attribute to make the method operational and develop the heuristic. Analysis of descriptions of main film genres shows that genre gl of movies(e.g. action)and genre g2 of movies(e.g adventure)are overlapping in terms of their subject matter and other movie attributes [3]. Based on the result of the findings in [10), movies highly liked by users can be grouped into similar categories by subjective features of movies such as genre and MPAA rating. This assertion is also verified in Section 5 using the movie dataset. Indeterminingthegenrecontentofamovie,weuseaheuristicbasedonthegenres'rankordersavailableinImDb.com and provided by the movie producers, instead of the crisp representation-the genre is presence(1)or absence(0) data value. An ideal technique is automatic content analysis of a movie for automatic identification of its genres. The automatic content analysis of a movie for automatic identification of its genres will provide the extent or degree of presence or amount of a genre in a movie. For example, amount of dramatic feature in a movie A, amount of action features in a movie A, etc. However, automatic content analysis technologies are not yet well developed and available. For example, a preliminary research on the automatic identification of movie genres by exploiting audio-visual cues in a movie is reported in[26] In the absence of this technique, we conduct the domain analysis based on the literature in movies [ 3, 26] and the available data provided by movies producers in the movie database. Given the definition of a movie in the space of genre(G), a movie can have one major genre denoted by xI and one or more minor genres x2, x3, and so on, in the decreasing order of their degrees of presence in a movie. The degree of membership of movie lj(=l,., M)to genre xk(k= l,..., N)is denoted by px(j). Hence, for Ij, a vector Gj=((xk, H(D)),k=l,., N can be formed. The following steps are taken in the development of the heuristic for the determination of the membership function pn(j) +Step1:SortxkindescendingorderoftheirdegreeofpresenceinIj.InImDb(wWw.Imdb.com')thegenresof ovie lj are presented in the order of their significance. For example, movie'King Kong(2005)has Action as a major and Adventure as the first minor, Drama as the second minor, Fantasy as the third minor, and Thriller as the 2: Assign higher degrees of membership value to more important genres of a movie. For instance, If Ij has only one genre, then un (lj)=l and un(j)=0 for all k= 2 to N If lj has two genres, then ux(j)=0.9, ux(j)=0.4 and Ax(j)=0 for all k=3 to N If lj has three genres, then Px (j)=0.7 and Hx (j)=0.5,un(j)=0.2 and un(j)=0 for all k=4 to N: and so on I"imdbHistory"vol.2004:IntemetMovieDatabaseInc.,n.d.(http://www.imdb.com/)
80 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 cases the meaning captured by fuzzy sets is not too sensitive to the variations in the shape [22]. In practice, triangular, trapezoid, Gaussian, S-function, and exponential-like functions are the most commonly used membership functions. Moreover, in practice, a suitable membership function’s shape is assumed a priori and its parameters are determined by domain experts or using machine learning techniques [22]. The former approach, i.e. domain analysis and expertise, is used in this research. For an item described with multiple attributes, more than one attribute can be used for recommendation. Some attributes can also be multi-valued involving overlapping or non-mutually exclusive possible values. One should note that, movies are multi-genres and multi-actors [3]. The values of multi-valued attributes in an item can be represented more accurately within a fuzzy set framework than within a crisp set framework. Items of this type are considered in this research. Moreover, the representation scheme presented for a movie can be generalized and applied to any item with similar characteristics to a movie. A few examples are music, TV shows, restaurants and books. Let an item Ij (j = 1,... ,M) be defined in the space of an attribute X = {x1, x2, x3,...,xL}, then Ij can take multiple values such as x1, x2,..., and xL. The membership function of item Ij to value xk(k = 1, . . . , L), denoted by xk (Ij ), needs to be determined. Hence, a vector Xj = {(xk, xk (Ij )), k = 1,..., L} is formed for Ij , where xk (Ij ) can be interpreted as the degree of similarity of Ij to a hypothetical (or prototype) pure xk type of the item; or as the degree of presence of value xk in item Ij . In a movie marketing application, most movies are selected for pleasure and expenditure of time. Users choose movies they like and enjoy. Furthermore, users use subjective features of movies such as “funny”, “romantic”, and “scary” (all are kinds of movie genres) to select movies more than objective features such as the director, theatre location and price, which are useful but are less important [10]. We use the movie as an item and movie genre as the attribute to make the method operational and develop the heuristic. Analysis of descriptions of main film genres shows that genre g1 of movies (e.g. action) and genre g2 of movies (e.g. adventure) are overlapping in terms of their subject matter and other movie attributes [3]. Based on the result of the findings in [10], movies highly liked by users can be grouped into similar categories by subjective features of movies such as genre and MPAA rating. This assertion is also verified in Section 5 using the movie dataset. In determining the genre content of a movie, we use a heuristic based on the genres’ rank orders, available in IMdb.com and provided by the movie producers, instead of the crisp representation—the genre is presence (1) or absence (0) data value. An ideal technique is automatic content analysis of a movie for automatic identification of its genres. The automatic content analysis of a movie for automatic identification of its genres will provide the extent or degree of presence or amount of a genre in a movie. For example, amount of dramatic feature in a movie A, amount of action features in a movie A, etc. However, automatic content analysis technologies are not yet well developed and available. For example, a preliminary research on the automatic identification of movie genres by exploiting audio-visual cues in a movie is reported in [26]. In the absence of this technique, we conduct the domain analysis based on the literature in movies [3,26] and the available data provided by movies producers in the movie database. Given the definition of a movie in the space of genre (G), a movie can have one major genre denoted by x1 and one or more minor genres x2, x3, and so on, in the decreasing order of their degrees of presence in a movie. The degree of membership of movie Ij (j = 1,... ,M) to genre xk (k = 1,...,N) is denoted by xk (Ij ). Hence, for Ij , a vector Gj = {(xk, xk (Ij )), k = 1,...,N} can be formed. The following steps are taken in the development of the heuristic for the determination of the membership function xk (Ij ). Step 1: Sort xk in descending order of their degree of presence in Ij . In IMDB (www.imdb.com 1 ) the genres of movie Ij are presented in the order of their significance. For example, movie ‘King Kong (2005)’ has Action as a major genre, and Adventure as the first minor, Drama as the second minor, Fantasy as the third minor, and Thriller as the fourth minor genres. Step 2: Assign higher degrees of membership value to more important genres of a movie. For instance, If Ij has only one genre, then xk (Ij ) = 1 and xk (Ij ) = 0 for all k = 2 to N. If Ij has two genres, then xk (Ij ) = 0.9, xk (Ij ) = 0.4 and xk (Ij ) = 0 for all k = 3 to N. If Ij has three genres, then xk (Ij ) = 0.7 and xk (Ij ) = 0.5, xk (Ij ) = 0.2 and xk (Ij ) = 0 for all k = 4 to N; and so on. 1 “IMDb History,” vol. 2004: Internet Movie Database Inc., n.d. http://www.imdb.com/.
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 Membership degree of movie to genres for User 7 Movie (lj) Gi (vector for jth movie) 0.683 1.000 0.000 0.000 4 0.438 0.000 1000 0.000 Based on the heuristics illustrated above, the possibility for item Ij to take different values of X varies and the membership function should meet the following four criteria: (1)assigning higher degree of membership to major values than minor values; (2) assigning O to values that are not associated with the item; (3)degrees of membership should be normalized to the range of [0, 1]; and(4)the same value of X at similar rank positions between different items should have varying degrees of membership values if the number of values of X associated with the items are different. We represent this type of heuristic with a Gaussian-like membership function, as shown in(1). Ax,(Ij)=rk/2v where N=Ilil is the number of values of X associated with Ij and rk(I I is a parameter used as a threshold to control the difference between consecutive values of X in/j. Moreover, is the only parameter that needs to be determined For example, using Eq(1)with a set to 1. 2(after various experimental trials), the movie 'King Kong(2005 0.211),(Thriller,.168)). Furthermore, for User 7, Table 1 shows the representation of some of the rated movies represented in terms of genres: for Lil= 5 and xj=[(Action, 1),(Adventure, 0.366), (Drama, 0. 272),(Fanta Ross et al. [23] stated that a membership function(MF)should match"the intuitively plausible semantic description of imprecise properties of the objects in X. "The reason we make the first important genre completely satisfied is movies are advertised exclusively as having a specific genre such as dram a and comedy; as well users have similar perception to movies. The soundness of the representation and inference method are further investigated in Section 5 using the movie dataset The reciprocal function defined as 1/k for k= l to N is also investigated. The problem with the reciprocal function is it does not consider the total number of different genres that exist in movies leading to the same degree of membership value for the same genre at same rank position with different number of genres. A uniform distribution membership of genres in a movie, assuming a movie with multiple genres has equal degree of genre presence for all occurring genres represented by l or 0, is the baseline crisp set representation. The heuristic that leads to the membership function in(1) is developed based on the analysis of the movie dataset literature on movies [10] and preliminary experimental trials conducted on a. This heuristic, for instance, assumes that two genres will not have equal degree of presence in a movie. That is, two or more genres cannot exist in"equal degree of presence"or"in equal amount"in a movie. For example, let us say a movie X has drama and comedy. The movie cannot have dramatic and comedy nature with exact equal amount, i.e. with exact degree of membership 0.5 each. This is also true in the movies database used in this research. This assumption is logical because a movie cannot have exactly the same"content"of two or more genres. If the amount of content is close then a needs to be tuned Moreover, in future research, studying various membership functions is needed in order to get an optimal fuzzy set based recommender system. The basis of our study is, provided that a membership function that satisfies the four criteria identified above. A is used, the recommendations by FTM will be better than crisp-set based approach. The example, one can use movie actresses/actors as the second attribute. The actors in a movie can be represented in a vco. a representation scheme can be extended to recommender systems based on a combination of multiple attributes. Fe A=a1, a2,..., ak) for K actors. The role or importance of an actor or actress ak in a movie mi can be represented by degree of membership associated with the fuzzy variable 'degree of role or importance. That is, A j=l(ak, Ha ( i) for k= I to k], where pa(lj) can be determined heuristically or using machine learning
A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 81 Table 1 Membership degree of movie to genres for User 7 Movie (Ij ) Gj (vector for j th movie) Rating xj1 xj2 xj3 … xj15 xj16 56 4 0.683 1.000 0.438 … 0.000 0.000 79 5 1.000 0.000 0.000 … 0.467 0.000 89 3 0.683 0.000 0.000 … 0.000 1.000 … … … … … …… … 254 2 0.438 0.000 1.000 … 0.000 0.000 Based on the heuristics illustrated above, the possibility for item Ij to take different values of X varies and the membership function should meet the following four criteria: (1) assigning higher degree of membership to major values than minor values; (2) assigning 0 to values that are not associated with the item; (3) degrees of membership should be normalized to the range of [0, 1]; and (4) the same value of X at similar rank positions between different items should have varying degrees of membership values if the number of values of X associated with the items are different. We represent this type of heuristic with a Gaussian-like membership function, as shown in (1). xk (Ij ) = rk/2 √∗|Lj|(rk−1) , (1) where N = |Lj | is the number of values of X associated with Ij and rk (1rk |Lj |) is the rank position of value xk, and > 1 is a parameter used as a threshold to control the difference between consecutive values of X in Ij . Moreover, is the only parameter that needs to be determined. For example, using Eq. (1) with set to 1.2 (after various experimental trials), the movie ‘King Kong (2005)’ is represented in terms of genres: for |Lj | = 5 and xj = {(Action, 1), (Adventure, 0.366), (Drama, 0.272), (Fantasy, 0.211), (Thriller, 0.168)}. Furthermore, for User 7, Table 1 shows the representation of some of the rated movies. Ross et al. [23] stated that a membership function (MF) should match “the intuitively plausible semantic description of imprecise properties of the objects in X.” The reason we make the first important genre completely satisfied is movies are advertised exclusively as having a specific genre such as drama and comedy; as well users have similar perception to movies. The soundness of the representation and inference method are further investigated in Section 5 using the movie dataset. The reciprocal function defined as 1/k for k = 1 to N is also investigated. The problem with the reciprocal function is it does not consider the total number of different genres that exist in movies leading to the same degree of membership value for the same genre at same rank position with different number of genres. A uniform distribution membership of genres in a movie, assuming a movie with multiple genres has equal degree of genre presence for all occurring genres represented by 1 or 0, is the baseline crisp set representation. The heuristic that leads to the membership function in (1) is developed based on the analysis of the movie dataset, literature on movies [10] and preliminary experimental trials conducted on . This heuristic, for instance, assumes that two genres will not have equal degree of presence in a movie. That is, two or more genres cannot exist in “equal degree of presence” or “in equal amount” in a movie. For example, let us say a movie X has drama and comedy. The movie cannot have dramatic and comedy nature with exact equal ‘amount’, i.e. with exact degree of membership 0.5 each. This is also true in the movies database used in this research. This assumption is logical because a movie cannot have exactly the same “content” of two or more genres. If the amount of content is close then needs to be tuned. Moreover, in future research, studying various membership functions is needed in order to get an optimal fuzzy set based recommender system. The basis of our study is, provided that a membership function that satisfies the four criteria identified above. A is used, the recommendations by FTM will be better than crisp-set based approach. The representation scheme can be extended to recommender systems based on a combination of multiple attributes. For example, one can use movie actresses/actors as the second attribute. The actors in a movie can be represented in a vector A = {a1, a2,...,ak} for K actors. The role or importance of an actor or actress ak in a movie mi can be represented by degree of membership associated with the fuzzy variable ‘degree of role or importance’. That is, Aj = {(ak, ak (Ij )), for k = 1 to K}, where ak (Ij ) can be determined heuristically or using machine learning.
A Zenebe, A F. Norcio/Fuzzy Sets and Systems 160(2009)76-94 3. 2. User feedback representation using fuzzy set .. User rating is the most widely used feedback in recommender systems. It is a proxy variable used for measuring ser degrees of interest in an item. User ratings are represented and interpreted as binary values-those liked or disliked. In five-scale ratings, those above 3 are considered as liked. However, user rating is intrinsically imprecise as a may give different ratings to the same item at different times and situations due to the difficulty to make a distinction between rating 4 and 5, and 1 and 2. Moreover, the same rating, say 4 on a scale of 5, given by two users does not necessarily imply equal degrees of interest in an item. For pessimistic users, a rating of 4 may mean strongly liked but for optimistic raters it may mean somewhat liked Is the difference between ratings 3 and 4 the same as the difference between 4 and 5? These all contribute to fuzziness that arises from the human thinking processes instead of randomness associated with the ratings. Therefore, user interest based on user rating is treated as a fuzzy variable and its uncertainty is represented using a possibility distribution function. Let the fuzzy variable degrees of interest in an item(DI)consisting of strongly liked (SL), liked(L), indifferent 1), disliked(D), and strongly disliked (SD) fuzzy values, and associated with user rating(R)expressed in continuum from minimum value(Min) to maximum value(Max). Then, the proposition'a user has strongly liked an item Ihas the possibility distribution function IR()= HsL(R =r), for r between Min and Max Under this interpretation or semantic of the fuzzy variable Dl, the user rating is represented and inferred using a possibility distribution function by treating the rating as a fuzzy number [22]. For instance a rating 4 on 5 scale which refers to strongly liked is represented in terms of its possibility distribution values =IASL(R=4)/5, HL(R 4)/4, .., HsD(R= 4)/1. For instance, for User 7 in Table 1, the possibility distribution of User 7s rating of High on movie 56 can be expressed as: IASL (R= 4)=0.50, H(R=4)=1,HI(R = 4)=0.50, AD(R=4)= 0.25, ASD(R=4)=0.0 F; and possibility distribution of User 7s rating of Very High on movie 79 can be expressed as:{sL(R=5)=1,(R=5)=0.55,(R=5)=0.20,pD(R=5)=0.0,sD(R=5)=0.0 without losing generality, a half triangular fuzzy number, which is the simplest model of uncertain quantity, is used to represent the degree of positive experience a user has in relation to an item. The half triangular fuzzy number membership function, for user rating r on Ii E [Min, Max] and for a fuzzy set A on DI, is defined as HA(i=(-Min)/(Max- Min) As a result, a set of items liked by a user, denoted by E, is defined as: (i: HA(i)>0.5, i.e., li: r>(Min+ Max)/2) 3.3. Inference engine and algorithm Based on the representation scheme defined for items and user feedback, the inference engine consisting of the recommendation score aggregation methods and the similarity measures is defined below. 3.3. 1. Fuzzy set theoretic similarity measures One of the most important issues in recommender systems research is computing similarity between users, and between items(products, events, services, etc. ) This in turns highly depends on the appropriateness and reliability of the methods of representation. The set-theoretic, proximity-based and logic-based are the three classes of measures of similarity [11]. In fuzzy set and possibility framework, similarity of users or items is computed based on the membership unctions of the fuzzy sets associated to the users or item features. Based on the work of Cross and Sudkamp [11], those similarity measures that are relevant for item recommendation application are adapted. For items 1; and lk that are defined as ((xi, Hx (D)),i=l,., N and ((i, Hx(Ik)),i=l,., N, a similarity measure between lj and lk is denoted by S(Ik, Ij), and the different similarity measures are defined n(x,(k),px2() SI(k, ID)= (Fx1(k),Px(1)) S2(k,l)= ∑Hx2()*x1() (△(2(4)2)√(∑(ax()2)
82 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 3.2. User feedback representation using fuzzy set User rating is the most widely used feedback in recommender systems. It is a proxy variable used for measuring user degrees of interest in an item. User ratings are represented and interpreted as binary values-those liked or disliked. In five-scale ratings, those above 3 are considered as liked. However, user rating is intrinsically imprecise as a user may give different ratings to the same item at different times and situations due to the difficulty to make a distinction between rating 4 and 5, and 1 and 2. Moreover, the same rating, say 4 on a scale of 5, given by two users does not necessarily imply equal degrees of interest in an item. For pessimistic users, a rating of 4 may mean strongly liked but for optimistic raters it may mean somewhat liked. Is the difference between ratings 3 and 4 the same as the difference between 4 and 5? These all contribute to fuzziness that arises from the human thinking processes instead of randomness associated with the ratings. Therefore, user interest based on user rating is treated as a fuzzy variable and its uncertainty is represented using a possibility distribution function. Let the fuzzy variable degrees of interest in an item (DI) consisting of strongly liked (SL), liked (L), indifferent (I ), disliked (D), and strongly disliked (SD) fuzzy values, and associated with user rating (R) expressed in continuum from minimum value (Min) to maximum value (Max). Then, the proposition ‘a user has strongly liked an item I’ has the possibility distribution function R(I ) = SL(R = r), for r between Min and Max. Under this interpretation or semantic of the fuzzy variable DI, the user rating is represented and inferred using a possibility distribution function by treating the rating as a fuzzy number [22]. For instance a rating 4 on 5 scale which refers to strongly liked is represented in terms of its possibility distribution values = {SL(R = 4)/5, L(R = 4)/4,..., SD(R = 4)/1}. For instance, for User 7 in Table 1, the possibility distribution of User 7’s rating of High on movie 56 can be expressed as: {SL(R = 4) = 0.50, L(R = 4) = 1, I(R = 4) = 0.50, D(R = 4) = 0.25, SD(R = 4) = 0.0}; and possibility distribution of User 7’s rating of Very High on movie 79 can be expressed as: {SL(R = 5) = 1, L(R = 5) = 0.55, I(R = 5) = 0.20, D(R = 5) = 0.0, SD(R = 5) = 0.0}. Without losing generality, a half triangular fuzzy number, which is the simplest model of uncertain quantity, is used to represent the degree of positive experience a user has in relation to an item. The half triangular fuzzy number membership function, for user rating r on Ii ∈ [Min, Max] and for a fuzzy set A on DI, is defined as: A(Ii) = (r − Min)/(Max − Min). (2) As a result, a set of items liked by a user, denoted by E, is defined as: {Ii: A(Ii) > 0.5, i.e., Ii:r>(Min+Max)/2}. 3.3. Inference engine and algorithm Based on the representation scheme defined for items and user feedback, the inference engine consisting of the recommendation score aggregation methods and the similarity measures is defined below. 3.3.1. Fuzzy set theoretic similarity measures One of the most important issues in recommender systems research is computing similarity between users, and between items (products, events, services, etc.). This in turns highly depends on the appropriateness and reliability of the methods of representation. The set-theoretic, proximity-based and logic-based are the three classes of measures of similarity [11]. In fuzzy set and possibility framework, similarity of users or items is computed based on the membership functions of the fuzzy sets associated to the users or item features. Based on the work of Cross and Sudkamp [11], those similarity measures that are relevant for item recommendation application are adapted. For items Ij and Ik that are defined as {(xi, xi (Ij )), i = 1,...,N} and {(xi, xi (Ik)), i = 1,...,N}, a similarity measure between Ij and Ik is denoted by S(Ik, Ij ), and the different similarity measures are defined as S1(Ik, Ij ) = i min(xi (Ik), xi (Ij )) i max(xi (Ik), xi (Ij )), (3) S2(Ik, Ij ) = i xi (Ik) ∗ xi (Ij ) (( i (xi (Ik))2)) (( i (xi (Ij )2)) , (4)
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 d2(k,In S3(k,l;)=1 Maxlux, (Ik),ur (j)l 2 S4(k,1)= d(lk, Ij where d)=∑(4)-1(1)2-md=∑2*1()-12ra=komj These fuzzy set based similarity measures are: fuzzy set theoretic in (3), cosine in(4), proximity(Minkowski's distance based) in(5), and correlation-like in(6). The similarity measure for the baseline method, where item feature (e.g. genre)and user feedback(e.g rating) are represented as crisp sets is called the crisp set based similarity measure (CSM) and defined as ∑;min(gx2(k),Px1(l) Ss(k,Ij) i max (ux (),ur, iD)) In(7)u (lk)is l if the feature value x; exists in Ii, and assigned to O otherwise. Eqs. (3)and (7)are similar in all spects, except for the difference in the representation schema used. For example, Table 2 shows movies 11=(Copycat 1995: Crime/Mystery /Thriller/Drama), and 12=( Grudge 2004 Horror/Thriller/Mystery) with their crisp(I1 and 12)and fuzzy(G and G2)representations Using crisp set theoretic (7), crisp cosine(similar to (4)except that l or 0 are used instead of membership degree) and crisp distance measures (similar to(5)except that l or 0 are used instead of membership degree) the similarity between /1 and 12 are 0.40, 0.58, and 0.73, respectively. Finally, using the corresponding fuzzy set theoretic similarity measures(3), (4), and (5)the similarity coefficients are 0.24, 0.25, and 0.55. The difference in the representation leads to the variation in similarity measures. The effects of these differences in similarity measures on recommender systems'accuracy are assessed in 3.3.2. Aggregation methods The recommendation decision-making methods used in ftm are case-based decision there are different recom- mendation score aggregation methods for computing and combining the various contributions to recommendation confidence scores in order to make recommendation decision within the framework of fuzzy and possibility theory. The two alternatives that consider both user feedback and similarity between previously liked items and a new item are weighted-sum and max-minimum. They are defined as follows Weighted-sum: For each targeted item], calculate the weighted sum(weighted-sum) recommendation confidence score as R1()=∑HE()S(4,1) where E is a set of positively experienced (liked) items by users, and be(Ik) is the membership of the item /k to the set E. Furthermore, S(k, Ij) is the similarity between /j and lk computed using the similarity measures defined bove. A normalized evaluation score for each Iis recommendation confidence score is obtained using NRI(n) RI(j)/max[Ri(k) Maximum-minimum: For each targeted item Ij, calculate the maximum-minimum(max-min) recommendation con R2(lj)=max(min(S(j, Ik), HE(Ik))
A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 83 S3(Ik, Ij ) = 1 − ⎛ ⎜ ⎜ ⎝ d2(Ik, Ij ) Max i {xi (Ik), xi (Ij )} ⎞ ⎟ ⎟ ⎠ , (5) S4(Ik, Ij ) = 1 − 2 ZIk + ZIj d2(Ik, Ij ) 2, (6) where d2(Ik, Ij ) = i (xi (Ik) − xi (Ij ))2 and ZIa = i (2 ∗ xi (I a) − 1) 2 for a = k or j. These fuzzy set based similarity measures are: fuzzy set theoretic in (3), cosine in (4), proximity (Minkowski’s distance based) in (5), and correlation-like in (6). The similarity measure for the baseline method, where item feature (e.g. genre) and user feedback (e.g. rating) are represented as crisp sets is called the crisp set based similarity measure (CSM) and defined as S5(Ik, Ij ) = i min(xi (Ik), xi (Ij )) i max(xi (Ik), xi (Ij )). (7) In (7) xi (Ik) is 1 if the feature value xi exists in Ii, and assigned to 0 otherwise. Eqs. (3) and (7) are similar in all aspects, except for the difference in the representation schema used. For example, Table 2 shows movies I1 = {Copycat 1995: Crime/Mystery/Thriller/Drama}, and I2 = {Grudge 2004: Horror/Thriller/Mystery} with their crisp (I1 and I2) and fuzzy (G1 and G2) representations. Using crisp set theoretic (7), crisp cosine (similar to (4) except that 1 or 0 are used instead of membership degree) and crisp distance measures (similar to (5) except that 1 or 0 are used instead of membership degree) the similarity between I1 and I2 are 0.40, 0.58, and 0.73, respectively. Finally, using the corresponding fuzzy set theoretic similarity measures (3), (4), and (5) the similarity coefficients are 0.24, 0.25, and 0.55. The difference in the representation leads to the variation in similarity measures. The effects of these differences in similarity measures on recommender systems’ accuracy are assessed in Section 5. 3.3.2. Aggregation methods The recommendation decision-making methods used in FTM are case-based decision. There are different recommendation score aggregation methods for computing and combining the various contributions to recommendation confidence scores in order to make recommendation decision within the framework of fuzzy and possibility theory. The two alternatives that consider both user feedback and similarity between previously liked items and a new item are weighted-sum and max–minimum. They are defined as follows. Weighted-sum: For each targeted item Ij , calculate the weighted sum (weighted-sum) recommendation confidence score as R1(Ij ) = k E(I k)S(Ik, Ij ), (8) where E is a set of positively experienced (liked) items by users, and E(Ik) is the membership of the item Ik to the set E. Furthermore, S(Ik, Ij ) is the similarity between Ij and Ik computed using the similarity measures defined above. A normalized evaluation score for each Ij ’s recommendation confidence score is obtained using NR1(Ij ) = R1(Ij )/ max k [R1(Ik)] Maximum–minimum: For each targeted item Ij , calculate the maximum–minimum (max–min) recommendation confidence score as R2(Ij ) = max k {min(S(Ij , Ik), E(Ik))}, (9)
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 Table 2 Movies representation in space of genres Crime Horror Mystery Thriller Drama 0.41 0.47 E.Furthermore,S(, i) is the similarity between I; and Ik computed using one of the similarity measures defindlfi where E is a set of positively experienced (liked)items by users, and FE(Ik)is the membership of the item Ik to the The resulting recommendation confidence score RK(j), either from( 8)or(9), is used to determine the Top-n items for recommendation and can be consider as a degree of support to recommend Ij. A normalized evaluation score obtained by dividing each Ii's recommendation confidence score by the maximum of recommendation confidence RI takes into account all participating elements hence it has a compensatory effects between highly liked and just liked items in E as well as most similar and least similar items. R2 is high as soon as there is at least one good rated item similar to the new item. R2 is similar to Eq (13)in Calmes et al. [9] and Eq (4)in Dubois et al. [14]. There are several variants of aggregator operators, and two common operators are R3(j)=min (max(I-S(j, Ik), HE(k)). R3 is defined in Dubois et al. [13], and is high when all items are similar to the new one. Hence, R2 is considered to be the optimistic counterpart to R3. R4 is just a variant of R2 where product operator is used instead of the min for the and'operator, and it is high in the same circumstances as R2 In future works, the aggregation operators R3, Ra and other suitable operators will be identified and considered. Furthermore, future works will consider linguistic quantifiers 3.3.3. Implementation of algorithms First, the algorithm uses items for which the user has expressed a degree of interest or preference; and then the algo- rithm finds similar items to those that interest the user. Using the defined representation schemes, the various similarity measures and aggregation strategies, various algorithms are designed and implemented. Simulation implementations of the algorithms and simulation runs are performed on an Intel Pentium(r)4, 2.66 GHz of speed, 512 MB of and running with Windows operating systems. The Java 2 programming language and Microsoft Access 2002 are used as software tools 4. Evaluations The performance of the proposed method and its algorithms are evaluated using movies as items. The movie dataset, xperimental setup, simulation runs and evaluation metrics are presented below. 4.. Dataset The benchmark dataset from Movie Lens at the University of Minnesota, which has been widely used in recom- mendation research, is used in this study. The dataset includes movie attributes, user ratings, and user demographi 2(http:/imovielens.umn.edu)
84 A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 Table 2 Movies representation in space of genres Crime Horror Mystery Thriller Drama I1 10 1 1 1 G1 1 0 0.44 0.35 0.29 I2 01 1 1 0 G2 0 1 0.41 0.47 0 where E is a set of positively experienced (liked) items by users, and E(Ik) is the membership of the item Ik to the set E. Furthermore, S(Ik, Ij ) is the similarity between Ij and Ik computed using one of the similarity measures defined in Eqs. (3)–(7). The resulting recommendation confidence score Rk(Ij ), either from (8) or (9), is used to determine the Top-n items for recommendation and can be consider as a degree of support to recommend Ij . A normalized evaluation score is obtained by dividing each Ij ’s recommendation confidence score by the maximum of recommendation confidence scores. R1 takes into account all participating elements hence it has a compensatory effects between highly liked and just liked items in E as well as most similar and least similar items. R2 is high as soon as there is at least one good rated item similar to the new item. R2 is similar to Eq. (13) in Calmès et al. [9] and Eq. (4) in Dubois et al. [14]. There are several variants of aggregator operators, and two common operators are: R3(Ij ) = min k {max((1 − S(Ij , Ik), E(Ik))}, R4(Ij ) = max k (S(Ij , Ik), E(Ik)) . R3 is defined in Dubois et al. [13], and is high when all items are similar to the new one. Hence, R2 is considered to be the optimistic counterpart to R3. R4 is just a variant of R2 where product operator is used instead of the min for the ‘and’ operator, and it is high in the same circumstances as R2. In future works, the aggregation operators R3, R4 and other suitable operators will be identified and considered. Furthermore, future works will consider linguistic quantifiers as presented in Yager [28]. 3.3.3. Implementation of algorithms First, the algorithm uses items for which the user has expressed a degree of interest or preference; and then the algorithm finds similar items to those that interest the user. Using the defined representation schemes, the various similarity measures and aggregation strategies, various algorithms are designed and implemented. Simulation implementations of the algorithms and simulation runs are performed on an Intel Pentium (R) 4, 2.66 GHz of speed, 512 MB of memory, and running with Windows operating systems. The Java 2 programming language and Microsoft Access 2002 are used as software tools. 4. Evaluations The performance of the proposed method and its algorithms are evaluated using movies as items. The movie dataset, experimental setup, simulation runs and evaluation metrics are presented below. 4.1. Dataset The benchmark dataset from MovieLens at the University of Minnesota, 2 which has been widely used in recommendation research, is used in this study. The dataset includes movie attributes, user ratings, and user demographic 2 http://movielens.umn.edu
A. Zenebe, A F. Norcio/ Fuzzy Sets and Systems 160(2009)76-94 information. The dataset consists of 100,000 ratings(1-5)from 943 users on 1682 movies; and each user has rated at least 20 and at most 737 movies. In the dataset. movies are described with: movie id. movie title. release date video release date, IMDb URL, and 20 genres including action, adventure, animation, childrens, comedy, crime documentary, drama, fantasy, film-noir, horror, musical, mystery, romance, sci-fi, thriller, war, western, family, and Genres in the MovieLens dataset are represented with binary values, which do not reflect the true content of movies in the genre space. Therefore, we use the proposed representation scheme by incorporating information about movie genres retrieved from the Internet Movie Database(imdb. com), which is a large database consisting of comprehensive nformation about past, present and upcoming movies. In IMDb, the producers of a movie present genres in the movie in order of their degree of presence, and this information is used in Eq. (1) 4. 2. Evaluation metrics Accuracy is a commonly used metric for a recommender system based on user tasks or goals [18]. The accuracy metrics includes predictive and recommendation accuracy measures. Predictive accuracy measures such as mean absolute error, mean squared error and percentage of correct predictions are found to be less appropriate when the user task is to find 'good items and when the granularity of true value is small because predicting a 4 as 5 or a 3 as 2 makes no difference to the user [18]. Instead, the recommendation accuracy metrics of precision, recall and Fl-measure are more appropriate [18]. The precision and recall are computed using movies for which ratings are provided and held for testing; and movies with ratings 4 and 5 are considered as movies liked by users. This approach of measuring performance is widely used in recommender systems research, e. g [12, 18, 21] Precision measures the ratio of correct recommendations being made. Recall reflects the coverage or hit rate of commendations. Fl-measure =(2* precision recall)/(precision +recall) is a single metric that combines precision and recall 43. Simulation run and experimental design Simulation recommender systems for movies are designed and developed to assess the effectiveness of the proposed ethod The simulation system works as follows (1)For each user, it randomly splits the movie ratings dataset into a training set and a test set. (2)Using the training set, it computes recommendation confidence score for each item in the test set using the different similarity measures and aggregation strategies (3) For each user, using the movies in the testing set, it generates Top-N recommendations and computes the precision, recall and Fl measure (4) Using different random selection of the movies into testing and training sets, 10 different runs are executed to avoid sensitivity to sampling bias, and the mean results are reported The simulation system runs for 100 randomly selected users from 943 users available in the dataset. Each user has rated between 20 and 737 movies. During the simulation runs training size(model size)of 5, 10, 15, 20, 25, and 30 along with the corresponding test size of total number of rated movies minus training size are used. Using different random partition of the movies rated by a user into testing and training sets, 10 different runs are executed to avoid sensitivity to sampling bias. Moreover, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 are used as values of the variable number of item to be recommended (recommendation size ). Precision, recall, and Fl-measure are the dependent variables. Table 3 presents the independent variables along with the possible values. On each of the dependent variables, a 2 x 5 x 6x 12 factorial analysis [4] is conducted. In this design the fixed factors are aggregation method with two categories, similarity measure with five categories, training size with six categories and recommendation size with 12 categories. The effect of testing size is also studied as a covariate variable. All statistical tests are performed at 5%o level Moreover, (i)using different random selection of the movies into testing and training sets 10 different runs are executed to avoid sensitivity to sampling bias, and average results are reported; (ii) 100 users and related records are selected randomly; (iii)normalization is applied on the recommendation scores in all cases
A. Zenebe, A.F. Norcio / Fuzzy Sets and Systems 160 (2009) 76–94 85 information. The dataset consists of 100,000 ratings (1–5) from 943 users on 1682 movies; and each user has rated at least 20 and at most 737 movies. In the dataset, movies are described with: movie id, movie title, release date, video release date, IMDb URL, and 20 genres including action, adventure, animation, children’s, comedy, crime, documentary, drama, fantasy, film-noir, horror, musical, mystery, romance, sci-fi, thriller, war, western, family, and unknown. Genres in the MovieLens dataset are represented with binary values, which do not reflect the true content of movies in the genre space. Therefore, we use the proposed representation scheme by incorporating information about movie genres retrieved from the Internet Movie Database (imdb.com), which is a large database consisting of comprehensive information about past, present and upcoming movies. In IMDb, the producers of a movie present genres in the movie in order of their degree of presence, and this information is used in Eq. (1). 4.2. Evaluation metrics Accuracy is a commonly used metric for a recommender system based on user tasks or goals [18]. The accuracy metrics includes predictive and recommendation accuracy measures. Predictive accuracy measures such as mean absolute error, mean squared error and percentage of correct predictions are found to be less appropriate when the user task is to find ‘good’ items and when the granularity of true value is small because predicting a 4 as 5 or a 3 as 2 makes no difference to the user [18]. Instead, the recommendation accuracy metrics of precision, recall and F1-measure are more appropriate [18]. The precision and recall are computed using movies for which ratings are provided and held for testing; and movies with ratings 4 and 5 are considered as movies liked by users. This approach of measuring performance is widely used in recommender systems research, e.g. [12,18,21]. Precision measures the ratio of correct recommendations being made. Recall reflects the coverage or hit rate of recommendations. F1-measure = (2∗precision ∗recall)/(precision+recall) is a single metric that combines precision and recall. 4.3. Simulation run and experimental design Simulation recommender systems for movies are designed and developed to assess the effectiveness of the proposed method. The simulation system works as follows: (1) For each user, it randomly splits the movie ratings dataset into a training set and a test set. (2) Using the training set, it computes recommendation confidence score for each item in the test set using the different similarity measures and aggregation strategies. (3) For each user, using the movies in the testing set, it generates Top-N recommendations and computes the precision, recall, and F1 measure. (4) Using different random selection of the movies into testing and training sets, 10 different runs are executed to avoid sensitivity to sampling bias, and the mean results are reported. The simulation system runs for 100 randomly selected users from 943 users available in the dataset. Each user has rated between 20 and 737 movies. During the simulation runs training size (model size) of 5, 10, 15, 20, 25, and 30 along with the corresponding test size of total number of rated movies minus training size are used. Using different random partition of the movies rated by a user into testing and training sets, 10 different runs are executed to avoid sensitivity to sampling bias. Moreover, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 are used as values of the variable number of item to be recommended (recommendation size). Precision, recall, and F1-measure are the dependent variables. Table 3 presents the independent variables along with the possible values. On each of the dependent variables, a 2 × 5 × 6 × 12 factorial analysis [4] is conducted. In this design the fixed factors are aggregation method with two categories, similarity measure with five categories, training size with six categories and recommendation size with 12 categories. The effect of testing size is also studied as a covariate variable. All statistical tests are performed at 5% level. Moreover, (i) using different random selection of the movies into testing and training sets 10 different runs are executed to avoid sensitivity to sampling bias, and average results are reported; (ii) 100 users and related records are selected randomly; (iii) normalization is applied on the recommendation scores in all cases