Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Genetic Algorithms for Feature Weighting in multi-criteria Recommender Systems Chein-Shung hwang Dept of Information Managemen Chinese culture university aipei, Taiwan hwang@faculty. pccu. edu dor: 10.4156/jcit vol5. issue813 Abstract Recommender systems have been emerging as a powerful technique of e-commerce. The majority of existing recommender systems uses an overall rating value on items for evaluating user's preference opinions. Because users might express their opinions based on some specific features of the item, recommender systems solely based on a single criterion could produce recommendations that do not meet user needs. In this paper, we propose a mechanism for integrating multiple criteria into th Collaborative Filtering (CF) algorithm. Specifically, we present the implementation of Genetic Algorithms (GA) for optimal feature weighting. The proposed system consists of two main parts. First, the weight of each user toward each feature is computed by using GAs. The feature weights are then incorporated into the collaborative filtering process to provide recommendations. Empirical studies have shown that our weighting scheme can be incorporated to improve the performance of multi criteria cF Keywords: Genetic Algorithms, Collaborative Filtering, Multiple Criteria, Recommender Systems 1 Introduction Although the rapid development and popularity of the Internet have brought convenience for people to access a variety of information, products, and services, the internet has also caused a scenario where people have difficulty obtaining relevant information. This scenario is commonly referred to as the problem of"information overload". Recommender systems [1, 2] are emergent to solve the information overload challenges. Recommender systems are personalized information filtering technologies used to suggest items to users that they might like or find interesting. In recent years, recommender systems ave been successfully applied in a broad range of applications, including recommending movies books. news articles. music. etc. Recommender systems can be based on content-based filtering, collaborative filtering, and hybrid Itering [3, 4]. Content-based recommendation suggests items to users that are similar to those items that they were interested in previously. Collaborative filtering( CF)recommends items based on the information about similar items or users. Hybrid recommendation combines both algorithms into one hybrid approach to gain better performance and avoid drawbacks from each individual one p The majority of existing recommender systems uses an overall rating value on items for evaluating s'preferences. The overall rating depends on one single criterion that usually represents the overall preference of users to a particular item. As users might express their opinions based on some specific features of the item, recommender systems merely based on a single criterion could produce recommendations that do not meet user needs. For example, in a movie recommender system, two sers A and B both assign a single-criterion rating of 12(out of 13)for Avatar. The recommender systems will conclude they have the same tastes even if A likes its story and B likes its visuals. This is called a"without distinction of interest problem. Furthermore, even if both users like the same movi features(e.g. actors, visuals, etc. ) they might select different movies. This is because people usually select a movie based on different movie features This situation is referred to as an"unsuitable weight feature" problem [5] Many recommender systems [6-8 have been developed to tackle the above-mentioned problems by using multiple criteria. Roux et al. [6] described a course recommender system
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Dept. of Information Management Chinese Culture University Taipei, Taiwan cshwang@faculty.pccu.edu.tw doi: 10.4156/jcit.vol5.issue8.13 Abstract Recommender systems have been emerging as a powerful technique of e-commerce. The majority of existing recommender systems uses an overall rating value on items for evaluating user’s preference opinions. Because users might express their opinions based on some specific features of the item, recommender systems solely based on a single criterion could produce recommendations that do not meet user needs. In this paper, we propose a mechanism for integrating multiple criteria into the Collaborative Filtering (CF) algorithm. Specifically, we present the implementation of Genetic Algorithms (GA) for optimal feature weighting. The proposed system consists of two main parts. First, the weight of each user toward each feature is computed by using GAs. The feature weights are then incorporated into the collaborative filtering process to provide recommendations. Empirical studies have shown that our weighting scheme can be incorporated to improve the performance of multicriteria CF. Keywords: Genetic Algorithms, Collaborative Filtering, Multiple Criteria, Recommender Systems 1. Introduction Although the rapid development and popularity of the Internet have brought convenience for people to access a variety of information, products, and services, the internet has also caused a scenario where people have difficulty obtaining relevant information. This scenario is commonly referred to as the problem of “information overload”. Recommender systems [1, 2] are emergent to solve the information overload challenges. Recommender systems are personalized information filtering technologies used to suggest items to users that they might like or find interesting. In recent years, recommender systems have been successfully applied in a broad range of applications, including recommending movies, books, news articles, music, etc. Recommender systems can be based on content-based filtering, collaborative filtering, and hybrid filtering [3, 4]. Content-based recommendation suggests items to users that are similar to those items that they were interested in previously. Collaborative filtering (CF) recommends items based on the information about similar items or users. Hybrid recommendation combines both algorithms into one hybrid approach to gain better performance and avoid drawbacks from each individual one. The majority of existing recommender systems uses an overall rating value on items for evaluating users’ preferences. The overall rating depends on one single criterion that usually represents the overall preference of users to a particular item. As users might express their opinions based on some specific features of the item, recommender systems merely based on a single criterion could produce recommendations that do not meet user needs. For example, in a movie recommender system, two users A and B both assign a single-criterion rating of 12 (out of 13) for Avatar. The recommender systems will conclude they have the same tastes even if A likes its story and B likes its visuals. This is called a “without distinction of interest” problem. Furthermore, even if both users like the same movie features (e.g. actors, visuals, etc.), they might select different movies. This is because people usually select a movie based on different movie features. This situation is referred to as an “unsuitable weight feature” problem [5]. Many recommender systems [6-8] have been developed to tackle the above-mentioned problems by using multiple criteria. Roux et al. [6] described a course recommender system 126
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 based on the multiple criteria decision-making model and the CF method. In this system, for each decision maker, a user-specified set of feature weights was required in order to aggregate measures from each feature. Teng et al. [7 viewed the multi-criteria recommendation problem s a data query problem and utilized the skyline query to eliminate conflicting criteria. Since the recommendation problem was no longer viewed as an optimization problem, the feature weights were not taken into account. UTA-Rec [8] was a utility-based recommender system that worked by employing the preference disaggregation principle. This system implemented the UTA*[91 algorithm to model a user's preference in terms of a set of additive utility functions based on multiple criteria. The criteria weights could be represented by the trade-off among the criteria utility values. However, performance might have been affected when there were only a very small number of ratings available for the active user Genetic algorithms (GAs) are adaptive algorithms based on the Darwinian principle of natural selection and have been successfully applied to a wide range of optimization problems including design, scheduling, routing, control, etc. In this paper, we propose a multi-criteria recommender ystem, which uses GAs for feature weighting. The proposed system consists of three modules. In the MCP (Multi-Criteria Prediction Module), the user-based CF algorithm is used to estimate the ratings for each individual criterion In the AG(Aggregation Module, the ga is used to find an optimal set of feature weights for each active user. Finally, in the rC(Recommendation Module), a list of recommendations is derived and presented. This optimal weighting scheme enables the recommender system to make more accurate predictions of users' likes and dislikes, and hence will provide better recommendations to users 2. Genetic algorithms GAs [1oI[ll are stochastic search techniques that guide a population of solutions using the principles of evolution and natural genetics. Extensive research has been performed exploiting the robust properties of genetic algorithms and demonstrating their capabilities across a broad spectrum of optimization problems, including feature selection and weighting tasks. GAs are modeled loosely on the principles of the evolution via natural selection, employing a population of individuals that undergo selection in the presence of variation-inducing operators, such as mutation and crossover. A fitness function is used to evaluate individuals, and reproductive success varies with fitness A general algorithm is started with a set of solutions(represented by chromosomes) called population. An initial population is created from a random selection of solutions. At every evolutionary step(generation), the solutions in the current population are evaluated according to some predefined quality criterion, referred to as the fitness or fitness function. Solutions from one population are used to form a new population(next generation). Solutions(parents) are selected according to their fitness-the more suitable they are the more chances they have to reproduce. These solutions then"reproduce"to create one or more new solution (offspring), after which the offspring are randomly produced by crossover or mutation. The process of fitness-dependent selection and application of genetic operators to generate successive generations of solutions is repeated many times until a satisfactory solution found. The basic steps of GAs are outlined as follows 1. [Initialization Randomly generate an initial population of solutions and evaluate the fitness function mplete. On] Create a new population by repeating the following steps until the new population 2. 1 [ Selection Select two parent solutions from a population according to their fitness(the better itness, the greater the chance to be selected) 2.2 [ Crossover] With a crossover probability cross over the parents to form a new offspring. If no crossover is performed, the offspring is an exact copy of the parents 3 Mutation With a mutation probability, mutate new offspring at each locus(position in the chromosome) 2.4 [Acceptance] Place new offspring in a new populatic Evaluation] Compute the fitness values for the new population of N solutions 4. [Test] If the stopping criterion is met, stop, and return the best solution in current popula
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 based on the multiple criteria decision-making model and the CF method. In this system, for each decision maker, a user-specified set of feature weights was required in order to aggregate measures from each feature. Teng et al. [7] viewed the multi-criteria recommendation problem as a data query problem and utilized the skyline query to eliminate conflicting criteria. Since the recommendation problem was no longer viewed as an optimization problem, the feature weights were not taken into account. UTA-Rec [8] was a utility-based recommender system that worked by employing the preference disaggregation principle. This system implemented the UTA* [9] algorithm to model a user’s preference in terms of a set of additive utility functions based on multiple criteria. The criteria weights could be represented by the trade-off among the criteria utility values. However, performance might have been affected when there were only a very small number of ratings available for the active user. Genetic algorithms (GAs) are adaptive algorithms based on the Darwinian principle of natural selection and have been successfully applied to a wide range of optimization problems including design, scheduling, routing, control, etc. In this paper, we propose a multi-criteria recommender system, which uses GAs for feature weighting. The proposed system consists of three modules. In the MCP (Multi-Criteria Prediction Module), the user-based CF algorithm is used to estimate the ratings for each individual criterion. In the AG (Aggregation Module), the GA is used to find an optimal set of feature weights for each active user. Finally, in the RC (Recommendation Module), a list of recommendations is derived and presented. This optimal weighting scheme enables the recommender system to make more accurate predictions of users' likes and dislikes, and hence will provide better recommendations to users. 2. Genetic algorithms GAs [10][11] are stochastic search techniques that guide a population of solutions using the principles of evolution and natural genetics. Extensive research has been performed exploiting the robust properties of genetic algorithms and demonstrating their capabilities across a broad spectrum of optimization problems, including feature selection and weighting tasks. GAs are modeled loosely on the principles of the evolution via natural selection, employing a population of individuals that undergo selection in the presence of variation-inducing operators, such as mutation and crossover. A fitness function is used to evaluate individuals, and reproductive success varies with fitness. A general algorithm is started with a set of solutions (represented by chromosomes) called population. An initial population is created from a random selection of solutions. At every evolutionary step (generation), the solutions in the current population are evaluated according to some predefined quality criterion, referred to as the fitness or fitness function. Solutions from one population are used to form a new population (next generation). Solutions (parents) are selected according to their fitness - the more suitable they are the more chances they have to reproduce. These solutions then "reproduce" to create one or more new solution (offspring), after which the offspring are randomly produced by crossover or mutation. The process of fitness-dependent selection and application of genetic operators to generate successive generations of solutions is repeated many times until a satisfactory solution is found. The basic steps of GAs are outlined as follows: 1. [Initialization] Randomly generate an initial population of solutions and evaluate the fitness function. 2. [New population] Create a new population by repeating the following steps until the new population is complete. 2.1 [Selection] Select two parent solutions from a population according to their fitness (the better fitness, the greater the chance to be selected). 2.2 [Crossover] With a crossover probability cross over the parents to form a new offspring. If no crossover is performed, the offspring is an exact copy of the parents. 2.3 [Mutation] With a mutation probability, mutate new offspring at each locus (position in the chromosome). 2.4 [Acceptance] Place new offspring in a new population. 3. [Evaluation] Compute the fitness values for the new population of N solutions. 4. [Test] If the stopping criterion is met, stop, and return the best solution in current population. 127
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung hwang 5. Loop] Go to step 2 3. Traditional collaborative filtering CF recommends items based on the historical ratings data of similar users. There are two major approaches for CF: user-based (also called memory-based)and model-based. User-based CF identifies users whose interests are similar to an active user and recommends items they like. Model-based CF infers a compact model and then uses it for recommendations. Here we describe the user-based CF In a typical CF scenario, there is a list of n users U=( uI, u2,..., u), a list of m items trix of ratings R=Ir], where r, represents the rating of user u, for item i, CF systems usually consist of two phases: neighborhood formation and prediction computation. In neighborhood formation, the similarity between the active user ua and other users u, is computed. There are two common methods to compute the similarity between users: Pearson correlation and cosine similarity. Pearson correlation measures the extent to which two variables linearly relate with each other. The Pearson correlation between users u and u. is defined as ∑(a-F)(-7) ∑(-元)2∑(-元) where /(a, r)represents the items that both user u, and u, have rated and ra and Fr represent the average rating of users u, and u,, respectively. The cosine similarity views the rating of each user as a multidimensional vector and computes the cosine value between the two vectors i∈(a,r) Once the similarities between active user u, and other users u, are computed, a subset of the nearest neighbors of u, are chosen based on the similarity measure, and the ratings of neighbors are ggregated to generate predictions for the active user. We consider two popular ways to average the ng. The first is the weighted sum approach Pa.j and the second is the adjusted weighted sum approach r∈N(a) I sima I where Pa. represents the prediction for the active user u, for item i, and M(a) represents the set of
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang 5. [Loop] Go to step 2. 3. Traditional collaborative filtering CF recommends items based on the historical ratings data of similar users. There are two major approaches for CF: user-based (also called memory-based) and model-based. User-based CF identifies users whose interests are similar to an active user and recommends items they like. Model-based CF infers a compact model and then uses it for recommendations. Here we describe the user-based CF approach [12]. In a typical CF scenario, there is a list of n users U={ uuu n ,,, 21 }, a list of m items I={ m ,,, iii 21 }, and a matrix of ratings R=[ ij r ] , where ij r represents the rating of user ui for item j i . CF systems usually consist of two phases: neighborhood formation and prediction computation. In neighborhood formation, the similarity between the active user a u and other users r u is computed. There are two common methods to compute the similarity between users: Pearson correlation and cosine similarity. Pearson correlation measures the extent to which two variables linearly relate with each other. The Pearson correlation between users ua and ur is defined as ∑∑ ∑ ∈ ∈ ∈ − − −− = ),( , ),( , ),( , , , )()( ( )( ) raIi rir raIi aia raIi riraia ra rrrr rrrr sim 2 2 (1) where raI ),( represents the items that both user ua and ur have rated and ar and rr represent the average rating of users ua and ur , respectively. The cosine similarity views the rating of each user as a multidimensional vector and computes the cosine value between the two vectors. ∑∑ ∑ ∈ ∈ ∈ × = ),( , ),( , ),( ,, , raIi ir raIi ia raIi iria ra rr rr sim 2 2 (2) Once the similarities between active user ua and other users ur are computed, a subset of the nearest neighbors of ua are chosen based on the similarity measure, and the ratings of neighbors are aggregated to generate predictions for the active user. We consider two popular ways to average the rating. The first is the weighted sum approach: ∑ ∑ ∈ ∈ × = )( , )( ,, , || aNr ra aNr jrra ja sim rsim p (3) and the second is the adjusted weighted sum approach: ∑ ∑ ∈ ∈ −× += )( , )( ,, , || )( aNr ra aNr rjrra aja sim rrsim rp (4) where p , ja represents the prediction for the active user ua for item j i and N(a) represents the set of 128
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 nearest neighbors of u 4. Multi-criteria collaborative filtering The CF technique has been successfully applied to solve the recommendation problem with single riterion. When the problem is extended to be multiple conflicting criteria, new techniques are needed in order to effectively incorporate the multi-criteria rating information into the recommendation process. Particularly, in this section describe how to incorporate the multi-criteria rating information into the CF process following the aggregation function-based approach proposed in [13] Aggregation Module(AG), and the Recommendation Model(RC)as shown in Figure dule(MP), the The proposed system consists of three main modules: the Multi-Criteria Prediction Mo The overall system can be viewed as a blackbox which takes, as input, the ratings matrix, and produces, as output, a recommendation list. The ratings matrix R contains the rating set ru drg, i,,..,"y) standing for the ratings of the user u, for item iy, where rg represents the overall rating value and i represents the rating value for criterion k. The MP module uses the single-criterion CF algorithm to produce the recommendation list for each individual criterion. The AG module computes the user's preference for each criterion in terms of weight using GA algorithms. The rC module produces an overall recommendation list based on the results from MP and AG modules Multi-Criteria Agge空ator Prediction Module TOp-N Recommendation List Figure 1. System architecture 4. 1. Aggregation module In the multi-criteria rating environment, different people may place different emphases on these interrelated features. The goal of AG is to find the relationship between the overall rating and the underlying multi-criteria ratings for each user. More specifically, given the ratings data of a user, AG computes his/her preference model in terms of feature weights using GAs In this study, each chromosome x in the ga process is expressed as a set of feature weights where W, =(wI, w2, w,, w4)with 4 genes. Each gene is represented as a movie feature weight and encod with 8 bits. The Ga begins with a random population of chromosomes. For an active user l,, each chromosome is assigned alternately and tested by the fitness function. The fitness function measures the prediction accuracy of products using the weights as defined in the current chromosome (5) Fitness(x)=l-
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 nearest neighbors of ua . 4. Multi-criteria collaborative filtering The CF technique has been successfully applied to solve the recommendation problem with single criterion. When the problem is extended to be multiple conflicting criteria, new techniques are needed in order to effectively incorporate the multi-criteria rating information into the recommendation process. Particularly, in this section, we describe how to incorporate the multi-criteria rating information into the CF process following the aggregation function-based approach proposed in [13]. The proposed system consists of three main modules: the Multi-Criteria Prediction Module (MP), the Aggregation Module (AG), and the Recommendation Model (RC) as shown in Figure 1. The overall system can be viewed as a blackbox which takes, as input, the ratings matrix, and produces, as output, a recommendation list. The ratings matrix R contains the rating set rij = { k ijijij ,,, rrr 10 } standing for the ratings of the user ui for item ij, where 0 ij r represents the overall rating value and k ij r represents the rating value for criterion k. The MP module uses the single-criterion CF algorithm to produce the recommendation list for each individual criterion. The AG module computes the user’s preference for each criterion in terms of weight using GA algorithms. The RC module produces an overall recommendation list based on the results from MP and AG modules. Figure 1. System architecture 4. 1. Aggregation Module In the multi-criteria rating environment, different people may place different emphases on these interrelated features. The goal of AG is to find the relationship between the overall rating and the underlying multi-criteria ratings for each user. More specifically, given the ratings data of a user, AG computes his/her preference model in terms of feature weights using GAs. In this study, each chromosome x in the GA process is expressed as a set of feature weights where ),,,( 4321 = wwwwW xxxxx with 4 genes. Each gene is represented as a movie feature weight and encoded with 8 bits. The GA begins with a random population of chromosomes. For an active user ua , each chromosome is assigned alternately and tested by the fitness function. The fitness function measures the prediction accuracy of products using the weights as defined in the current chromosome. l rp itness xF l j ∑ jaja = − −= 1 00 1 ,, )( (5) 129
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung hwang Each chromosome x is tested repeatedly, I times, by a leave-one-out cross validation. Pa. and ra. are the predicted and the actual overall ratings of user ua to item ij, respectively The overall prediction can be made by two different approaches, the filter-based and the wrapper based approach, depending on whether the method uses feedback from the subsequent MP module The filter-based method contains algorithms that use no input other than users'own ratings to calculate the overall predictions, whereas the wrapper-based approach uses feedback (i.e. the predicted ratings) from the MP module to facilitate the prediction For the filter-based approach, the predicted overall rating is calculated as a weighted sum o individual original ratings Wr where r. is the actual rating of user u, to item i, for criterion k For the wrapper-based approach, the predicted overall rating is calculated as a weighted sum of individual predicted ratings generated by the MP module n2×p w where Pa, is the predicted rating of user u, to item i for criterion k For each generation evolution, chromosomes for the next generation are selected using the roulette wheel selection scheme to implement proportionate random selection. All of the chromosomes are then paired up using the single-point crossover strategy with a probability Pe After the crossover, for each of the genes of the chromosomes, the gene is mutated with a probability Pm. The algorithm continues to evolve until a maximum number of generations is reached. The chromosome with the highest fitnes value is then selected as the optimal set of feature weights for active user u 4. 2. Multi-Criteria Prediction Module In the MP module, we decompose k-dimensional multi-criteria rating space into k single-rating recommendation problems and use a traditional user-based CF technique to estimate ratings for each criterion. Thus, we decompose the multi-criteria rating matrix R into k single-criterion rating matrices R. The CF algorithm is then implemented k times, one for each criterion. For each R",we first compute the similarity simar between active user u, and other users u, using the Pearson Correlation Method and then estimate the rating value pa for unrated item i using an adjusted 4.3. Recommendation model The goal of the RC module is to produce a recommendation list for an active user. RC modules first edict each unknown overall rating Pay directly by using the feature weight function estimated in the AG module and the multi-criteria rating value Pa estimated in the MP module
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Each chromosome x is tested repeatedly, l times, by a leave-one-out cross validation. 0 ja p , and 0 jar , are the predicted and the actual overall ratings of user ua to item ij, respectively. The overall prediction can be made by two different approaches, the filter-based and the wrapperbased approach, depending on whether the method uses feedback from the subsequent MP module. The filter-based method contains algorithms that use no input other than users’ own ratings to calculate the overall predictions, whereas the wrapper-based approach uses feedback (i.e. the predicted ratings) from the MP module to facilitate the prediction. For the filter-based approach, the predicted overall rating is calculated as a weighted sum of individual original ratings. ∑ ∑ = = × = 4 1 4 0 1 k k x k k ja k x ja w rw p , , (6) where k jar , is the actual rating of user a u to item ij for criterion k. For the wrapper-based approach, the predicted overall rating is calculated as a weighted sum of individual predicted ratings generated by the MP module. ∑ ∑ = = × = 4 1 4 0 1 k k x k k ja k x ja w pw p , , (7) where k ja p , is the predicted rating of user ua to item ij for criterion k. For each generation evolution, chromosomes for the next generation are selected using the roulette wheel selection scheme to implement proportionate random selection. All of the chromosomes are then paired up using the single-point crossover strategy with a probability pc. After the crossover, for each of the genes of the chromosomes, the gene is mutated with a probability pm. The algorithm continues to evolve until a maximum number of generations is reached. The chromosome with the highest fitness value is then selected as the optimal set of feature weights for active user ua . 4. 2. Multi-Criteria Prediction Module In the MP module, we decompose k-dimensional multi-criteria rating space into k single-rating recommendation problems and use a traditional user-based CF technique to estimate ratings for each criterion. Thus, we decompose the multi-criteria rating matrix R into k single-criterion rating matrices k R . The CF algorithm is then implemented k times, one for each criterion. For each k R , we first compute the similarity k ar sim between active user ua and other users ur using the Pearson Correlation Method and then estimate the rating value k aj p for unrated item j i using an adjusted weighted sum approach. 4. 3. Recommendation Model The goal of the RC module is to produce a recommendation list for an active user. RC modules first predict each unknown overall rating 0 aj p directly by using the feature weight function estimated in the AG module and the multi-criteria rating value k aj p estimated in the MP module. 130
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 where Wa=(wa, 14, wa, w)represents the best fit chromosome generated from a GA algorithm for user u, Finally, all the unrated items are sorted in non-increasing order with respect to the overall ratings and the first N items are selected as the Top-N recommended set 5. Experimental evaluation The proposed system is evaluated data collected from yahoo! movies web (http://movies.yahoo.com).InadditiontotheoverallratingeachYahoomovieprovidesanotherfour criteria:story, acting, direction, and visuals. All ratings have 13 possible letter grades ranging from F to A+. For our analysis, we change the letter scale to a numerical scale of 1 to 13, with 1 denoting the worst evaluation, grade F and 13 denoting the best evaluation, grade A+. We collect a total of 11109 rating data from 297 different users and 368 different movies. Each user rates at least 20 movies and each movie is rated by at least 20 users We employ the 5-fold cross-validation approach. First, we randomly divide the dataset into five groups. Then we run five rounds of tests, each time choosing one group of data as the test set and the ther four groups as the training set. The data in the training set is used as the cf database. For each data in the test set, we randomly select 10 non-zero entries and delete the rest. Our recommender system is then evaluated by comparing the Top-N recommendations it makes, given the test data, with the set of deleted items 5. 1. Evaluation Metrics We use precision and recall, two commonly used performance measures in the information retrieval community, to evaluate the quality of a recommendation. Precision is the fraction of recommended movies that the user really likes. Recall is the fraction of interesting movies that are recommended More precisely correctly recommended_movi recall= total movies_ by_users correctly recommended movie precision ded movies (10) However, there is al ways a trade-off between precision and recall. Increasing the number of recommended movies will reduce the precision and increase the recall. To balance both measures, we use another measure FI metric that gives equal weight to precision and recall and is given precision We compute each metric for each test user and the overall average values for the test data are taken measures of the quality of the recommendation 5. 2. Experimental Parameters In this section we define the parameters that will be used during the experiments as follows Recommendation number(RN): 3, 5, 10, 15, 20, 25, 30
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 ∑ ∑ = = × = 4 1 4 0 1 k k a k k ja k a ja w pw p , , (8) where ),,,( 4321 = wwwwW aaaaa represents the best fit chromosome generated from a GA algorithm for user ua . Finally, all the unrated items are sorted in non-increasing order with respect to the overall ratings and the first N items are selected as the Top-N recommended set. 5. Experimental evaluation The proposed system is evaluated using data collected from Yahoo! Movies Web site (http://movies.yahoo.com). In addition to the overall rating, each Yahoo movie provides another four criteria: story, acting, direction, and visuals. All ratings have 13 possible letter grades ranging from F to A+. For our analysis, we change the letter scale to a numerical scale of 1 to 13, with 1 denoting the worst evaluation, grade F and 13 denoting the best evaluation, grade A+. We collect a total of 11109 rating data from 297 different users and 368 different movies. Each user rates at least 20 movies and each movie is rated by at least 20 users. We employ the 5-fold cross-validation approach. First, we randomly divide the dataset into five groups. Then we run five rounds of tests, each time choosing one group of data as the test set and the other four groups as the training set. The data in the training set is used as the CF database. For each data in the test set, we randomly select 10 non-zero entries and delete the rest. Our recommender system is then evaluated by comparing the Top-N recommendations it makes, given the test data, with the set of deleted items. 5. 1. Evaluation Metrics We use precision and recall, two commonly used performance measures in the information retrieval community, to evaluate the quality of a recommendation. Precision is the fraction of recommended movies that the user really likes. Recall is the fraction of interesting movies that are recommended. More precisely, total movies usersbyliked correctly recommende moviesd recall ____ _ _ = (9) total recommende moviesd correctly recommende moviesd precision _ _ _ _ = (10) However, there is always a trade-off between precision and recall. Increasing the number of recommended movies will reduce the precision and increase the recall. To balance both measures, we use another measure F1 metric that gives equal weight to precision and recall and is given as recall precision recall precision F + ×× = 2 1 (11) We compute each metric for each test user and the overall average values for the test data are taken as measures of the quality of the recommendation. 5. 2. Experimental Parameters In this section we define the parameters that will be used during the experiments as follows: - Recommendation number (RN): 3, 5, 10, 15, 20, 25, 30 131
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung hwang Crossover probability(pe): 0.5,,, 0.8, 0.9, 1.0 Mutation probability (pm): 0.01, 0.02, 0.03, 0.04, 0.05 Population size(PS)20, 30, 50, 75, 100 Generation number (GN):10,20,30,40,50,60,70,80,90,100 5.3. Performance results To demonstrate the effectiveness of the proposed system, we compare our approaches with two other algorithms the single-criterion user-based CF (SUCF) algorithm in which the overall rating is used as the sole criterion and the equally weighted multi-criteria CF(EWMCF) that applies an equal weight to each criterion rating. We denote our two approaches by their abbreviations: GA-based filter method for multi-criteria CF(GAFMCF)and GA-based wrapper method for multi-criteria CF (GAWMCF In the subsequent part of this section, we will first explore the optimal setting of GA parameters and n assess the performance effectiveness of the different approaches 5.3. 1 GA Parameters setting Crossover and mutation operators play the major roles in GA, so defining proper crossover and operators is necessary in order to achieve a better performance of the GA. However, the optimal values of crossover and mutation probabilities are problem specific and cannot be obtained independently [14]. All combinations of crossover and mutation, within given starting ranges, must be investigated in order to allow for the interaction effect. Accordingly, we examine the impacts of various combinations of Pe and Pm on the recommendations quality of the GAWMCF approach. We fix NB to 30, Ps to 50, and produce 5 recommendations for each user. Due to space limitations, Figure 2 shows the performance results for optimal values of Pc=0.9 and Pm=0. 05 by varying the number of generations For more detailed results of various combinations of Pc and pm, please refer to Appendix I 0.7 ▲ Precision 1020304050607080901 Figure 2. Performance results of GAWMCF approach vs. generation number for optimal values of pe 0.9 and Pm=0.05 In Figure 2, the number of generations necessary for convergence is also evaluated. We can see that performance is gradually improved generation by generation. However, the improvement becomes insignificant when the number of generations is greater than 80. Since the ga process is terminated the number of generations for convergence is reached, the greater the number of generations, the r the computational cost. For better computational efficiency, we fix the number of generations to 80 for subsequent experiments The population size is a critical parameter for the performance of GA. There is no clear indication as to how high the population size should be [15]. In general, increasing the population size not only reases the accuracy of the ga but also increases the number of generations to converge. Figure 3
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang - Crossover probability (pc): 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 - Mutation probability (pm): 0.01, 0.02, 0.03, 0.04, 0.05 - Population size (PS) 20, 30, 50, 75, 100 - Generation number (GN): 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 5. 3. Performance Results To demonstrate the effectiveness of the proposed system, we compare our approaches with two other algorithms: the single-criterion user-based CF (SUCF) algorithm in which the overall rating is used as the sole criterion and the equally weighted multi-criteria CF (EWMCF) that applies an equal weight to each criterion rating. We denote our two approaches by their abbreviations: GA-based filter method for multi-criteria CF (GAFMCF) and GA-based wrapper method for multi-criteria CF (GAWMCF). In the subsequent part of this section, we will first explore the optimal setting of GA parameters and then assess the performance effectiveness of the different approaches. 5. 3. 1. GA Parameters setting Crossover and mutation operators play the major roles in GA, so defining proper crossover and operators is necessary in order to achieve a better performance of the GA. However, the optimal values of crossover and mutation probabilities are problem specific and cannot be obtained independently [14]. All combinations of crossover and mutation, within given starting ranges, must be investigated in order to allow for the interaction effect. Accordingly, we examine the impacts of various combinations of pc and pm on the recommendations quality of the GAWMCF approach. We fix NB to 30, PS to 50, and produce 5 recommendations for each user. Due to space limitations, Figure 2 shows the performance results for optimal values of pc =0.9 and pm =0.05 by varying the number of generations. For more detailed results of various combinations of pc and pm , please refer to Appendix I. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 10 20 30 40 50 60 70 80 90 100 Performance Metric GN F1 Recall Precision Figure 2. Performance results of GAWMCF approach vs. generation number for optimal values of pc = 0.9 and pm = 0.05 In Figure 2, the number of generations necessary for convergence is also evaluated. We can see that performance is gradually improved generation by generation. However, the improvement becomes insignificant when the number of generations is greater than 80. Since the GA process is terminated when the number of generations for convergence is reached, the greater the number of generations, the higher the computational cost. For better computational efficiency, we fix the number of generations to 80 for subsequent experiments. The population size is a critical parameter for the performance of GA. There is no clear indication as to how high the population size should be [15]. In general, increasing the population size not only increases the accuracy of the GA but also increases the number of generations to converge. Figure 3 132
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 shows the influence of population size on the performance of the recommender system. As expected, the performance improves as the population size increases, but it reaches a saturation point at PS=50 and any further increment does not improve results. Accordingly, we will fix the population size to 50 while evaluating the performance of various approaches Recall +Precision 10 Figure 3. Performance results of GAWMCF approach vs population size 5.3. 1. Performance comparison Recalling that the CF algorithm uses a nearest-neighbor model for generating predictions, a crucial step in CF is the selection of a neighborhood. The number of nearest neighbors used for neighborhood formation can substantially affect the system's accuracy. However, the optimal size of a neighborhood usually depends on the nature of the problem. a pilot experiment of this study has been conducted and suggested 30 as the optimal value. To evaluate the sensitivity of different recommendation numbers, we fix the number of eighbors to 30 and perform an experiment with different recommendation numbers of 3, 5, 10 20, 25, and 30. The comparisons of precision, recall, and Fl among various approaches are shown from Fig 4 to Fig. 6 As expected, when the number of recommendations increases, the precision drops smoothly but the recall improves gradually. In all cases, the single-criterion approach SUCF performs worse than the other three multi-criteria approaches. Among the three multi-criteria approaches, GAWMCF performs the best, followed by GAFMCF, and then EWMCF. The results demonstrate the effectiveness of the GA-based feature weighting and confirm the previous studies in that the wrapper-based approach outperforms the filter-based approach. Finally according to Figure 6, we can see that the Fl-measure reaches an optimal value for RN aroune 0.7 A-EWMCF SUCE 351015202530 Figure 4. Precision-metric for various approaches vs recommendation number
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 shows the influence of population size on the performance of the recommender system. As expected, the performance improves as the population size increases, but it reaches a saturation point at PS = 50 and any further increment does not improve results. Accordingly, we will fix the population size to 50 while evaluating the performance of various approaches. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 20 30 50 75 100 Performance Metric PS F1 Recall Precision Figure 3. Performance results of GAWMCF approach vs. population size 5. 3. 1. Performance comparison Recalling that the CF algorithm uses a nearest-neighbor model for generating predictions, a crucial step in CF is the selection of a neighborhood. The number of nearest neighbors used for neighborhood formation can substantially affect the system’s accuracy. However, the optimal size of a neighborhood usually depends on the nature of the problem. A pilot experiment of this study has been conducted and suggested 30 as the optimal value. To evaluate the sensitivity of different recommendation numbers, we fix the number of neighbors to 30 and perform an experiment with different recommendation numbers of 3, 5, 10, 15, 20, 25, and 30. The comparisons of precision, recall, and F1 among various approaches are shown from Fig. 4 to Fig. 6. As expected, when the number of recommendations increases, the precision drops smoothly but the recall improves gradually. In all cases, the single-criterion approach SUCF performs worse than the other three multi-criteria approaches. Among the three multi-criteria approaches, GAWMCF performs the best, followed by GAFMCF, and then EWMCF. The results demonstrate the effectiveness of the GA-based feature weighting and confirm the previous studies in that the wrapper-based approach outperforms the filter-based approach. Finally, according to Figure 6, we can see that the F1-measure reaches an optimal value for RN around 20. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 3 5 10 15 20 25 30 Precision RN GAWMCF GAFMCF EWMCF SUCF Figure 4. Precision-metric for various approaches vs. recommendation number 133
A Chein-Shung hwang 0.4 0.3 0.2 HHGAFMCF AEWMCF 0.0 Figure 5. Recall-metric for various approaches vs recommendation number 0.4 0.3 ← GAWMCF 0.2 |— GAFMCF A-EWMCF 0.1 米SUCF 0.0 510152 0 Figure 6. Fl-metric for various approaches vs recommendation number 6. Conclusions fea In this paper, we have proposed a multi-criteria CF recommender system based on GAs for feature weighting. We treat the multi-criteria recommendations as optimization problems and apply a weighted average method by combining values from different criteria. The proposed approach first uses the traditional user-based CF algorithm to compute the prediction for each single criterion and then aggregates the overall prediction based on the weighting values derived by gas The GAs are implemented by two different approaches: filter and wrapper. Experimental results show that the wrapper-based approach is superior to all the other methods in all cases. The results also confirm that the provision of additional information about user preference in terms of different aspects for an item can potentially increase the recommendation quality. A further study for incorporating more information, such as contextual and content information, is under investigation 7. References 1 G. Adomavicius and A. Tuzhilin, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions", IEEE Transactions on Knowledge and Data Engineering, vol. 17 734-749.200 R. Burke, " Hybrid recommender systems: Survey and experiments, User Modeling and User Adapted Interaction, vol. 12, no. 4, pp 331-370, 2002 [3]M. Balabanovic and Y. Shoham,"Fab: Content based, collarative ree Communications of the ACM, vol. 40, no 3, pp 66-72, 1997
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang 0.0 0.1 0.2 0.3 0.4 3 5 10 15 20 25 30 Recall RN GAWMCF GAFMCF EWMCF SUCF Figure 5. Recall-metric for various approaches vs. recommendation number 0.0 0.1 0.2 0.3 0.4 3 5 10 15 20 25 30 F1 RN GAWMCF GAFMCF EWMCF SUCF Figure 6. F1-metric for various approaches vs. recommendation number 6. Conclusions In this paper, we have proposed a multi-criteria CF recommender system based on GAs for feature weighting. We treat the multi-criteria recommendations as optimization problems and apply a weighted average method by combining values from different criteria. The proposed approach first uses the traditional user-based CF algorithm to compute the prediction for each single criterion and then aggregates the overall prediction based on the weighting values derived by GAs. The GAs are implemented by two different approaches: filter and wrapper. Experimental results show that the wrapper-based approach is superior to all the other methods in all cases. The results also confirm that the provision of additional information about user preference in terms of different aspects for an item can potentially increase the recommendation quality. A further study for incorporating more information, such as contextual and content information, is under investigation. 7. References [1] G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions”, IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp.734–749, 2005. [2] R. Burke, “Hybrid recommender systems: Survey and experiments”, User Modeling and UserAdapted Interaction, vol. 12, no. 4, pp. 331–370, 2002. [3] M. Balabanovic and Y. Shoham, “Fab: Content based, collarative recommendation”, Communications of the ACM, vol. 40, no. 3, pp. 66-72, 1997. 134
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 [4] M. Pazzani, "A framework for collaborative, content-based and demographic filtering, Artificial Intelligence Review, pp. 393-408, 1999 5]K. Chapphannarungsri and S. Maneeroj, Combining multiple criteria and multidimension for movie recommender system", In Proceeding of the International Multiconference of Engineers and Computer Scientist, (IMECS 2009), pp. 698-703, 2009 [6]F. L. Roux, E. Ranjeet, V. Ghai, Y. Gao and J. lu, A Course Recommender System Using Multiple Criteria Decision Making Method, In ISKE-2007 Proceedings, Series 'Advances in Intelligent Systems Research, 2007 [7 W.-G. Teng and H -H Lee, Collaborative Recommendation with Multi-Criteria Ratings", Journal of Computers(Special Issue on Data Mining), vol. 17, no. 4, pp. 69-78, January 2007 [8]K. Lakiotaki, S. Tsafarakis and N. Matsatsinis, "UTA-Rec: A recommender system based on ultiple criteria analysis", In Proceedings of the ACM conference on Recommender sys 219-226.2008 [9]Y. Siskos and D. Yanacopoulos, "UTA STAR-an ordinal regression method for building additive value functions", Investigacao Operational, vol 5, pp. 539-53, 1985 [10]M. Srinivas and L. M. Patnaik,"Genetic algorithms: A survey", IEEE Computer, vol 27, no. 6, pp.17-26,June1994 [11L. Davis, editor. Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York, 1991 [12]P. Resnick, N. lacovou, M. Suchak, P. Bergstrom, and J. Riedl,"GroupLens: An Open Architecture for Collaborative Filtering of Netnews", In Proceedings of ACM Conference on Computer Supported Cooperative World, pp 175-186, 1994 [13]. B. Statnikov and J. B Matusov, Multicrieria Optimization and Engineering, Springer Verlag, March 1995 [14]A Czarn, C MacNish, K Vijayan, B Turlach, and R Gupta, "Statistical exploratory analysis of genetic algorithms", IEEE Transactions on Evolutionary Computation, vol. 8, no. 4, pp. 405-421 August 2004 [15]D E. Goldberg, ""Sizing Populations for Serial and Parallel Genetic Algorithms", In Proceedings of the Third International Conference on Genetic Algorithms. San Mateo, CA: Morgan Kaufman, pp 70-79,1989 Appendix I Table 1. Precision-metric for different values of P. P and GN Precision 102030405060 100 0.5046810.4730481049001005220.5280.535.05380.540 06|046404740483049305081053260534054505530563 0.01070489003051005190330585057505205890591 080.4920.50405130.52805430:5740.5850.5930.5950.596 09049705030.5130523055305620582059006040609 1.0047804870.50305100.512053205530.5630.5640.567 0504700.4720.4780489049705140.5270.53705430.545 0604690473048304920.50605250533055705630.564 0.030704910504051305220538051057705850590593 0.04930506051605056505750584060506080601 1.00480048305050512052505440565057005720573 0.504730475047904910497051905320.5500.590.562 0.604700474048404870.5060.5180.5320.5610.56810.571 0.05 0,704950506051305380.549055110580059105980.59 0.81049640.5080519054205680.5750581060006030604 0104990502051405671057205940601060906100613 1.004830485049705060532056105750.58005840588
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 [4] M. Pazzani, “A framework for collaborative, content-based and demographic filtering”, Artificial Intelligence Review, pp. 393-408, 1999. [5] K. Chapphannarungsri and S. Maneeroj, ”Combining multiple criteria and multidimension for movie recommender system”, In Proceeding of the International Multiconference of Engineers and Computer Scientist, (IMECS 2009), pp. 698-703, 2009. [6] F. L. Roux, E. Ranjeet, V. Ghai, Y. Gao and J. Lu, “A Course Recommender System Using Multiple Criteria Decision Making Method”, In ISKE-2007 Proceedings, Series ‘Advances in Intelligent Systems Research’, 2007. [7] W.-G. Teng and H.-H. Lee, “Collaborative Recommendation with Multi-Criteria Ratings”, Journal of Computers (Special Issue on Data Mining), vol. 17, no. 4, pp. 69-78, January 2007. [8] K. Lakiotaki, S. Tsafarakis and N. Matsatsinis, “UTA-Rec: A recommender system based on multiple criteria analysis”, In Proceedings of the ACM conference on Recommender systems, pp. 219-226, 2008. [9] Y. Siskos and D. Yanacopoulos, “UTA STAR - an ordinal regression method for building additive value functions”, Investigacao Operational, vol. 5, pp. 539-53, 1985. [10]M. Srinivas and L. M. Patnaik, “Genetic algorithms: A survey”, IEEE Computer, vol. 27, no. 6, pp. 17–26, June 1994. [11]L. Davis, editor. Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York, 1991. [12]P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “GroupLens: An Open Architecture for Collaborative Filtering of Netnews”, In Proceedings of ACM Conference on Computer Supported Cooperative World, pp. 175-186, 1994. [13]R.B. Statnikov and J. B. Matusov, Multicrieria Optimization and Engineering, Springer Verlag, March 1995. [14]A Czarn, C MacNish, K Vijayan, B Turlach, and R Gupta, “Statistical exploratory analysis of genetic algorithms”, IEEE Transactions on Evolutionary Computation, vol. 8, no. 4, pp. 405-421, August 2004. [15]D.E. Goldberg, “Sizing Populations for Serial and Parallel Genetic Algorithms”, In Proceedings of the Third International Conference on Genetic Algorithms. San Mateo, CA: Morgan Kaufman. pp. 70-79, 1989. Appendix I Table 1. Precision-metric for different values of Pm, Pc and GN Precision GN pm pc 10 20 30 40 50 60 70 80 90 100 0.01 0.5 0.468 0.473 0.481 0.490 0.510 0.522 0.528 0.535. 0.538 0.540 0.6 0.464 0.474 0.483 0.493 0.508 0.526 0.534 0.545 0.553 0.563 0.7 0.489 0.503 0.510 0.519 0.533 0.558 0.575 0.582 0.589 0.591 0.8 0.492 0.504 0.513 0.528 0.543 0.574 0.585 0.593 0.595 0.596 0.9 0.497 0.503 0.513 0.523 0.553 0.562 0.582 0.590 0.604 0.609 1.0 0.478 0.487 0.503 0.510 0.512 0.532 0.552 0.563 0.564 0.567 0.03 0.5 0.470 0.472 0.478 0.489 0.497 0.514 0.527 0.537 0.543 0.545 0.6 0.469 0.473 0.483 0.492 0.506 0.525 0.533 0.557 0.563 0.564 0.7 0.491 0.504 0.513 0.522 0.538 0.553 0.577 0.585 0.591 0.593 0.8 0.493 0.506 0.516 0.533 0.565 0.575 0.584 0.605 0.608 0.601 0.9 0.497 0.501 0.514 0.548 0.566 0.587 0.597 0.607 0.609 0.610 1.0 0.480 0.483 0.505 0.512 0.525 0.544 0.565 0.570 0.572 0.573 0.05 0.5 0.473 0.475 0.479 0.491 0.497 0.519 0.532 0.550 0.559 0.562 0.6 0.470 0.474 0.484 0.487 0.506 0.518 0.532 0.561 0.568 0.571 0.7 0.495 0.506 0.513 0.538 0.549 0.551 0.580 0.591 0.598 0.599 0.8 0.496 0.508 0.519 0.542 0.568 0.575 0.581 0.600 0.603 0.604 0.9 0.499 0.502 0.514 0.567 0.572 0.594 0.601 0.609 0.610 0.613 1.0 0.483 0.485 0.497 0.506 0.532 0.561 0.575 0.580 0.584 0.588 135