Knowledge-Based Systems 24 (2011) 1310–1316
Contents lists available at ScienceDirect
Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys

Improving collaborative filtering recommender system results and performance using genetic algorithms

Jesus Bobadilla ⇑,1, Fernando Ortega, Antonio Hernando, Javier Alcalá
Universidad Politécnica de Madrid, Computer Science, Crta. De Valencia, Km 7, 28031 Madrid, Spain

Article info
Article history: Received 18 October 2010; Received in revised form 30 March 2011; Accepted 7 June 2011; Available online 15 June 2011
Keywords: Collaborative filtering; Recommender systems; Similarity measures; Metrics; Genetic algorithms; Performance

Abstract
This paper presents a metric to measure similarity between users, which is applicable in collaborative filtering processes carried out in recommender systems.
The proposed metric is formulated via a simple linear combination of values and weights. Values are calculated for each pair of users between which the similarity is obtained, whilst weights are calculated only once, in a prior stage in which a genetic algorithm extracts weightings that depend on the specific nature of the data in each recommender system. The results obtained show significant improvements in prediction quality, recommendation quality and performance.
© 2011 Elsevier B.V. All rights reserved.

1. Introduction

The basic principle of recommender systems (RS) is the expectation that the group of users similar to a given user (i.e. those who have rated a significant number of elements in a similar way to that user) can be used to adequately predict that individual’s ratings on products the user has no knowledge of. For example, a trip to Senegal could be recommended to an individual who has rated different destinations in the Caribbean very highly, based on the positive ratings of "Senegal" as a holiday destination given by a significant number of individuals who also rated destinations in the Caribbean very highly. This suggestion (recommendation) will often provide the user of the service with inspiring information drawn from the collective knowledge of all other users of the service.

In recent years, RS have played an important role in reducing the negative impact of information overload on websites where users can vote for their preferences on a series of articles or services. Movie recommendation websites are probably the best-known cases to users and are without doubt the best studied by researchers [19,4,22], although there are many other fields in which RS have great and increasing importance, such as e-commerce [15], e-learning [9,5] and digital libraries [26,27].
Currently, the rapid growth of Web 2.0 [18,23] has led to the proliferation of collaborative websites in which the number of elements that can be recommended (e.g. blogs) can increase significantly when they are introduced (and not only voted on) by the users. This generates new challenges for researchers in the field of RS, while increasing the possibilities and importance of these information retrieval techniques.

The core of an RS is its filtering algorithms. Demographic filtering [20] and content-based filtering [21] are the most basic techniques: the first is based on the assumption that individuals with certain common personal attributes (sex, age, country, etc.) will also have common preferences, whilst content-based filtering recommends items similar to the ones the user preferred in the past. Currently, collaborative filtering (CF) is the most commonly used and studied technique [12,24]. It is based on the principle set out in the first paragraph of this section: to make a recommendation to a given user, it first searches for the users of the system who have voted in the most similar way to this user, and then makes the recommendations by taking the items (holiday destinations in our running example) most highly valued by the majority of those similar users.

The most significant part of CF algorithms is the group of metrics used to determine the similarity between each pair of users [14,1,7], among which the Pearson correlation metric stands out as a reference.

Genetic algorithms (GA) have mainly been used in two aspects of RS: clustering [16,17,28] and hybrid user models [10,13,2]. A common technique to improve the features of RS consists of

0950-7051/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.knosys.2011.06.005
⇑ Corresponding author. Tel.: +34 913365133; fax: +34 913367522.
E-mail addresses: jesus.bobadilla@upm.es (J. Bobadilla), fortegarequena@gmail.com (F. Ortega), ahernado@eui.upm.es (A. Hernando), jalcala@eui.upm.es (J. Alcalá).
1 Universidad Politécnica de Madrid & FilmAffinity.com research team.
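
The collaborative filtering procedure outlined in the Introduction (find the users who have voted most similarly to a given user, then take the items those users value most) can be illustrated with a minimal sketch that uses the Pearson correlation as the reference similarity metric. This is not the metric proposed in the paper. The ratings dictionary, the user names, the neighbourhood size k, the rule of keeping only positively correlated neighbours and the ranking by mean neighbour rating are all assumptions made for illustration.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation computed over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap is treated as no similarity
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant ratings give an undefined correlation
    return cov / sqrt(var_x * var_y)


def recommend(ratings, user, k=2, n=1):
    """Return up to n items unseen by `user`, ranked by mean rating among the k nearest users."""
    sims = sorted(
        ((pearson(ratings[user], ratings[v]), v) for v in ratings if v != user),
        reverse=True,
    )
    # Hypothetical choice: keep only positively correlated neighbours.
    neighbours = [v for sim, v in sims[:k] if sim > 0]
    scores = {}
    for v in neighbours:
        for item, rating in ratings[v].items():
            if item not in ratings[user]:
                scores.setdefault(item, []).append(rating)
    ranked = sorted(
        scores, key=lambda item: sum(scores[item]) / len(scores[item]), reverse=True
    )
    return ranked[:n]


# Toy data echoing the running example: holiday destinations rated from 1 to 5.
ratings = {
    "ana":   {"Cuba": 5, "Jamaica": 4, "Norway": 1},
    "ben":   {"Cuba": 5, "Jamaica": 4, "Norway": 2, "Senegal": 5},
    "carla": {"Cuba": 4, "Jamaica": 5, "Norway": 1, "Senegal": 4},
    "dan":   {"Cuba": 1, "Jamaica": 2, "Norway": 5, "Senegal": 1},
}
print(recommend(ratings, "ana"))  # prints ['Senegal']
```

In this toy data the two users most similar to "ana" both rated Senegal highly, so Senegal is recommended, which mirrors the Caribbean example in the text.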