A Recommender system based on Invasive Weed Optimization Algorithm Hoda Sepehri Rad and Caro Lucas Abstract-Recommender systems intend to help users find Amazon, CDNow, and MovieFinder. These commercial their interested items from among a large number of items, We systems are becoming part of the standard e-business ue our previous work that emphasizes on"prioritized lat can enhance e-commerce sales by user-profile"approach as an effective approach to quality of the recommendations. Prioritized user-profile is nverting browsers to buyers, increasing cross-selling, and approach th building customer loyalty ecommendation by assigning different priority importance Many of the most successful systems are based each of the features of the user-profile for different users. In collaborative filtering and numerous commercial systems, order to find the optimal priorities for each user an like Amazon. com make use of these techniques to offer optimization algorithm is needed. In this paper, we employ a personalized recommendation lists to their customers.The Invasive Weed asic idea of collaborative filtering is to capture a users and simple algorithm inspired from the invasive habits of preferences to build a profile and then search for similar ed the best accuracy in predicting users'interests taste are found and their opinions are used to generate new Genetic Algorithm (GA)and Particle Swarm Despite successes of these systems, low quality of (PSO)and to standard user-based Pearson recommendations usually is one of the major problems of a movie dataset. these systems. Users need recommendations they can trust to INTRODUCTION help them find items they will like and they leave systems that are not continuously accurate for them Utilizing others opinions and suggestions is part of This paper continues our previous work [1] concerning everyday life. We have all experienced how to following the the effectiveness of "prioritized user-profile"approach. advices of our friends or experts in different domains in Prioritized user-profile is an approach based on [21, [3] that arious decision making tasks like what to read, which tries to increase the quality of recommendation. In this doctor to go and etc. may help us. In addition with explosive approach the system is tailored to the preferences of growth of digital information and the well-known individual users by assigning different importance or priority process and find useful information individually has been to different features of the user-profile(e.g. sex, age, rating, information overload" problem in Internet, the ability to etc. ) By considering these priority levels with each feature more limited. Therefore today the development of automatic phase, more accurate prediction of tools to help and lead users to their most relevant users' interests is achieved formation, their interests and needs is more needed In order to find the optimal priority levels for different Recommender systems intend to provide people with users in a manner to best model their characteristics. an recommendations of products or information they are optimization algorithm is needed. Particle Swarm interested in, based on their past preferences, history of Optimization(PSO) and Genetic Algorithm are two different purchase,and demographic information. These systems optimization algorithms have been applied for this recommend various types of web resources, online news, in([2]and [3] respectively. In this paper, another bio-i optimization algorithm named Invasive Weed Optimi Large scale commercial applications of the recommender (wO) is investigated to find priority levels systems can be found at many e-commerce sites, such as TwO is a relatively recent and new algorithm inspired from the growth of weeds in nature that is first introduced in Manuscript received March 31, 2007 14. To the best of our knowledge this algorithm has no is the M.s student at the Robotics and machine een used in any application yet. This paper focuses hool of Electrical and Computer Engineering, College using this optimization algorithm in recommender system versity of Tehran, P OBox: 11365-4563, Tehran, Iran domain and in particular in prioritized user profile problem Processing Center of The organization of this paper is as follows: section I Excellence,School of Electrical and Computer Engineering, College of describes the Invasive Weed Optimization (wO)algorithm ail: lucas a ut ac ir) and School of Cognitive Sciences, IPM, P.O. Box 10395-5746, Tehran, Iran(e-mail: lucas a ipm ir). algorithms and articular the user-based Pearson algorithm which is one of the standard benchmark methods 1-4244-1340-007S25.00@2007IEEE
$EVWUDFW²5HFRPPHQGHU V\VWHPV LQWHQG WR KHOS XVHUV ILQG WKHLULQWHUHVWHGLWHPVIURPDPRQJDODUJHQXPEHURILWHPV:H FRQWLQXH RXU SUHYLRXV ZRUN WKDW HPSKDVL]HV RQ ³SULRULWL]HG XVHUSURILOH´DSSURDFKDVDQHIIHFWLYHDSSURDFKWRLQFUHDVHWKH TXDOLW\ RI WKH UHFRPPHQGDWLRQV 3ULRULWL]HG XVHUSURILOH LV DQ DSSURDFK WKDW WULHV WR LPSOHPHQW PRUH SHUVRQDOL]HG UHFRPPHQGDWLRQ E\ DVVLJQLQJ GLIIHUHQW SULRULW\LPSRUWDQFH WR HDFK RI WKH IHDWXUHV RI WKH XVHUSURILOH IRU GLIIHUHQW XVHUV ,Q RUGHU WR ILQG WKH RSWLPDO SULRULWLHV IRU HDFK XVHU DQ RSWLPL]DWLRQ DOJRULWKP LV QHHGHG ,Q WKLV SDSHU ZH HPSOR\ D QHZ RSWLPL]DWLRQ DOJRULWKP QDPHO\ ,QYDVLYH :HHG 2SWLPL]DWLRQ ,:2 IRU WKLVSXUSRVH,:2LVDUHODWLYHO\QHZ DQG VLPSOH DOJRULWKP LQVSLUHG IURP WKH LQYDVLYH KDELWV RI JURZWK RI ZHHGV LQ QDWXUH ([SHULPHQWDO UHVXOWV VKRZHG WKDW ,:2 DFKLHYHG WKH EHVW DFFXUDF\LQ SUHGLFWLQJ XVHUV¶LQWHUHVWV FRPSDUHG WR WZR RWKHU SULRULWL]HG DSSURDFKHV ZKLFK ZHUH EDVHG RQ *HQHWLF $OJRULWKP *$ DQG 3DUWLFOH 6ZDUP 2SWLPL]DWLRQ 362 DQG WR VWDQGDUG XVHUEDVHG 3HDUVRQ DOJRULWKPRQDPRYLHGDWDVHW , ,1752'8&7,21 8WLOL]LQJ RWKHUV¶ RSLQLRQV DQG VXJJHVWLRQV LV SDUW RI HYHU\GD\OLIH:HKDYHDOOH[SHULHQFHGKRZWRIROORZLQJWKH DGYLFHV RI RXU IULHQGV RU H[SHUWV LQ GLIIHUHQW GRPDLQV LQ YDULRXV GHFLVLRQ PDNLQJ WDVNV OLNH ZKDW WR UHDG ZKLFK GRFWRUWRJRDQGHWFPD\KHOSXV,QDGGLWLRQZLWKH[SORVLYH JURZWK RI GLJLWDO LQIRUPDWLRQ DQG WKH ZHOONQRZQ ³LQIRUPDWLRQ RYHUORDG´ SUREOHP LQ ,QWHUQHW WKH DELOLW\ WR SURFHVV DQG ILQG XVHIXO LQIRUPDWLRQ LQGLYLGXDOO\ KDV EHHQ PRUHOLPLWHG7KHUHIRUHWRGD\WKHGHYHORSPHQWRIDXWRPDWLF WRROV WR KHOS DQG OHDG XVHUV WR WKHLU PRVW UHOHYDQW LQIRUPDWLRQWKHLULQWHUHVWVDQGQHHGVLVPRUHQHHGHG 5HFRPPHQGHU V\VWHPV LQWHQG WR SURYLGH SHRSOH ZLWK UHFRPPHQGDWLRQV RI SURGXFWV RU LQIRUPDWLRQ WKH\ DUH LQWHUHVWHG LQ EDVHG RQ WKHLU SDVW SUHIHUHQFHV KLVWRU\ RI SXUFKDVH DQG GHPRJUDSKLF LQIRUPDWLRQ 7KHVH V\VWHPV UHFRPPHQG YDULRXV W\SHV RI ZHE UHVRXUFHV RQOLQH QHZV PRYLHV ERRNV DQG HWF WR SRWHQWLDOO\ LQWHUHVWHG SDUWLHV /DUJH VFDOH FRPPHUFLDO DSSOLFDWLRQV RI WKH UHFRPPHQGHU V\VWHPV FDQ EH IRXQG DW PDQ\ HFRPPHUFH VLWHV VXFK DV 0DQXVFULSWUHFHLYHG0DUFK + 6HSHKUL 5DG LV WKH 06 VWXGHQW DW WKH 5RERWLFV DQG 0DFKLQH ,QWHOOLJHQFHJURXS6FKRRORI(OHFWULFDODQG&RPSXWHU(QJLQHHULQJ&ROOHJH RI(QJLQHHULQJ8QLYHUVLW\RI7HKUDQ32%R[±7HKUDQ,UDQ HPDLOKVHSHKUL#HFHXWDFLU & /XFDV LV ZLWK WKH &RQWURO DQG ,QWHOOLJHQW 3URFHVVLQJ &HQWHU RI ([FHOOHQFH 6FKRRO RI (OHFWULFDO DQG &RPSXWHU (QJLQHHULQJ &ROOHJH RI (QJLQHHULQJ8QLYHUVLW\RI7HKUDQ32%R[7HKUDQ,UDQH PDLO OXFDV#XWDFLU DQG 6FKRRO RI &RJQLWLYH 6FLHQFHV ,30 32 %R[ ±7HKUDQ,UDQHPDLOOXFDV#LSPLU $PD]RQ &'1RZ DQG 0RYLH)LQGHU 7KHVH FRPPHUFLDO V\VWHPV DUH EHFRPLQJ SDUW RI WKH VWDQGDUG HEXVLQHVV WHFKQRORJ\ WKDW FDQ HQKDQFH HFRPPHUFH VDOHV E\ FRQYHUWLQJEURZVHUVWREX\HUVLQFUHDVLQJFURVVVHOOLQJDQG EXLOGLQJFXVWRPHUOR\DOW\ 0DQ\ RI WKH PRVW VXFFHVVIXO V\VWHPV DUH EDVHG RQ FROODERUDWLYH ILOWHULQJ DQG QXPHURXV FRPPHUFLDO V\VWHPV OLNH $PD]RQFRP PDNH XVH RI WKHVH WHFKQLTXHV WR RIIHU SHUVRQDOL]HG UHFRPPHQGDWLRQ OLVWV WR WKHLU FXVWRPHUV 7KH EDVLF LGHD RI FROODERUDWLYH ILOWHULQJ LV WR FDSWXUH D XVHU¶V SUHIHUHQFHV WR EXLOG D SURILOH DQG WKHQ VHDUFK IRU VLPLODU SURILOHV,QWKLVPDQQHUXVHUVWKDWVKDUHWKHVDPHRUVLPLODU WDVWHDUH IRXQGDQGWKHLURSLQLRQVDUHXVHGWRJHQHUDWHQHZ VXJJHVWLRQV 'HVSLWH VXFFHVVHV RI WKHVH V\VWHPV ORZ TXDOLW\ RI UHFRPPHQGDWLRQV XVXDOO\ LV RQH RI WKH PDMRU SUREOHPV RI WKHVHV\VWHPV8VHUVQHHGUHFRPPHQGDWLRQVWKH\FDQWUXVWWR KHOSWKHP ILQGLWHPVWKH\ ZLOOOLNH DQGWKH\OHDYH V\VWHPV WKDWDUHQRWFRQWLQXRXVO\DFFXUDWHIRUWKHP 7KLV SDSHU FRQWLQXHV RXU SUHYLRXV ZRUN >@ FRQFHUQLQJ WKH HIIHFWLYHQHVV RI SULRULWL]HG XVHUSURILOH DSSURDFK 3ULRULWL]HGXVHUSURILOHLVDQDSSURDFKEDVHGRQ>@>@WKDW WULHV WR LQFUHDVH WKH TXDOLW\ RI UHFRPPHQGDWLRQ ,Q WKLV DSSURDFK WKH V\VWHP LV WDLORUHG WR WKH SUHIHUHQFHV RI LQGLYLGXDOXVHUVE\DVVLJQLQJGLIIHUHQWLPSRUWDQFHRUSULRULW\ WRGLIIHUHQWIHDWXUHVRIWKHXVHUSURILOHHJVH[DJHUDWLQJ HWF%\FRQVLGHULQJWKHVH SULRULW\OHYHOVZLWKHDFK IHDWXUH LQ WKH SURILOH PDWFKLQJ SKDVH PRUH DFFXUDWH SUHGLFWLRQ RI XVHUV¶LQWHUHVWVLVDFKLHYHG ,Q RUGHU WR ILQG WKH RSWLPDO SULRULW\ OHYHOV IRU GLIIHUHQW XVHUV LQ D PDQQHU WR EHVW PRGHO WKHLU FKDUDFWHULVWLFV DQ RSWLPL]DWLRQ DOJRULWKP LV QHHGHG 3DUWLFOH 6ZDUP 2SWLPL]DWLRQ362DQG*HQHWLF$OJRULWKPDUHWZRGLIIHUHQW RSWLPL]DWLRQ DOJRULWKPV KDYH EHHQ DSSOLHG IRUWKLV SXUSRVH LQ>@DQG>@UHVSHFWLYHO\,QWKLVSDSHUDQRWKHUELRLQVSLUHG RSWLPL]DWLRQ DOJRULWKP QDPHG,QYDVLYH:HHG 2SWLPL]DWLRQ ,:2LVLQYHVWLJDWHGWRILQGSULRULW\OHYHOV ,:2 LV D UHODWLYHO\ UHFHQW DQG QHZ DOJRULWKP LQVSLUHG IURPWKHJURZWKRIZHHGVLQQDWXUHWKDWLVILUVWLQWURGXFHGLQ >@ 7R WKH EHVW RI RXU NQRZOHGJH WKLV DOJRULWKP KDV QRW EHHQ XVHG LQ DQ\ DSSOLFDWLRQ \HW 7KLV SDSHU IRFXVHV RQ XVLQJ WKLV RSWLPL]DWLRQ DOJRULWKP LQ UHFRPPHQGHU V\VWHP GRPDLQDQGLQSDUWLFXODULQSULRULWL]HGXVHUSURILOHSUREOHP 7KH RUJDQL]DWLRQ RI WKLV SDSHU LV DV IROORZV VHFWLRQ ,, GHVFULEHVWKH,QYDVLYH:HHG2SWLPL]DWLRQ,:2DOJRULWKP EULHIO\ 6HFWLRQ ,,, GLVFXVVHV FROODERUDWLYH ILOWHULQJ DOJRULWKPV DQG LQ SDUWLFXODU WKH XVHUEDVHG 3HDUVRQ DOJRULWKPZKLFKLVRQHRIWKH VWDQGDUGEHQFKPDUNPHWKRGV $5HFRPPHQGHU6\VWHPEDVHGRQ,QYDVLYH:HHG2SWLPL]DWLRQ $OJRULWKP +RGD6HSHKUL5DGDQG&DUR/XFDV 4297 1-4244-1340-0/07$25.00 c 2007 IEEE
in recommender systems. Section IV describes how the Iwo algorithm is employed in our proposed recommender o, (tenax -iter, (omr -0 inal/toyonal (1) system. Section v then briefly reports on some common (termas) evaluation metrics. Section VI provides experimental results Where o and g are the initial and final value of and analysis. Finally section Vll concludes the paper. SD for the normal distribution respectively. itermax is the IL. INVASIVE WEED OPTIMIZATION maximum number of iterations before ovel numerical stochastic optimization algorithm inspired the nonlinear modulation index usually set to, G algorithm, Or is the Sd at the present tin ep and n is Invasive weed optimization (IwO) is relatively recently 4- Competitive exclusion: After iterations. the from colonizing weeds that is designed and developed first number of plants in a colony will reach its maximum(pm in 14, Weeds are plants whose vigorous, invasive habits of by fast reproduction. However, it is expected that the fitter plants have been reproduced more than undesirable ones. reproduction and distribution, robustness and adaptation to Thus, the final step is to eliminate the inappropriate and the changes in the environment are some of the interesting characteristics have seen in natural behaviors of weeds that weaker plants in a competitive manner for limiting the have been inspired and used in this optimization algorithm. maximum number of plants The algorithm can be summarized in the following four continues until maximum iterations or some other stopping steps [4] criteria are reached and the plant with the best fitness is 1-Initialize a population: A number of seeds Selected as the optimal solution. composing initial population are In [4 the effectiveness of this algorithm has been tested over the problem space. dispread randomly on three simple benchmark examples related to finding the 2- Reproduction: Every seed that has grown to ney minimum value of Sphere, Griewank and Rastrigin plants is allowed to produce other seeds depending on its ions. But to the best of our knowledge. it has not been fitness. In the simple case, the number of seeds each plant applied to any specific application yet. can produce increases linearly from minimum possible seed corresponding to minimum fitness to the maximum number of seeds corresponding to maximum fitness in the population II. COLLABORATIVE FILTERING as illustrated in fig. 1 Recommender systems try to help users find items they No of seeds would like by producing a predicted likeliness score or a list max no, of of top-N recommended items. Recommendations can be based on demographic information of overall top lling items, past buying or visiting behavior of users as ar indicator for future tendencies Collaborative Filtering(CF) is one of the methods for generating recommendations when enough past behaviors and interests of users are known. Recommender systems based on CF are still the most commonly adopted systems Plant's both commercial and research domain The basic idea of CF-based algorithms is to colony fitness recommendations or predictions based on the opinions of Fig. l Seed production procedure in a colony of weeds 1](lower fitness other like-minded and similar users. Opinions can be better situation explicitly specified by the user as a rating score within a 3- Spatial dispersal: The produced seeds in the previous certain numerical scale or implicitly derived from different step are being distributed randomly in the problem space by purchasing/visiting behaviors of the user normal distribution with mean zero and a variance paramet In general, CF-based algorithms fall in two main decreasing over time. By setting the mean parameter equal categories, namely user-based collaborative filtering and to zero, the seeds are distributed randomly such that they item-based collaborative fltering. The former relies on locate near to the parent plant and by decreasing the variance finding similar users based on similarity of interests in items over time, the fitter plants are grouped together and visited in common and the later works by finding similar inappropriate plants are eliminated over time. The standard items based on the entire similarity of the ratings made by deviation(SD)which is the root square of the variance of users on these items nis distribution is calculated in every time step as shown in User-based algorithms are more popular due to their simplicity while maintaining quality of recommend According to two survey papers on CF [5],[6], Pearson 2007 IEEE Congress on Evolutionary Computation(CEC 2007)
LQUHFRPPHQGHUV\VWHPV6HFWLRQ,9GHVFULEHVKRZWKH,:2 DOJRULWKP LV HPSOR\HG LQ RXU SURSRVHG UHFRPPHQGHU V\VWHP 6HFWLRQ 9 WKHQ EULHIO\ UHSRUWV RQ VRPH FRPPRQ HYDOXDWLRQPHWULFV6HFWLRQ9,SURYLGHVH[SHULPHQWDOUHVXOWV DQGDQDO\VLV)LQDOO\VHFWLRQ9,,FRQFOXGHVWKHSDSHU ,, ,19$6,9(:(('237,0,=$7,21 ,QYDVLYH ZHHG RSWLPL]DWLRQ ,:2 LV UHODWLYHO\ UHFHQWO\ QRYHO QXPHULFDO VWRFKDVWLF RSWLPL]DWLRQ DOJRULWKP LQVSLUHG IURP FRORQL]LQJZHHGVWKDWLV GHVLJQHG DQG GHYHORSHG ILUVW LQ >@:HHGVDUHSODQWVZKRVHYLJRURXVLQYDVLYHKDELWVRI JURZWK SRVH VHULRXV WKUHDW WR GHVLUDEOH SODQWV )DVW UHSURGXFWLRQ DQG GLVWULEXWLRQ UREXVWQHVV DQG DGDSWDWLRQ WR WKH FKDQJHVLQWKH HQYLURQPHQW DUH VRPH RIWKHLQWHUHVWLQJ FKDUDFWHULVWLFV KDYH VHHQLQQDWXUDO EHKDYLRUV RIZHHGVWKDW KDYHEHHQLQVSLUHGDQGXVHGLQWKLVRSWLPL]DWLRQDOJRULWKP 7KH DOJRULWKP FDQ EH VXPPDUL]HG LQ WKH IROORZLQJ IRXU VWHSV>@ ,QLWLDOL]H D SRSXODWLRQ $ ILQLWH QXPEHU RI VHHGV FRPSRVLQJ LQLWLDO SRSXODWLRQ DUH EHLQJ GLVSUHDG UDQGRPO\ RYHUWKHSUREOHPVSDFH 5HSURGXFWLRQ (YHU\ VHHG WKDW KDV JURZQ WR QHZ SODQWV LV DOORZHG WR SURGXFH RWKHU VHHGV GHSHQGLQJ RQ LWV ILWQHVV ,QWKH VLPSOH FDVHWKH QXPEHU RI VHHGV HDFK SODQW FDQSURGXFHLQFUHDVHVOLQHDUO\IURPPLQLPXPSRVVLEOHVHHG FRUUHVSRQGLQJWRPLQLPXP ILWQHVVWRWKHPD[LPXPQXPEHU RIVHHGVFRUUHVSRQGLQJWRPD[LPXPILWQHVVLQWKHSRSXODWLRQ DVLOOXVWUDWHGLQILJ )LJ 6HHG SURGXFWLRQ SURFHGXUH LQ D FRORQ\ RI ZHHGV >@ ORZHU ILWQHVV PHDQVEHWWHUVLWXDWLRQ 6SDWLDOGLVSHUVDO7KHSURGXFHGVHHGVLQWKHSUHYLRXV VWHSDUHEHLQJGLVWULEXWHGUDQGRPO\LQWKHSUREOHPVSDFHE\ QRUPDOGLVWULEXWLRQZLWKPHDQ]HURDQGDYDULDQFHSDUDPHWHU GHFUHDVLQJ RYHUWLPH%\ VHWWLQJWKHPHDQ SDUDPHWHU HTXDO WR ]HUR WKH VHHGV DUH GLVWULEXWHG UDQGRPO\ VXFK WKDW WKH\ ORFDWHQHDUWRWKHSDUHQWSODQWDQGE\GHFUHDVLQJWKHYDULDQFH RYHU WLPH WKH ILWWHU SODQWV DUH JURXSHG WRJHWKHU DQG LQDSSURSULDWH SODQWV DUH HOLPLQDWHG RYHUWLPH7KH VWDQGDUG GHYLDWLRQ 6' ZKLFK LV WKH URRW VTXDUH RI WKH YDULDQFH RI WKLVGLVWULEXWLRQLVFDOFXODWHGLQHYHU\WLPHVWHSDVVKRZQLQ HTXDWLRQ Q LQLW ILQDO ILQDO Q LWHU LWHU LWHU LWHU V V V V PD[ PD[ :KHUHV LQLW DQGV ILQDO DUHWKHLQLWLDODQGILQDOYDOXHRI 6' IRU WKH QRUPDO GLVWULEXWLRQ UHVSHFWLYHO\ PD[ LWHU LV WKH PD[LPXP QXPEHU RI LWHUDWLRQV EHIRUH VWRSSLQJ WKH DOJRULWKP V LWHU LVWKH6'DWWKHSUHVHQWWLPH VWHSDQGQLV WKHQRQOLQHDUPRGXODWLRQLQGH[XVXDOO\VHWWR &RPSHWLWLYH H[FOXVLRQ $IWHU VRPH LWHUDWLRQV WKH QXPEHURISODQWVLQDFRORQ\ZLOOUHDFKLWVPD[LPXPSPD[ E\ IDVW UHSURGXFWLRQ+RZHYHULWLV H[SHFWHGWKDWWKH ILWWHU SODQWV KDYH EHHQ UHSURGXFHG PRUH WKDQ XQGHVLUDEOH RQHV 7KXV WKH ILQDO VWHS LV WR HOLPLQDWH WKH LQDSSURSULDWH DQG ZHDNHU SODQWV LQ D FRPSHWLWLYH PDQQHU IRU OLPLWLQJ WKH PD[LPXP QXPEHU RI SODQWV LQ D FRORQ\ 7KH SURFHVV FRQWLQXHV XQWLOPD[LPXPLWHUDWLRQV RU VRPH RWKHU VWRSSLQJ FULWHULD DUH UHDFKHG DQG WKH SODQW ZLWK WKH EHVW ILWQHVV LV VHOHFWHGDVWKHRSWLPDOVROXWLRQ ,Q >@WKH HIIHFWLYHQHVV RIWKLV DOJRULWKP KDV EHHQWHVWHG RQWKUHH VLPSOH EHQFKPDUN H[DPSOHV UHODWHGWR ILQGLQJWKH PLQLPXP YDOXH RI 6SKHUH *ULHZDQN DQG 5DVWULJLQ IXQFWLRQV%XWWRWKHEHVWRIRXUNQRZOHGJHLWKDVQRWEHHQ DSSOLHGWRDQ\VSHFLILFDSSOLFDWLRQ\HW ,,, &2//$%25$7,9(),/7(5,1* 5HFRPPHQGHU V\VWHPV WU\ WR KHOS XVHUV ILQG LWHPV WKH\ ZRXOGOLNHE\SURGXFLQJDSUHGLFWHGOLNHOLQHVVVFRUHRUDOLVW RI WRS1 UHFRPPHQGHG LWHPV 5HFRPPHQGDWLRQV FDQ EH EDVHG RQ GHPRJUDSKLF LQIRUPDWLRQ RI XVHUV RYHUDOO WRS VHOOLQJLWHPVSDVWEX\LQJRUYLVLWLQJEHKDYLRURIXVHUVDVDQ LQGLFDWRUIRUIXWXUHWHQGHQFLHV &ROODERUDWLYH )LOWHULQJ &) LV RQH RI WKH PHWKRGV IRU JHQHUDWLQJ UHFRPPHQGDWLRQV ZKHQ HQRXJK SDVW EHKDYLRUV DQG LQWHUHVWV RI XVHUV DUH NQRZQ 5HFRPPHQGHU V\VWHPV EDVHGRQ&)DUHVWLOOWKHPRVWFRPPRQO\DGRSWHGV\VWHPVLQ ERWKFRPPHUFLDODQGUHVHDUFKGRPDLQ 7KH EDVLF LGHD RI &)EDVHG DOJRULWKPV LV WR SURYLGH UHFRPPHQGDWLRQV RU SUHGLFWLRQV EDVHG RQ WKH RSLQLRQV RI RWKHU OLNHPLQGHG DQG VLPLODU XVHUV 2SLQLRQV FDQ EH H[SOLFLWO\ VSHFLILHG E\ WKH XVHU DV D UDWLQJ VFRUH ZLWKLQ D FHUWDLQ QXPHULFDO VFDOH RULPSOLFLWO\ GHULYHG IURP GLIIHUHQW SXUFKDVLQJYLVLWLQJEHKDYLRUVRIWKHXVHU ,Q JHQHUDO &)EDVHG DOJRULWKPV IDOO LQ WZR PDLQ FDWHJRULHV QDPHO\ XVHUEDVHG FROODERUDWLYH ILOWHULQJ DQG LWHPEDVHG FROODERUDWLYH ILOWHULQJ 7KH IRUPHU UHOLHV RQ ILQGLQJVLPLODUXVHUVEDVHGRQVLPLODULW\RILQWHUHVWVLQLWHPV YLVLWHG LQ FRPPRQ DQG WKH ODWHU ZRUNV E\ ILQGLQJ VLPLODU LWHPV EDVHG RQWKH HQWLUH VLPLODULW\ RIWKH UDWLQJVPDGH E\ XVHUVRQWKHVHLWHPV 8VHUEDVHG DOJRULWKPV DUH PRUH SRSXODU GXH WR WKHLU VLPSOLFLW\ ZKLOH PDLQWDLQLQJ TXDOLW\ RI UHFRPPHQGDWLRQV $FFRUGLQJ WR WZR VXUYH\ SDSHUV RQ &) >@ >@ 3HDUVRQ PLQILWQHVV LQWKH FRORQ\ PD[ILWQHVV LQWKH FRORQ\ 3ODQW V ILWQHVV PD[QRRI VHHGV PLQQRRI VHHGV IORRU QRRIVHHGV 1RRIVHHGV 4298 2007 IEEE Congress on Evolutionary Computation (CEC 2007)
correlation algorithm had best accuracy among different CF- next two subsections. More detailed descriptions can be based algorithms at that time on different available test data. found in [1] though, since then some new algorithms with better mparable results have been proposed, we choose user A. Using active users for generatin based Pearson correlation algorithm as a standard method for comparison due to its simplicity and higher popularity The main formula of this method which calculates the In real world, there may be near one millions of users in a similarity of users to find similar users(neighbors)is pecific recommender system. It is almost impossible to test this entire large database of user profiles for finding similar ∑(R-瓦R,-R sers. Thus, instead of expecting to find all similar users for each test user, the system should search for similar users (2) only in the space containing the best potential similar users ∑(R-R)∑(R Detecting and using active users in a system could be one of the options in defining best potential similar users. Active users are users with more activity in the system and the sers u and v is calculated and probability of finding similarity between each test user and R. and R ratings of these users over all at least one of these active users is more. In tering algorithms, the base comparison criterion between their rated items respectively. Rui and Rr. i denote the rating users is their common visited items and the more items a of these users on item i and the sum is calculated over all common visited items of these users user has rated the greater chance she/he has to be clos manv users. Finally the rating for test user(u)on item i is predicted as In [1] we have shown the he effectiveness of selec follows [51 similar users from the set of active users. In that work. we have shown that by using active users in finding similar PR(u,=k*>sim(u, v)*RR(v, i) ()users phase, the accuracy of predicting the users interests Where the entire database of users usually used in other works k is a normalizing factor such that the sum of the luding [2,3]. Thus, here we choose the same approach similarities is equal to 1 in selecting the similar users for getting better results in all v is a user in neighborhood of user tl the methods implemented B. Using IwO algorithm for finding users' preferences IV. IWO ALGORITHM IN RECOMMENDER SYSTEM DOMAIN Most CF-based algorithms and in particular, user-based As described in the previous section, the main idea of CF- Pearson correlation algorithm, which was described, rely based algorithms is utilizing of neighbor users for only on past behavior/ratings of users for comparing users each test user to suggest new items to her/him. Hence, the Thus, the similarity definition is based on ratings of those main step in these algorithms is how to define similarity items that both users have rated in the past. To find really between each pair of users an how to select similar or closer neighbors, as proposed in one of the early works in neighbor users. It is very vital that only the best and really recommender systems domain [7, other users closest users be used in generating new recommendations characteristics like demographic information of users, could for each user. For this purpose, we have used two simple be an important source of understanding users'interests approaches in our previous work [1] to find the more similar Reference [8], also emphasizes incorporating user's users for each test user and thereby increase the prediction personality or lifestyle concept into the user-profile capability of the recommender system. The main idea of the matching process of a recommender system in first approach is detecting and making use of "active users" personalized television advertisement domain. The actual to increase the probability of finding more similar users and lifestyle information available in recommender systems is ving more information about users' interests. The second restricted to demographic information, and they focus on approach is based on [2][3] and is the core of the system only this information. The main motivation of their work is where the IwO algorithm is applied. In this approach, it is to rely on demographic information of users for finding tried to implement more personalized recommendation by similar users, instead only relying on past ratings of assigning different priority importance to each of the visited items. In this manner, the sparsity problem could be Teatures of the user-profile for different users. These two alleviated(i.e. sparsity problem refers to situation when a few items ratings out of a large number of ser are available). Information about prey rated items Test user or active user is the user for whom the system is generating for each user is used only after neighbo bor selection and for 2007 IEEE Congress on Evolutionary Computation(CEC 2007) 4299
FRUUHODWLRQDOJRULWKPKDGEHVWDFFXUDF\DPRQJGLIIHUHQW&) EDVHGDOJRULWKPVDWWKDWWLPHRQGLIIHUHQWDYDLODEOHWHVWGDWD $OWKRXJK VLQFH WKHQ VRPH QHZ DOJRULWKPV ZLWK EHWWHU FRPSDUDEOH UHVXOWV KDYH EHHQ SURSRVHG ZH FKRRVH XVHU EDVHG 3HDUVRQ FRUUHODWLRQ DOJRULWKP DV D VWDQGDUG PHWKRG IRU FRPSDULVRQ GXH WR LWV VLPSOLFLW\ DQG KLJKHU SRSXODULW\ 7KH PDLQ IRUPXOD RI WKLV PHWKRG ZKLFK FDOFXODWHV WKH VLPLODULW\RIXVHUVWRILQGVLPLODUXVHUVQHLJKERUVLV ¦ ¦ ¦ ] L ] L X L X Y L Y ] L X L X Y L Y 5 5 5 5 5 5 5 5 VLP X Y +HUH WKH VLPLODULW\ RI XVHUV X DQG Y LV FDOFXODWHG DQG 5 X DQG 5Y DUH WKH DYHUDJH UDWLQJV RI WKHVH XVHUV RYHU DOO WKHLU UDWHGLWHPV UHVSHFWLYHO\5XLDQG5YL GHQRWHWKH UDWLQJ RI WKHVH XVHUV RQ LWHP L DQGWKH VXP LV FDOFXODWHG RYHU DOO FRPPRQYLVLWHGLWHPVRIWKHVHXVHUV )LQDOO\WKHUDWLQJIRUWHVWXVHUXRQLWHPLLVSUHGLFWHGDV IROORZV>@ 35XL ¦ ] Y N VLP X Y 55 Y L :KHUH N LV D QRUPDOL]LQJ IDFWRU VXFK WKDW WKH VXP RI WKH VLPLODULWLHVLVHTXDOWR YLVDXVHULQQHLJKERUKRRGRIXVHUX ]LVWKHVL]HRIQHLJKERUKRRG ,9 ,:2$/*25,7+0,15(&200(1'(56@WRILQGWKHPRUHVLPLODU XVHUV IRU HDFKWHVW XVHU DQGWKHUHE\LQFUHDVHWKH SUHGLFWLRQ FDSDELOLW\RIWKHUHFRPPHQGHUV\VWHP7KHPDLQLGHDRIWKH ILUVWDSSURDFKLVGHWHFWLQJDQGPDNLQJXVHRIDFWLYHXVHUV WRLQFUHDVHWKHSUREDELOLW\RIILQGLQJPRUHVLPLODUXVHUVDQG KDYLQJPRUHLQIRUPDWLRQ DERXW XVHUV LQWHUHVWV7KH VHFRQG DSSURDFKLV EDVHG RQ >@ >@ DQGLVWKH FRUH RIWKH V\VWHP ZKHUHWKH ,:2 DOJRULWKPLV DSSOLHG ,QWKLV DSSURDFKLWLV WULHG WR LPSOHPHQW PRUH SHUVRQDOL]HG UHFRPPHQGDWLRQ E\ DVVLJQLQJ GLIIHUHQW SULRULW\ LPSRUWDQFH WR HDFK RI WKH IHDWXUHV RI WKH XVHUSURILOH IRU GLIIHUHQW XVHUV 7KHVH WZR FRPSOHPHQWDU\ DSSURDFKHV ZLOO EH GLVFXVVHG EULHIO\ LQ WKH 7HVW XVHU RU DFWLYH XVHU LV WKH XVHU IRU ZKRP WKH V\VWHP LV JHQHUDWLQJ UHFRPPHQGDWLRQV QH[W WZR VXEVHFWLRQV 0RUH GHWDLOHG GHVFULSWLRQV FDQ EH IRXQGLQ>@ $ 8VLQJDFWLYHXVHUVIRUJHQHUDWLQJSRWHQWLDOVLPLODU XVHUV ,QUHDOZRUOGWKHUHPD\EHQHDURQHPLOOLRQVRIXVHUVLQD VSHFLILFUHFRPPHQGHUV\VWHP,WLVDOPRVWLPSRVVLEOHWRWHVW WKLVHQWLUHODUJHGDWDEDVHRIXVHUSURILOHVIRUILQGLQJVLPLODU XVHUV7KXVLQVWHDGRIH[SHFWLQJWRILQGDOOVLPLODUXVHUVIRU HDFK WHVW XVHU WKH V\VWHP VKRXOG VHDUFK IRU VLPLODU XVHUV RQO\LQWKHVSDFHFRQWDLQLQJWKHEHVWSRWHQWLDOVLPLODUXVHUV 'HWHFWLQJDQGXVLQJDFWLYHXVHUVLQDV\VWHPFRXOGEHRQHRI WKH RSWLRQV LQ GHILQLQJ EHVW SRWHQWLDO VLPLODU XVHUV $FWLYH XVHUV DUH XVHUV ZLWK PRUH DFWLYLW\ LQ WKH V\VWHP DQG WKH SUREDELOLW\ RI ILQGLQJ VLPLODULW\ EHWZHHQHDFKWHVW XVHUDQG DW OHDVW RQH RI WKHVH DFWLYH XVHUV LV PRUH ,Q FROODERUDWLYH ILOWHULQJ DOJRULWKPV WKH EDVH FRPSDULVRQ FULWHULRQ EHWZHHQ XVHUV LV WKHLU FRPPRQ YLVLWHG LWHPV DQG WKH PRUH LWHPV D XVHU KDV UDWHG WKH JUHDWHU FKDQFH VKHKH KDV WR EH FORVH WR PDQ\XVHUV ,Q >@ ZH KDYH VKRZQ WKH HIIHFWLYHQHVV RI VHOHFWLQJ VLPLODU XVHUV IURPWKH VHW RI DFWLYH XVHUV ,QWKDWZRUNZH KDYH VKRZQ WKDW E\ XVLQJ DFWLYH XVHUV LQ ILQGLQJ VLPLODU XVHUV SKDVH WKH DFFXUDF\ RI SUHGLFWLQJ WKH XVHUV LQWHUHVWV FDQ EHLQFUHDVHGFRPSDUHGWR IL[HG RU UDQGRP VDPSOLQJ RI WKH HQWLUH GDWDEDVH RI XVHUV XVXDOO\ XVHG LQ RWKHU ZRUNV LQFOXGLQJ >@>@7KXVKHUHZHFKRRVHWKHVDPHDSSURDFK LQ VHOHFWLQJWKH VLPLODUXVHUV IRUJHWWLQJEHWWHU UHVXOWVLQDOO WKHPHWKRGVLPSOHPHQWHG % 8VLQJ,:2DOJRULWKPIRUILQGLQJXVHUV SUHIHUHQFHV 0RVW &)EDVHG DOJRULWKPV DQG LQ SDUWLFXODU XVHUEDVHG 3HDUVRQ FRUUHODWLRQ DOJRULWKP ZKLFK ZDV GHVFULEHG UHO\ RQO\ RQ SDVW EHKDYLRUUDWLQJV RI XVHUV IRU FRPSDULQJ XVHUV 7KXV WKH VLPLODULW\ GHILQLWLRQ LV EDVHG RQ UDWLQJV RI WKRVH LWHPVWKDW ERWK XVHUV KDYH UDWHGLQWKH SDVW 7R ILQG UHDOO\ FORVHU QHLJKERUV DV SURSRVHG LQ RQH RI WKH HDUO\ ZRUNV LQ UHFRPPHQGHU V\VWHPV GRPDLQ >@ RWKHU XVHUV¶ FKDUDFWHULVWLFVOLNHGHPRJUDSKLFLQIRUPDWLRQRIXVHUVFRXOG EH DQ LPSRUWDQW VRXUFH RI XQGHUVWDQGLQJ XVHUV¶ LQWHUHVWV 5HIHUHQFH >@ DOVR HPSKDVL]HV LQFRUSRUDWLQJ XVHU¶V SHUVRQDOLW\ RU OLIHVW\OH FRQFHSW LQWR WKH XVHUSURILOH PDWFKLQJ SURFHVV RI D UHFRPPHQGHU V\VWHP LQ D SHUVRQDOL]HG WHOHYLVLRQ DGYHUWLVHPHQW GRPDLQ 7KH DFWXDO OLIHVW\OH LQIRUPDWLRQ DYDLODEOH LQ UHFRPPHQGHU V\VWHPV LV UHVWULFWHG WR GHPRJUDSKLF LQIRUPDWLRQ DQG WKH\ IRFXV RQ RQO\WKLVLQIRUPDWLRQ7KHPDLQPRWLYDWLRQRIWKHLUZRUNLV WR UHO\ RQ GHPRJUDSKLF LQIRUPDWLRQ RI XVHUV IRU ILQGLQJ VLPLODU XVHUV LQVWHDG RI RQO\ UHO\LQJ RQ SDVW UDWLQJV RI YLVLWHGLWHPV ,QWKLVPDQQHUWKHVSDUVLW\SUREOHPFRXOGEH DOOHYLDWHG LH VSDUVLW\ SUREOHP UHIHUV WR VLWXDWLRQ ZKHQ D IHZ LWHPV UDWLQJV RXW RI D ODUJH QXPEHU RI LWHPV IRU HDFK XVHUDUHDYDLODEOH,QIRUPDWLRQDERXWSUHYLRXVO\UDWHGLWHPV IRU HDFK XVHU LV XVHG RQO\ DIWHU QHLJKERU VHOHFWLRQ DQG IRU 2007 IEEE Congress on Evolutionary Computation (CEC 2007) 4299
predicting the test users ratings. However, it m choosing the k top nearest users (K is the size of legitimate decision to rely on only demographic neighborhood) or by choosing those who have a distance r finding similar users, especially when less than a specified threshold. In this work, we selectee information about how users rated differe users based on the second approach, considering all users who have a distance less than half of the average distance o o. In [2]and [3], new approaches very similar to [8] are all the users to the test user as similar users to this user demographic information(fixed part)and rating information optimization algorithms like IwO, Pso and ga to a n a specific item(variable part). Fig. 2 shows an overview problem is what best solutions are and how to define the of user- profile in these approaches. This profile is the base fitness function. We chose the fitness function as the same as of comparison of each pair of users the function described in [2],[3]. In these works, the fitness function is defined as a classification problem and the ating demographic demographic demographic average number of correct predicted ratings for items in the as At first all the rated items for the test user is partitioned Fig 2. Profile(u, i-profile for user u with rating on item i and the training and test sets and then neighbor users are selected equation (4) based on a pa solution in the optimization algorithm) and items in training t The main idea of these works is that not only set. Finally, based on the opinions of neighbors found and by corporating demographic information of users in profile- matching process of CF-based algorithms is important, but applying equation(3), the rating for each item in training set is calculated. These predicted ratings on training set should also the importance of each feature of a user-profile is not be used according to the evaluation metric to describe the The motivation behind this idea is that"different users place V. EVALUATION METRICS different importance or priority on each feature of the user profile. For example if a male user prefers to be given Several collaborative filtering evaluation metrics have recommendations based on the opinions of other men, ther his feature weight for gender would be higher than other been proposed in literatures mostly concerning predictive features] accuracy metrics. These metrics measure how predicted In order to find the weight associated to each component ratings are close to actual user ratings. In the following of user-profile, two different bio-inspired optimization subsections we briefly describe some of these measures algorithms namely Genetic Algorithm [3 and Particle A. Zero tolerance and At-Most-One tolerance Swarm Optimization [2] have been proposed Here, we apply Invasive Weed Optimization (Two) These two metrics are the metrics proposed and used in algorithm [4] for the same purpose with some little changes [21, [3] and because these works are related to the approach in selecting the potential similar users as described in adopted in this paper, we briefly describe them here Zero tolerance calculates the number of ratings that the After the optimal weights have been found, the two system predicted correctly out of the total number of items profiles are compared according to equation (4)based on rated by user. At-Most-One tolerance is the same as zero Euclidean distance of two profiles tolerance, except that it tolerates to at most one difference etween the actual and predicted ratings to compensate the ds(u,)=,∑∑wr*d,(u,y (4) effect of vulnerability of perfect number matching of zero olerance. None of these methods consider the actual difference of incorrect ratings and only cour u is the test user and v is the user who t user i(a≠y) B. Root Mean Square error is the number of common items which both users have rated The most widely used metrics in accuracy metrIcs are gis the test user's weights for feature_f mean absolute error (MAE) and root mean square error diffu, y is the difference in profile value for feature(RMSE)metrics. MaE considers only the absolute value of between users u and v on the item i the difference of predicted and real ratings, but RMsE After calculating the distance of each test user with all squares the error before summing, thereby considers large users, like standard user-based Pearson algorithm, the most values of errors more pronounced than small values similar users are selected for generating recommendations Equation(5)shows the calculation of RMSE measure for the test user. The most similar users can be selected by 4300 2007 IEEE Congress on Evolutionary Computation(CEC 2007)
SUHGLFWLQJWKHWHVW XVHU¶V UDWLQJV+RZHYHULWPD\ QRW EHD OHJLWLPDWHGHFLVLRQWRUHO\RQRQO\GHPRJUDSKLFLQIRUPDWLRQ IRU ILQGLQJ VLPLODU XVHUV HVSHFLDOO\ ZKHQ D VHULHV RI LQIRUPDWLRQ DERXW KRZ XVHUV UDWHG GLIIHUHQW LWHPV LV DYDLODEOH ,Q >@ DQG >@ QHZ DSSURDFKHV YHU\ VLPLODU WR >@ DUH SURSRVHG ,QWKHVHZRUNVHDFKXVHUSURILOHFRQVLVWVRIERWK GHPRJUDSKLFLQIRUPDWLRQIL[HGSDUWDQGUDWLQJLQIRUPDWLRQ RQDVSHFLILFLWHPYDULDEOHSDUW)LJVKRZVDQRYHUYLHZ RIXVHUSURILOHLQWKHVHDSSURDFKHV7KLVSURILOHLVWKHEDVH RIFRPSDULVRQRIHDFKSDLURIXVHUV UDWLQJ GHPRJUDSKLF IHDWXUH GHPRJUDSKLF IHDWXUH GHPRJUDSKLF IHDWXUHQ )LJ 3URILOH X L SURILOH IRU XVHU X ZLWK UDWLQJ RQ LWHP L DQG GHPRJUDSKLFIHDWXUHVQ 7KH PDLQ LGHD RI WKHVH ZRUNV LV WKDW QRW RQO\ LQFRUSRUDWLQJ GHPRJUDSKLF LQIRUPDWLRQ RI XVHUV LQ SURILOH PDWFKLQJ SURFHVV RI &)EDVHG DOJRULWKPV LV LPSRUWDQW EXW DOVRWKHLPSRUWDQFH RI HDFK IHDWXUH RI D XVHUSURILOHLV QRW WKH VDPH IRU DOO XVHUV DQG GLIIHUHQW LPSRUWDQFH ZHLJKWLQJ VKRXOGEHDVVLJQHGWRWKHVHIHDWXUHVLQFOXGLQJUDWLQJIHDWXUH 7KHPRWLYDWLRQEHKLQGWKLVLGHDLVWKDW³GLIIHUHQWXVHUVSODFH GLIIHUHQWLPSRUWDQFHRUSULRULW\RQHDFK IHDWXUHRIWKHXVHU SURILOH )RU H[DPSOH LI D PDOH XVHU SUHIHUV WR EH JLYHQ UHFRPPHQGDWLRQVEDVHG RQWKH RSLQLRQVRI RWKHUPHQWKHQ KLV IHDWXUH ZHLJKW IRU JHQGHU ZRXOG EH KLJKHU WKDQ RWKHU IHDWXUHV´>@ ,QRUGHUWR ILQGWKHZHLJKWDVVRFLDWHGWRHDFKFRPSRQHQW RI XVHUSURILOH WZR GLIIHUHQW ELRLQVSLUHG RSWLPL]DWLRQ DOJRULWKPV QDPHO\ *HQHWLF $OJRULWKP >@ DQG 3DUWLFOH 6ZDUP2SWLPL]DWLRQ>@KDYHEHHQSURSRVHG +HUH ZH DSSO\ ,QYDVLYH :HHG 2SWLPL]DWLRQ ,:2 DOJRULWKP>@IRUWKHVDPHSXUSRVHZLWKVRPHOLWWOHFKDQJHV LQ VHOHFWLQJ WKH SRWHQWLDO VLPLODU XVHUV DV GHVFULEHG LQ SUHYLRXVVXEVHFWLRQDQGLQWKHHYDOXDWLRQFULWHULD $IWHU WKH RSWLPDO ZHLJKWV KDYH EHHQ IRXQG WKH WZR SURILOHV DUH FRPSDUHG DFFRUGLQJ WR HTXDWLRQ EDVHG RQ (XFOLGHDQGLVWDQFHRIWZRSURILOHV ¦¦ ] L Q I I L I GLVW X Y Z GLII X Y :KHUH XLVWKHWHVWXVHUDQGYLVWKHXVHUZKRPD\EHDQHLJKERU RIXVHUXX z Y ]LVWKH QXPEHU RIFRPPRQLWHPVZKLFK ERWK XVHUV KDYH UDWHG ZILVWKHWHVWXVHU¶VZHLJKWVIRUIHDWXUHI GLIILIXY LV WKH GLIIHUHQFH LQ SURILOH YDOXH IRU IHDWXUH I EHWZHHQXVHUVXDQGYRQWKHLWHPL $IWHU FDOFXODWLQJ WKH GLVWDQFH RI HDFK WHVW XVHU ZLWK DOO XVHUVOLNH VWDQGDUG XVHUEDVHG3HDUVRQDOJRULWKPWKHPRVW VLPLODU XVHUV DUH VHOHFWHG IRU JHQHUDWLQJ UHFRPPHQGDWLRQV IRUWKHWHVW XVHU7KHPRVW VLPLODU XVHUVFDQ EH VHOHFWHGE\ FKRRVLQJ WKH . WRS QHDUHVW XVHUV . LV WKH VL]H RI QHLJKERUKRRG RU E\ FKRRVLQJ WKRVH ZKR KDYH D GLVWDQFH OHVV WKDQ D VSHFLILHG WKUHVKROG ,Q WKLV ZRUN ZH VHOHFWHG XVHUV EDVHG RQ WKH VHFRQG DSSURDFK FRQVLGHULQJ DOO XVHUV ZKRKDYHDGLVWDQFHOHVVWKDQKDOIRIWKHDYHUDJHGLVWDQFHRI DOOWKHXVHUVWRWKHWHVWXVHUDVVLPLODUXVHUVWRWKLVXVHU 7KH PDLQ TXHVWLRQ ZKHQ DSSO\LQJ WKH ILWQHVV EDVHG RSWLPL]DWLRQ DOJRULWKPV OLNH ,:2 362 DQG *$ WR D SUREOHP LV ZKDW EHVW VROXWLRQV DUH DQG KRZ WR GHILQH WKH ILWQHVVIXQFWLRQ:HFKRVHWKHILWQHVVIXQFWLRQDVWKHVDPHDV WKHIXQFWLRQGHVFULEHGLQ>@>@,QWKHVHZRUNVWKHILWQHVV IXQFWLRQ LV GHILQHG DV D FODVVLILFDWLRQ SUREOHP DQG WKH DYHUDJHQXPEHURIFRUUHFWSUHGLFWHGUDWLQJVIRULWHPVLQWKH WUDLQLQJVHWLVXVHGDVILWQHVVVFRUHIRUHDFKVHWRISULRULWLHV $WILUVWDOOWKHUDWHGLWHPVIRUWKHWHVWXVHULVSDUWLWLRQHGLQWR WKHWUDLQLQJDQGWHVWVHWVDQGWKHQQHLJKERUXVHUVDUHVHOHFWHG E\ HTXDWLRQ EDVHG RQ D SDUWLFXODU SULRULW\ VHW HDFK VROXWLRQLQWKHRSWLPL]DWLRQDOJRULWKPDQGLWHPVLQWUDLQLQJ VHW)LQDOO\EDVHGRQWKHRSLQLRQVRIQHLJKERUVIRXQGDQGE\ DSSO\LQJHTXDWLRQWKHUDWLQJIRUHDFKLWHPLQWUDLQLQJVHW LVFDOFXODWHG7KHVHSUHGLFWHG UDWLQJVRQWUDLQLQJ VHW VKRXOG EH XVHG DFFRUGLQJ WR WKH HYDOXDWLRQ PHWULF WR GHVFULEH WKH ILQDOILWQHVVYDOXHIRUHDFKVROXWLRQ 9 (9$/8$7,210(75,&6 6HYHUDO FROODERUDWLYH ILOWHULQJ HYDOXDWLRQ PHWULFV KDYH EHHQ SURSRVHG LQ OLWHUDWXUHV PRVWO\ FRQFHUQLQJ SUHGLFWLYH DFFXUDF\ PHWULFV 7KHVH PHWULFV PHDVXUH KRZ SUHGLFWHG UDWLQJV DUH FORVH WR DFWXDO XVHU UDWLQJV ,Q WKH IROORZLQJ VXEVHFWLRQVZHEULHIO\GHVFULEHVRPHRIWKHVHPHDVXUHV $ =HURWROHUDQFHDQG$W0RVW2QHWROHUDQFH 7KHVH WZR PHWULFV DUH WKH PHWULFV SURSRVHG DQG XVHG LQ >@>@DQGEHFDXVHWKHVHZRUNVDUHUHODWHGWRWKHDSSURDFK DGRSWHGLQWKLVSDSHUZHEULHIO\GHVFULEHWKHPKHUH =HUR WROHUDQFH FDOFXODWHV WKH QXPEHU RI UDWLQJV WKDW WKH V\VWHP SUHGLFWHGFRUUHFWO\ RXW RIWKHWRWDO QXPEHU RILWHPV UDWHG E\ XVHU $W0RVW2QH WROHUDQFH LV WKH VDPH DV ]HUR WROHUDQFH H[FHSW WKDW LW WROHUDWHV WR DW PRVW RQH GLIIHUHQFH EHWZHHQWKH DFWXDO DQG SUHGLFWHG UDWLQJVWR FRPSHQVDWHWKH HIIHFW RI YXOQHUDELOLW\ RI SHUIHFW QXPEHU PDWFKLQJ RI ]HUR WROHUDQFH 1RQH RI WKHVH PHWKRGV FRQVLGHU WKH DFWXDO GLIIHUHQFHRILQFRUUHFWUDWLQJVDQGRQO\FRXQWWKHQXPEHURI FRUUHFWLQFRUUHFWUDWLQJV % 5RRW0HDQ6TXDUH(UURU 7KH PRVW ZLGHO\ XVHG PHWULFV LQ DFFXUDF\ PHWULFV DUH PHDQ DEVROXWH HUURU 0$( DQG URRW PHDQ VTXDUH HUURU 506(PHWULFV0$(FRQVLGHUVRQO\WKHDEVROXWHYDOXHRI WKH GLIIHUHQFH RI SUHGLFWHG DQG UHDO UDWLQJV EXW 506( VTXDUHV WKH HUURU EHIRUH VXPPLQJ WKHUHE\ FRQVLGHUV ODUJH YDOXHV RI HUURUV PRUH SURQRXQFHG WKDQ VPDOO YDOXHV (TXDWLRQVKRZVWKHFDOFXODWLRQRI506(PHDVXUH 4300 2007 IEEE Congress on Evolutionary Computation (CEC 2007)
For each test user. all the items which that user has rated (PR(u,)-RR(,i)) re split into almost 4 equal size parts and each of these 4 RMSe parts is considered in turn as test set and other remaining 3 arts are considered as training set (4-folding cross validation approach). Test items are not used in fitness re PR (u, i) and RR(u i) refer to predicted rating and evaluation of the optimization algorithm and final feature real rating value for user u on item i respectively. n is the weights computation total number of items user u has rated The predicted ratings over all test items in each of the fo test sets are computed based on equation (3). These predicted ratings for each of the test users are VI. EXPERIMENTS gainst actual ratings for that user and the evaluation is done A. Dataset based on one of the metrics described in section v. The final evaluation would be the average of the evaluation over each For experimental results, we choose the public of the four splitted test sets MovieLens' dataset usually used in recommender systems In general the user-profnle is as the profile depicted earlier research papers including all related previous works [1], [2], in fig. 2 and is consisted of rating information and demographic information available in dataset. Fig. 3 depicts This dataset has recently converted in two small and large user-profile in a special domain(movie domain) based on size versions. Small size version which is usually adapted in MovieLens dataset. In this user-protile, 19 genres research works contains 100 000 ratings in [1-5] scale for frequencies indicate the average frequency of movies genres 943 users on 1682 movies where each user has rated at least that user visited 20 items. Demographic information available for each user in this dataset is age, gender and occupation. Also, each movie is assigned to at least one of the 19 genres(genres ike western, romantic, etc, including a specific genre called m unknown genre describing movies which are not in neither The settings of the genetic Algorithm and the particle The union of all these 19 genres features along with 3 Swarm Optimization were selected very similar to their demographic features creates the profile of each test user. respective related works [2], [3] Optimization algorithms like IwO are applied to find the n each iterations of the Ga, the top 40% of the whole ormalized priority or weight of each of these total 22 population are selected randomly to be the parents of the features for each of test users next generation. Two offspring were produced from every pair of parents using single point crossover and mutation B. Experimental setups operators with probability of 1.0 and 0.01 respectively as gested The final next generation is selected To show the difference in performance of our randomly from the current generation and the offspring recommender systems which is based on IWO algor produced in the current iteration to fix the number have compared our approach with standard population. Unlike 3] which always kept a fixed number of Pearson algorithm (PA)5], and other related the best individuals of the current generation for the next works which had applied Pso [2] and GA [3] optimization iteration, in the version we implemented, individuals are algorithms for the same task selected randomly with a probability proportional to their To be comparable to these previous works, the setups of fitness values. The less the fitness value(the the experiments is chosen with little changes as the same of the more chance the individual would have to be in future one of the experiments have been done in [2]. In this generations too. We deemed this char n our xperiment, 10 users were picked from the entire database of implementation necessary to get better results. users randomly. Each of these users was considered as a test To be comparable to the ga which works with unsigned user in turn. Other 40 users were also picked to provide binary encoding of the variables, the feature weights in [2] recommendations to test users. In [2], [3] these other 40 are also selected in the range of 0 to 255 corresponding to users were picked by fixed or random selection approach, binary encoding of 8 bits for each of the feature of the user but as described previously, we chose top 40 ac this purpose to provide better recommendations as the implementation for all 3 optimization algorithms effectiveness of this approach has been shown in our The PsO algorithm was implemented in the same way revious work [ 1] ur previous work [1]. This version of the PSo algorithm has little difference with the version proposed in [2]A described in [1], in PSo there is no guarantee that particles positions will be kept inside the sea 2007 IEEE Congress on Evolutionary Computation(CEC 2007) 4301
X Q L 1 35 X L 55 X L 506( ¦ :KHUH35 XL DQG55XL UHIHUWR SUHGLFWHG UDWLQJ DQG UHDO UDWLQJ YDOXH IRU XVHU X RQLWHP L UHVSHFWLYHO\ QLVWKH QXPEHURILWHPVZKLFKSUHGLFWLRQLVGRQH IRUDQG1XLVWKH WRWDOQXPEHURILWHPVXVHUXKDVUDWHG 9, (;3(5,0(176 $ 'DWDVHW )RU H[SHULPHQWDO UHVXOWV ZH FKRRVH WKH SXEOLF 0RYLH/HQV GDWDVHW XVXDOO\ XVHG LQ UHFRPPHQGHU V\VWHPV UHVHDUFKSDSHUVLQFOXGLQJDOOUHODWHGSUHYLRXVZRUNV>@>@ >@ 7KLVGDWDVHWKDVUHFHQWO\FRQYHUWHGLQWZRVPDOODQGODUJH VL]HYHUVLRQV6PDOOVL]HYHUVLRQZKLFKLVXVXDOO\DGDSWHGLQ UHVHDUFK ZRUNV FRQWDLQV ¶ UDWLQJV LQ >@ VFDOH IRU XVHUVRQPRYLHVZKHUHHDFKXVHUKDVUDWHGDWOHDVW LWHPV 'HPRJUDSKLFLQIRUPDWLRQ DYDLODEOH IRU HDFK XVHU LQ WKLV GDWDVHW LV DJH JHQGHU DQG RFFXSDWLRQ $OVR HDFK PRYLH LV DVVLJQHG WR DW OHDVW RQH RI WKH JHQUHV JHQUHV OLNHZHVWHUQURPDQWLFHWFLQFOXGLQJDVSHFLILFJHQUHFDOOHG XQNQRZQ JHQUH GHVFULELQJPRYLHVZKLFK DUH QRWLQ QHLWKHU RIRWKHUGHWHUPLQHGJHQUHV 7KH XQLRQ RI DOO WKHVH JHQUHV IHDWXUHV DORQJ ZLWK GHPRJUDSKLF IHDWXUHV FUHDWHV WKH SURILOH RI HDFK WHVW XVHU 2SWLPL]DWLRQ DOJRULWKPV OLNH ,:2 DUH DSSOLHG WR ILQG WKH QRUPDOL]HG SULRULW\ RU ZHLJKW RI HDFK RI WKHVH WRWDO IHDWXUHVIRUHDFKRIWHVWXVHUV % ([SHULPHQWDOVHWXSV 7R VKRZ WKH GLIIHUHQFH LQ SHUIRUPDQFH RI RXU SURSRVHG UHFRPPHQGHUV\VWHPVZKLFKLVEDVHGRQ,:2DOJRULWKPZH KDYH FRPSDUHG RXU DSSURDFK ZLWK VWDQGDUG XVHUEDVHG 3HDUVRQ DOJRULWKP 3$ >@ DQG RWKHU UHODWHG SUHYLRXV ZRUNVZKLFKKDGDSSOLHG362 >@DQG*$ >@RSWLPL]DWLRQ DOJRULWKPVIRUWKHVDPHWDVN 7R EHFRPSDUDEOHWRWKHVH SUHYLRXVZRUNVWKH VHWXSV RI WKHH[SHULPHQWVLVFKRVHQZLWKOLWWOHFKDQJHVDVWKHVDPHRI RQH RI WKH H[SHULPHQWV KDYH EHHQ GRQH LQ >@ ,Q WKLV H[SHULPHQWXVHUVZHUHSLFNHGIURPWKHHQWLUHGDWDEDVHRI XVHUVUDQGRPO\(DFKRIWKHVHXVHUVZDVFRQVLGHUHGDVDWHVW XVHU LQ WXUQ 2WKHU XVHUV ZHUH DOVR SLFNHG WR SURYLGH UHFRPPHQGDWLRQV WR WHVW XVHUV ,Q >@ >@ WKHVH RWKHU XVHUV ZHUH SLFNHG E\ IL[HG RU UDQGRP VHOHFWLRQ DSSURDFK EXWDVGHVFULEHGSUHYLRXVO\ZHFKRVHWRSDFWLYHXVHUVIRU WKLV SXUSRVH WR SURYLGH EHWWHU UHFRPPHQGDWLRQV DV WKH HIIHFWLYHQHVV RI WKLV DSSURDFK KDV EHHQ VKRZQ LQ RXU SUHYLRXVZRUN>@ ZZZJURXSOHQVRUJ )RUHDFKWHVWXVHUDOOWKHLWHPVZKLFKWKDWXVHUKDVUDWHG DUH VSOLWLQWR DOPRVW HTXDO VL]H SDUWV DQG HDFK RIWKHVH SDUWVLVFRQVLGHUHGLQWXUQDVWHVWVHWDQGRWKHU UHPDLQLQJ SDUWV DUH FRQVLGHUHG DV WUDLQLQJ VHW IROGLQJ FURVV YDOLGDWLRQ DSSURDFK 7HVW LWHPV DUH QRW XVHG LQ ILWQHVV HYDOXDWLRQ RI WKH RSWLPL]DWLRQ DOJRULWKP DQG ILQDO IHDWXUH ZHLJKWVFRPSXWDWLRQ 7KHSUHGLFWHGUDWLQJVRYHUDOOWHVWLWHPVLQHDFKRIWKHIRXU WHVW VHWV DUH FRPSXWHG EDVHG RQ HTXDWLRQ 7KHVH SUHGLFWHG UDWLQJV IRU HDFK RI WKH WHVW XVHUV DUH FRPSDUHG DJDLQVWDFWXDOUDWLQJVIRUWKDWXVHUDQGWKHHYDOXDWLRQLVGRQH EDVHGRQRQHRIWKHPHWULFVGHVFULEHGLQVHFWLRQ97KHILQDO HYDOXDWLRQZRXOGEHWKHDYHUDJHRIWKHHYDOXDWLRQRYHUHDFK RIWKHIRXUVSOLWWHGWHVWVHWV ,QJHQHUDOWKHXVHUSURILOHLVDVWKHSURILOHGHSLFWHGHDUOLHU LQ ILJ DQG LV FRQVLVWHG RI UDWLQJ LQIRUPDWLRQ DQG GHPRJUDSKLFLQIRUPDWLRQDYDLODEOHLQGDWDVHW)LJGHSLFWV XVHUSURILOH LQ D VSHFLDO GRPDLQ PRYLH GRPDLQ EDVHG RQ 0RYLH/HQV GDWDVHW ,Q WKLV XVHUSURILOH JHQUHV IUHTXHQFLHVLQGLFDWHWKHDYHUDJHIUHTXHQF\RIPRYLHVJHQUHV WKDWXVHUYLVLWHG « 5DWLQJ$JH*HQGHU2FFXSDWLRQ*HQUH)UHTXHQFLHV ) )LJ8VHUSURILOHLQ0RYLH/HQVGDWDVHW>@ 7KH VHWWLQJV RI WKH *HQHWLF $OJRULWKP DQG WKH 3DUWLFOH 6ZDUP 2SWLPL]DWLRQ ZHUH VHOHFWHG YHU\ VLPLODU WR WKHLU UHVSHFWLYHUHODWHGZRUNV>@>@ ,Q HDFK LWHUDWLRQV RI WKH *$ WKH WRS RI WKH ZKROH SRSXODWLRQ DUH VHOHFWHG UDQGRPO\ WR EH WKH SDUHQWV RI WKH QH[W JHQHUDWLRQ 7ZR RIIVSULQJ ZHUH SURGXFHG IURP HYHU\ SDLU RI SDUHQWV XVLQJ VLQJOH SRLQW FURVVRYHU DQG PXWDWLRQ RSHUDWRUV ZLWK SUREDELOLW\ RI DQG UHVSHFWLYHO\ DV VXJJHVWHG LQ >@ 7KH ILQDO QH[W JHQHUDWLRQ LV VHOHFWHG UDQGRPO\ IURP WKH FXUUHQW JHQHUDWLRQ DQG WKH RIIVSULQJ SURGXFHG LQ WKH FXUUHQW LWHUDWLRQ WR IL[ WKH QXPEHU RI SRSXODWLRQ8QOLNH>@ZKLFKDOZD\VNHSWDIL[HGQXPEHURI WKH EHVW LQGLYLGXDOV RI WKH FXUUHQW JHQHUDWLRQ IRU WKH QH[W LWHUDWLRQ LQ WKH YHUVLRQ ZH LPSOHPHQWHG LQGLYLGXDOV DUH VHOHFWHG UDQGRPO\ ZLWK D SUREDELOLW\ SURSRUWLRQDO WR WKHLU ILWQHVVYDOXHV7KHOHVVWKHILWQHVVYDOXHWKHEHWWHUILWQHVV WKH PRUH FKDQFH WKH LQGLYLGXDO ZRXOG KDYH WR EH LQ IXWXUH JHQHUDWLRQV WRR :H GHHPHG WKLV FKDQJH LQ RXU LPSOHPHQWDWLRQQHFHVVDU\WRJHWEHWWHUUHVXOWV 7REHFRPSDUDEOHWRWKH*$ZKLFKZRUNVZLWKXQVLJQHG ELQDU\ HQFRGLQJ RIWKH YDULDEOHVWKH IHDWXUHZHLJKWVLQ >@ DUH DOVR VHOHFWHGLQWKH UDQJH RI WR FRUUHVSRQGLQJWR ELQDU\HQFRGLQJRIELWVIRUHDFKRIWKHIHDWXUHRIWKHXVHU SURILOH 7KXV ZH FKRVH WKH VDPH HQFRGLQJ LQ RXU LPSOHPHQWDWLRQIRUDOORSWLPL]DWLRQDOJRULWKPV 7KH362DOJRULWKPZDVLPSOHPHQWHGLQWKHVDPHZD\DV RXU SUHYLRXV ZRUN >@ 7KLV YHUVLRQ RI WKH 362 DOJRULWKP KDV OLWWOH GLIIHUHQFH ZLWK WKH YHUVLRQ SURSRVHG LQ >@ $V GHVFULEHGLQ>@LQ362WKHUHLVQRJXDUDQWHHWKDWSDUWLFOHV¶ SRVLWLRQVZLOOEHNHSWLQVLGHWKHVHDUFKVSDFHDQGSDUWLFOHV¶ 2007 IEEE Congress on Evolutionary Computation (CEC 2007) 4301
positions can take values outside their lower and upper C. Results limits. Thus to avoid this position constraint violation that ometimes even results to divergence of the algorithm, we We evaluated the GA, PSO, Iwo and Pearson approaches dded an extra constraint on position of particles with based on all the metrics described in section v. While the considering reflective behavior for particles. In this definition of zero tolerance and at-most-one tolerance ha approach, particles have to bounce back when they reach or drawbacks discussed in section V. the methods almost each of the side constraints in the search space and the bel ame manner. Thus, due to limitation in space, direction of velocity is reversed after that. Other motion we only report on evaluations based on the RMSe metric as quations are the same as the conventional PSo algorithm it is more reliable and popular evaluation metrics 10]that is the same version adopted in [2] recommender systems literatures Table I shows the settings of Iwo method used in our Results obtained from the user-based Pearson algorithm xperiments. These parameters are set as the parameters ways remain the same, but results of other three suggested in [4] or are set experimentally, but further approaches are variable due to random initialization of the research works should be investigated to study the effect of feature weights in these algorithms. Thus, we plotted the and [3] respectively unless mentioned explicitly uns was chosen as the final result as based on [2] it is ssumed that in real world scenario the system could be run TABLE I- VALUE OF TWO ALGORITHM PARAMETERS Number of initial plants in 10 offline and the best set of feature weights for each user could be saved and used during generating recommendations fo Min number of seeds him/her Max number of seeds Fig. 4 shows the results of comparison of the Iwo Initial value of standard deviation 0.1 algorithm with other approaches. The first three approaches Final value of standard deviation so and Two) are prioritized user-profile recommender systems which try to assign different priority The number of population in GA, the number of particles levels to each of the features of the profile of difterent users in PSo algorithm and the maximum number of plant based on the corresponding optimization algorithms namely, same. Although, the IWO algorithm has another paramet. Pearson user-based algorithms described in section l % ve population(pmax) relatively have the same meaning in these three optimization algorithms and were selected to be the weed optimization. The fourth algorithm(PA) is the normal which determines the initial number of plants in population, As it is derived from the fig. 4, for all users except users after reaching the maximum number of plants(Pma ), the number I and 3, the quality of the recommendations with rioritized user-profile approaches are better than PA other iterations approach. With this comparison, the effectiveness of incorporating demographic and content information in The termination criteria for all these algorithms was set to profile of users in addition to the original rating information reaching the maximum number of iterations or having some and assigning different priority levels for each features of nprovement in results. The definition of the fitness function in all these algorithms is also the same and is as the same as the fitness function described in section Iv 口GA口PsQ Table II summarizes the value of general parameters set RMSE measure for 10 u 口Mo口PA d used equally in all three optimization algorithms considered in this paper. TABLE II- VALUE OF GENERAL PARAMETERS SET EQUALLY IN THREE TIMIZATION ALGORITHMS (GA, PSO AND TWO mber of iterations without population/ Max number of plants in commander system based on 3 different optimization algorithms and with ormal PA recommender system(lower numbers indicate better results) moreover. it could een cases tna prioritized user-profile approaches (GA, PSo and Two) 4302 2007 IEEE Congress on Evolutionary Computation(CEC 2007)
SRVLWLRQV FDQ WDNH YDOXHV RXWVLGH WKHLU ORZHU DQG XSSHU OLPLWV 7KXV WR DYRLG WKLV SRVLWLRQ FRQVWUDLQW YLRODWLRQ WKDW VRPHWLPHV HYHQ UHVXOWV WR GLYHUJHQFH RI WKH DOJRULWKP ZH DGGHG DQ H[WUD FRQVWUDLQW RQ SRVLWLRQ RI SDUWLFOHV ZLWK FRQVLGHULQJ UHIOHFWLYH EHKDYLRU IRU SDUWLFOHV ,Q WKLV DSSURDFK SDUWLFOHV KDYH WR ERXQFH EDFN ZKHQ WKH\ UHDFK HDFK RI WKH VLGH FRQVWUDLQWV LQ WKH VHDUFK VSDFH DQG WKH GLUHFWLRQ RI YHORFLW\ LV UHYHUVHG DIWHU WKDW 2WKHU PRWLRQ HTXDWLRQV DUH WKH VDPH DV WKH FRQYHQWLRQDO 362 DOJRULWKP >@WKDWLVWKHVDPHYHUVLRQDGRSWHGLQ>@ 7DEOH , VKRZV WKH VHWWLQJV RI ,:2 PHWKRG XVHG LQ RXU H[SHULPHQWV 7KHVH SDUDPHWHUV DUH VHW DV WKH SDUDPHWHUV VXJJHVWHG LQ >@ RU DUH VHW H[SHULPHQWDOO\ EXW IXUWKHU UHVHDUFKZRUNVVKRXOGEHLQYHVWLJDWHGWRVWXG\WKHHIIHFWRI HDFK SDUDPHWHU LQ WHUPV RI FRVW EHQHILW DQDO\VLV 7KH SDUDPHWHUV RI362DQG*$DUH DOVR VHWDV VXJJHVWHGLQ >@ DQG>@UHVSHFWLYHO\XQOHVVPHQWLRQHGH[SOLFLWO\ 7$%/( , 9$/8(2),:2$/*25,7+03$5$0(7(56 1XPEHURILQLWLDOSODQWVLQ SRSXODWLRQ 0LQQXPEHURIVHHGV 0D[QXPEHURIVHHGV ,QLWLDOYDOXHRIVWDQGDUGGHYLDWLRQ )LQDOYDOXHRIVWDQGDUGGHYLDWLRQ 7KHQXPEHURISRSXODWLRQLQ*$WKHQXPEHURISDUWLFOHV LQ 362 DOJRULWKP DQG WKH PD[LPXP QXPEHU RI SODQW SRSXODWLRQ SPD[ UHODWLYHO\KDYHWKHVDPHPHDQLQJLQWKHVH WKUHH RSWLPL]DWLRQ DOJRULWKPV DQG ZHUH VHOHFWHG WR EH WKH VDPH $OWKRXJK WKH ,:2 DOJRULWKP KDV DQRWKHU SDUDPHWHU ZKLFKGHWHUPLQHVWKHLQLWLDOQXPEHURISODQWVLQSRSXODWLRQ DIWHU UHDFKLQJ WKH PD[LPXP QXPEHU RI SODQWV SPD[ WKH WRWDOQXPEHURISODQWVDUHNHSWIL[HGHTXDOWRWKLVYDOXHLQDOO RWKHULWHUDWLRQV 7KHWHUPLQDWLRQFULWHULDIRUDOOWKHVHDOJRULWKPVZDVVHWWR UHDFKLQJWKHPD[LPXPQXPEHURILWHUDWLRQVRUKDYLQJVRPH IL[HG QXPEHU RI LWHUDWLRQV ZLWKRXW DQ\ QRWLFHDEOH LPSURYHPHQWLQUHVXOWV7KHGHILQLWLRQRIWKHILWQHVVIXQFWLRQ LQDOOWKHVHDOJRULWKPVLVDOVRWKHVDPHDQGLVDVWKHVDPHDV WKHILWQHVVIXQFWLRQGHVFULEHGLQVHFWLRQ,9 7DEOH ,, VXPPDUL]HVWKH YDOXH RI JHQHUDO SDUDPHWHUV VHW DQG XVHG HTXDOO\ LQ DOO WKUHH RSWLPL]DWLRQ DOJRULWKPV FRQVLGHUHGLQWKLVSDSHU 7$%/( ,, 9$/8(2)*(1(5$/3$5$0(7(566(7(48$//@ LW LV DVVXPHGWKDWLQUHDOZRUOGVFHQDULRWKHV\VWHPFRXOGEHUXQ RIIOLQHDQGWKHEHVWVHWRIIHDWXUHZHLJKWVIRUHDFKXVHUFRXOG EH VDYHG DQG XVHG GXULQJ JHQHUDWLQJ UHFRPPHQGDWLRQV IRU KLPKHU )LJ VKRZV WKH UHVXOWV RI FRPSDULVRQ RI WKH ,:2 DOJRULWKPZLWKRWKHUDSSURDFKHV7KHILUVWWKUHHDSSURDFKHV *$ 362 DQG ,:2 DUH SULRULWL]HG XVHUSURILOH UHFRPPHQGHU V\VWHPVZKLFKWU\WR DVVLJQ GLIIHUHQW SULRULW\ OHYHOVWRHDFKRIWKHIHDWXUHVRIWKHSURILOHRIGLIIHUHQWXVHUV EDVHGRQWKHFRUUHVSRQGLQJRSWLPL]DWLRQDOJRULWKPVQDPHO\ JHQHWLFDOJRULWKPSDUWLFOHVZDUPRSWLPL]DWLRQDQGLQYDVLYH ZHHGRSWLPL]DWLRQ7KHIRXUWKDOJRULWKP3$LVWKHQRUPDO 3HDUVRQXVHUEDVHGDOJRULWKPVGHVFULEHGLQVHFWLRQ,,, $VLWLVGHULYHG IURPWKH ILJ IRUDOOXVHUVH[FHSWXVHUV QXPEHU DQG WKH TXDOLW\ RI WKH UHFRPPHQGDWLRQV ZLWK SULRULWL]HG XVHUSURILOH DSSURDFKHV DUH EHWWHU WKDQ 3$ DSSURDFK :LWK WKLV FRPSDULVRQ WKH HIIHFWLYHQHVV RI LQFRUSRUDWLQJ GHPRJUDSKLF DQG FRQWHQW LQIRUPDWLRQ LQ SURILOHRIXVHUVLQDGGLWLRQWRWKHRULJLQDOUDWLQJLQIRUPDWLRQ DQG DVVLJQLQJ GLIIHUHQW SULRULW\ OHYHOV IRU HDFK IHDWXUHV RI XVHUSURILOHFRXOGEHVHHQ 506(PHDVXUHIRUXVHUV 8VHUV 506( *$ 362 ,:2 3$ )LJ &RPSDULVRQ RI 506( PHDVXUH LQ SULRULWL]HG XVHUSURILOH UHFRPPHQGHUV\VWHPEDVHGRQGLIIHUHQWRSWLPL]DWLRQDOJRULWKPVDQGZLWK QRUPDO3$UHFRPPHQGHUV\VWHPORZHUQXPEHUVLQGLFDWHEHWWHUUHVXOWV 0RUHRYHU LW FRXOG EH VHHQ WKDW IRU DOO FDVHV WKDW SULRULWL]HG XVHUSURILOH DSSURDFKHV *$ 362 DQG ,:2 4302 2007 IEEE Congress on Evolutionary Computation (CEC 2007)
produced better results, the quality of the Iwo approach is very soon it stocked in local minimum. This inability may be higher than the other two approaches. Note that the Pso and due to ga structure that can only work with discrete values IwO algorithms work with continuous values for feature In addition, mutation and crossover operators may not have weights, but GA approach can only work with discrete sufficient power to produce new solutions in a manner to alues cover all the search space ig. 5 shows the average rmse measure for all 10 users considered in our experiments By considering the average results the difference between Iwo algorithm and other approaches is now clearer. The standard deviation of the Fitness value over time GA, PSO and lwo based approaches are 0.34, 0.38 and 0.3 1.15 respectively which shows that IwO approach in average has 1.1 the best results with the lowest variance average RMSE measure 273 1357911131517192123252729 1.15 1.1 Fig. 6. The evolution procedure of fitness function of three optimization PSO algorithm shows us another important point. Unlike other two algorithms (Iwo and ga)which have decreasing fitness function over time, there are some Fig.5. Comparison of average RMSE for 10 users in different approaches fluctuations in the fitness of the Pso algorithm. In fact, lower numbers indicate better results although we could fixed some of the abnormal behavior with adding the" reflective behavior" for particles when reaching One is to notice that our results have a little difference one of the side constrains, these fluctuations are still can be with the results reported in [2] comparing the GA and PSo seen. Indeed in PSO algorithm, the best global answer found produced better results than GA approach. This difference in combination of other, parameters (ike the best position nd PSO implementation described before. In addition. it particle)in each iteration to affect the motion direction of all should be noted that our experiments were little different particles. Hence, this global best fitness is not saved and than the experiments considered in [2]. Unlike that work may be loosed in future iterations. It is seen from fig6that which select similar users for each test users from the subset many times, the answer found in some iteration is moved of randomly or fixed set of users, as it is mentioned in towards another possibly worse answer generated by the section IV, we made use of active users for this purpose to motion of all particles in future iterations TWO algorithm does not have these inefficiencies. The Fig 6 shows the procedure of changes in fitness function best plants which have the best fitness never removed from of the three optimization algorithms over time The fitness the population and at the same time, newer plants are function is based on the average fitness value of these produced from these best plants with normally distributed spatial dispersal to search for better results and near the xperiments during 30 iterations As it is derived from this figure although the algorithm stabilized later than other approaches VIL. CONCLUSION iteration number 13), it achieved the lowest(the best) among these three algorithms This paper employed the Invasive Weed Optimization ooking at the ga performance, it is clear that in average, (wo)algorith it is stabilized faster than other algorithms(after iteration algorithm inspired colonizing weeds, in a real 7), but the answer found with this algorithm is not application namely recommender systems. This algorithm the best answer. Indeed, ga did not find the lower was used for finding the optimal priorities for each features positions which were found by Iwo algorithm and of profile of different users to tailor the system to the 2007 IEEE Congress on Evolutionary Computation(CEC 2007) 4303
SURGXFHG EHWWHU UHVXOWVWKH TXDOLW\ RIWKH ,:2DSSURDFKLV KLJKHUWKDQWKHRWKHUWZRDSSURDFKHV1RWHWKDWWKH362DQG ,:2 DOJRULWKPV ZRUN ZLWK FRQWLQXRXV YDOXHV IRU IHDWXUH ZHLJKWV EXW *$ DSSURDFK FDQ RQO\ ZRUN ZLWK GLVFUHWH YDOXHV )LJVKRZVWKHDYHUDJH506(PHDVXUHIRUDOOXVHUV FRQVLGHUHG LQ RXU H[SHULPHQWV %\ FRQVLGHULQJ WKH DYHUDJH UHVXOWV WKH GLIIHUHQFH EHWZHHQ ,:2 DOJRULWKP DQG RWKHU DSSURDFKHV LV QRZ FOHDUHU 7KH VWDQGDUG GHYLDWLRQ RI WKH *$362DQG,:2EDVHGDSSURDFKHVDUHDQG UHVSHFWLYHO\ZKLFKVKRZVWKDW,:2DSSURDFKLQDYHUDJHKDV WKHEHVWUHVXOWVZLWKWKHORZHVWYDULDQFH DYHUDJH506(PHDVXUH 506( *$ 362 ,:2 3$ )LJ&RPSDULVRQ RI DYHUDJH506( IRU XVHUVLQ GLIIHUHQW DSSURDFKHV ORZHUQXPEHUVLQGLFDWHEHWWHUUHVXOWV 2QH LV WR QRWLFH WKDW RXU UHVXOWV KDYH D OLWWOH GLIIHUHQFH ZLWKWKH UHVXOWV UHSRUWHGLQ >@FRPSDULQJWKH*$DQG362 DSSURDFKHV ,Q WKDW ZRUN WKH 362 DSSURDFK LQ DYHUDJH SURGXFHGEHWWHUUHVXOWVWKDQ*$DSSURDFK7KLVGLIIHUHQFHLQ UHVXOWV FRXOG EH GXH WR WKH FKDQJHV ZH DSSOLHG LQ RXU *$ DQG 362 LPSOHPHQWDWLRQ GHVFULEHG EHIRUH ,Q DGGLWLRQ LW VKRXOG EH QRWHG WKDW RXU H[SHULPHQWV ZHUH OLWWOH GLIIHUHQW WKDQ WKH H[SHULPHQWV FRQVLGHUHG LQ >@ 8QOLNH WKDW ZRUN ZKLFKVHOHFWVLPLODUXVHUVIRUHDFKWHVWXVHUVIURPWKHVXEVHW RI UDQGRPO\ RU IL[HG VHW RI XVHUV DV LW LV PHQWLRQHG LQ VHFWLRQ ,9ZHPDGHXVHRIDFWLYHXVHUV IRUWKLVSXUSRVHWR LPSURYHRXUUHVXOWV )LJVKRZVWKHSURFHGXUHRIFKDQJHVLQILWQHVVIXQFWLRQ RI WKH WKUHH RSWLPL]DWLRQ DOJRULWKPV RYHUWLPH 7KH ILWQHVV IXQFWLRQ LV EDVHG RQ WKH DYHUDJH ILWQHVV YDOXH RI WKHVH DOJRULWKPV IRU DOO XVHUV FRQVLGHUHG LQ SUHYLRXV H[SHULPHQWVGXULQJLWHUDWLRQV $V LW LV GHULYHG IURP WKLV ILJXUH DOWKRXJK WKH ,:2 DOJRULWKP VWDELOL]HG ODWHU WKDQ RWKHU DSSURDFKHV DIWHU LWHUDWLRQQXPEHULWDFKLHYHGWKHORZHVWWKHEHVWILWQHVV DPRQJWKHVHWKUHHDOJRULWKPV /RRNLQJDWWKH*$SHUIRUPDQFHLWLVFOHDUWKDWLQDYHUDJH LW LV VWDELOL]HG IDVWHU WKDQ RWKHU DOJRULWKPV DIWHU LWHUDWLRQ QXPEHU EXWWKHDQVZHU IRXQGZLWKWKLVDOJRULWKPLV QRW JOREDOO\WKHEHVWDQVZHU,QGHHG*$GLGQRWILQGWKHORZHU EHWWHU SRVLWLRQVZKLFKZHUH IRXQG E\ ,:2DOJRULWKPDQG YHU\VRRQLWVWRFNHGLQORFDOPLQLPXP7KLVLQDELOLW\PD\EH GXHWR*$VWUXFWXUHWKDWFDQRQO\ZRUNZLWKGLVFUHWHYDOXHV ,QDGGLWLRQPXWDWLRQDQGFURVVRYHURSHUDWRUVPD\QRWKDYH VXIILFLHQW SRZHU WR SURGXFH QHZ VROXWLRQV LQ D PDQQHU WR FRYHUDOOWKHVHDUFKVSDFH )LWQHVVYDOXHRYHUWLP H ,WHUDWLRQ )LWQHVV ,:2 362 *$ )LJ 7KH HYROXWLRQ SURFHGXUH RI ILWQHVV IXQFWLRQ RI WKUHH RSWLPL]DWLRQ DOJRULWKP 362DOJRULWKP VKRZVXVDQRWKHULPSRUWDQWSRLQW8QOLNH RWKHU WZR DOJRULWKPV ,:2 DQG *$ ZKLFK KDYH D GHFUHDVLQJ ILWQHVV IXQFWLRQ RYHU WLPH WKHUH DUH VRPH IOXFWXDWLRQV LQ WKH ILWQHVV RI WKH 362 DOJRULWKP ,Q IDFW DOWKRXJKZHFRXOGIL[HGVRPHRIWKHDEQRUPDOEHKDYLRUZLWK DGGLQJWKH³UHIOHFWLYHEHKDYLRU´IRUSDUWLFOHVZKHQUHDFKLQJ RQHRIWKHVLGHFRQVWUDLQVWKHVHIOXFWXDWLRQVDUHVWLOOFDQEH VHHQ,QGHHGLQ362DOJRULWKPWKHEHVWJOREDODQVZHUIRXQG E\ DOO SDUWLFOHV LV QRW XVHG GLUHFWO\ DQG LV XVHG LQ FRPELQDWLRQ RI RWKHU SDUDPHWHUV OLNH WKH EHVW SRVLWLRQ IRXQG E\ HDFK SDUWLFOH DQG WKH FXUUHQW GLUHFWLRQ RI WKDW SDUWLFOHLQHDFKLWHUDWLRQWRDIIHFWWKHPRWLRQGLUHFWLRQRIDOO SDUWLFOHV +HQFH WKLV JOREDO EHVW ILWQHVV LV QRW VDYHG DQG PD\EHORRVHGLQIXWXUHLWHUDWLRQV,WLVVHHQIURPILJWKDW PDQ\ WLPHV WKH DQVZHU IRXQG LQ VRPH LWHUDWLRQ LV PRYHG WRZDUGV DQRWKHU SRVVLEO\ ZRUVH DQVZHU JHQHUDWHG E\ WKH PRWLRQRIDOOSDUWLFOHVLQIXWXUHLWHUDWLRQV ,:2 DOJRULWKP GRHV QRW KDYH WKHVH LQHIILFLHQFLHV 7KH EHVWSODQWVZKLFKKDYHWKHEHVW ILWQHVVQHYHU UHPRYHG IURP WKH SRSXODWLRQ DQG DW WKH VDPH WLPH QHZHU SODQWV DUH SURGXFHG IURP WKHVH EHVW SODQWV ZLWK QRUPDOO\ GLVWULEXWHG VSDWLDO GLVSHUVDO WR VHDUFK IRU EHWWHU UHVXOWV DQG QHDU WKH SRWHQWLDOEHVWVROXWLRQV 9,, &21&/86,21 7KLV SDSHU HPSOR\HG WKH ,QYDVLYH :HHG 2SWLPL]DWLRQ ,:2 DOJRULWKP ZKLFK LV D QXPHULFDO RSWLPL]DWLRQ DOJRULWKP LQVSLUHG IURP FRORQL]LQJ ZHHGV LQ D UHDO DSSOLFDWLRQ QDPHO\ UHFRPPHQGHU V\VWHPV 7KLV DOJRULWKP ZDVXVHGIRUILQGLQJWKHRSWLPDOSULRULWLHVIRUHDFKIHDWXUHV RI SURILOH RI GLIIHUHQW XVHUV WR WDLORU WKH V\VWHP WR WKH 2007 IEEE Congress on Evolutionary Computation (CEC 2007) 4303
preferences of individual users and thereby generating more personalized and accurate recommendations to them Experimental results showed that assigning different priority vels to profile of different users(prioritized user-profile proach) outperformed the standard non-adaptive user- based Pearson algorithm in most cases and in all cases that ized user oac TWO algorithm produced more accurate results compared to other optimization algorithms, GA and Pso approaches which have been applied in previous works. The evolution of fitness function over time of these three optimization algorithms demonstrated that the iwo algorithm has the more chance to avoid local minimum points compared to GA and less fluctuations compared to Pso due to its continuous and normally distributed dispersal structure over search space which has a decreasing variance parameter centered on each parent Hence, Two algorithm is more powerful in finding ats in this special ACKNOWLEDGEMENTS We would like to thank the researchers at the University of Minnesota, especially GroupLens member for making the MovieLens dataset publicly available REFERENCES [] H. Sepehri, C. Lucas, "On the Effectiveness of Prioritized User- recommender Systems, proceedings of the IEEE third workshop of WPRSIUI in conjunction with 23rd ICDE, Istanbul, Turkey, April [2] s. Uiin and P. J. Bentley,"Particle Swarm Optimization Recommender System", In Proceedings of 2002 IEEE International olutionary Computation, pp. 124-131A. 3] S. Ujin and PJ. Bentely, "Learning User Preferences using volution". In Proceedings of the 4 Asia-P 4 Mehrabian, C. Lucas, and S. Mohagheghi,"A Novel Numerical ptimization Algorithm Inspired from Weed Colonization cological Informatics, rol. 1. no. 4, December 2006, pp. 355-30 [5] Breese, Heckermen, and Kadie, "Empirical Analysis of Predictive Algorithms for Collaborative Filtering", technical report, Microsoft [6] J. Herlocker, J. Konstan, A. Borchers, and J. Riedl, "An algorithmic amework for performing collaborative filtering", In Proceedings of CM SIGIR 门7M. Pazzani, amework for collaborative. content-based and [8] G. Lekakos, G. M. Giaglis, " Improving the prediction accuracy of commendation algorithms: Approaches anchored on human actors, journal of Interacting with Computers, 18(2006), pp. 410- L. G. Terveen, and J. ried Transactions on Information Systems, Vol. 22, No ry2004 Applications and Resources", In Proceedings of the 2001 Congress on Evolutionary Computation, voL. 1, pp. 81-8 4304 2007 IEEE Congress on Evolutionary Computation(CEC 2007)
SUHIHUHQFHVRILQGLYLGXDOXVHUVDQGWKHUHE\JHQHUDWLQJPRUH SHUVRQDOL]HG DQG DFFXUDWH UHFRPPHQGDWLRQV WR WKHP ([SHULPHQWDOUHVXOWVVKRZHGWKDWDVVLJQLQJGLIIHUHQWSULRULW\ OHYHOV WR SURILOH RI GLIIHUHQW XVHUV SULRULWL]HG XVHUSURILOH DSSURDFK RXWSHUIRUPHG WKH VWDQGDUG QRQDGDSWLYH XVHU EDVHG3HDUVRQDOJRULWKPLQPRVWFDVHVDQGLQDOOFDVHVWKDW SULRULWL]HG XVHUSURILOH DSSURDFK DFKLHYHG EHWWHU UHVXOWV ,:2DOJRULWKPSURGXFHGPRUHDFFXUDWHUHVXOWVFRPSDUHGWR RWKHU RSWLPL]DWLRQ DOJRULWKPV *$ DQG 362 DSSURDFKHV ZKLFK KDYH EHHQ DSSOLHGLQ SUHYLRXV ZRUNV 7KH HYROXWLRQ RI ILWQHVV IXQFWLRQ RYHU WLPH RI WKHVH WKUHH RSWLPL]DWLRQ DOJRULWKPV GHPRQVWUDWHG WKDW WKH ,:2 DOJRULWKP KDV WKH PRUH FKDQFH WR DYRLG ORFDO PLQLPXP SRLQWV FRPSDUHG WR *$ DQG OHVV IOXFWXDWLRQV FRPSDUHG WR 362 GXH WR LWV FRQWLQXRXVDQGQRUPDOO\GLVWULEXWHGGLVSHUVDOVWUXFWXUHRYHU VHDUFK VSDFH ZKLFK KDV D GHFUHDVLQJ YDULDQFH SDUDPHWHU FHQWHUHG RQ HDFK SDUHQW SODQW +HQFH ,:2 DOJRULWKP LV PRUH SRZHUIXOLQ ILQGLQJWKH RSWLPDO SRLQWVLQWKLV VSHFLDO DSSOLFDWLRQ $&.12:/('*(0(176 :HZRXOGOLNHWRWKDQNWKH UHVHDUFKHUVDWWKH8QLYHUVLW\RI 0LQQHVRWD HVSHFLDOO\ *URXS/HQV PHPEHU IRU PDNLQJ WKH 0RYLH/HQVGDWDVHWSXEOLFO\DYDLODEOH 5()(5(1&(6 >@ + 6HSHKUL & /XFDV ³2Q WKH (IIHFWLYHQHVV RI 3ULRULWL]HG 8VHU 3URILOH DQG 'HWHFWLQJ $FWLYH 8VHUV LQ &ROODERUDWLYH )LOWHULQJ 5HFRPPHQGHU6\VWHPV´SURFHHGLQJVRIWKH ,(((WKLUGZRUNVKRSRI :356,8, LQ FRQMXQFWLRQ ZLWK UG ,&'( ,VWDQEXO 7XUNH\ $SULO >@ 6 8MMLQ DQG 3 - %HQWOH\ ³3DUWLFOH 6ZDUP 2SWLPL]DWLRQ 5HFRPPHQGHU 6\VWHP´ ,Q 3URFHHGLQJV RI ,((( ,QWHUQDWLRQDO FRQIHUHQFHRQ(YROXWLRQDU\&RPSXWDWLRQSS$ >@ 6 8MMLQ DQG 3- %HQWHO\ ³/HDUQLQJ 8VHU 3UHIHUHQFHV XVLQJ (YROXWLRQ´ ,Q 3URFHHGLQJV RI WKH WK $VLD3DFLILF &RQIHUHQFH RQ 6LPXODWHG(YROXWLRQDQG/HDUQLQJ6LQJDSRUH >@ 5 0HKUDELDQ & /XFDV DQG 6 0RKDJKHJKL $ 1RYHO 1XPHULFDO 2SWLPL]DWLRQ $OJRULWKP ,QVSLUHG IURP :HHG &RORQL]DWLRQ (FRORJLFDO,QIRUPDWLFVYROQR'HFHPEHUSS >@ %UHHVH +HFNHUPHQ DQG .DGLH ³(PSLULFDO $QDO\VLV RI 3UHGLFWLYH $OJRULWKPV IRU &ROODERUDWLYH )LOWHULQJ´ WHFKQLFDO UHSRUW 0LFURVRIW 5HVHDUFK2FWREHU >@ -+HUORFNHU -.RQVWDQ$%RUFKHUVDQG -5LHGO³$QDOJRULWKPLF IUDPHZRUN IRU SHUIRUPLQJ FROODERUDWLYH ILOWHULQJ´ ,Q 3URFHHGLQJV RI $&06,*,5 >@ 0 3D]]DQL ³$ IUDPHZRUN IRU FROODERUDWLYH FRQWHQWEDVHG DQG GHPRJUDSKLF ILOWHULQJ´ $UWLILFLDO ,QWHOOLJHQFH 5HYLHZ ± SS ± >@ * /HNDNRV * 0 *LDJOLV ³,PSURYLQJ WKH SUHGLFWLRQ DFFXUDF\ RI UHFRPPHQGDWLRQ DOJRULWKPV $SSURDFKHV DQFKRUHG RQ KXPDQ IDFWRUV´MRXUQDO RI ,QWHUDFWLQJZLWK&RPSXWHUV SS ± >@ - / +HUORFNHU - $ .RQVWDQ / * 7HUYHHQ DQG - 5LHGO ³(YDOXDWLQJ FROODERUDWLYH ILOWHULQJ UHFRPPHQGHU V\VWHPV´ $&0 7UDQVDFWLRQV RQ ,QIRUPDWLRQ 6\VWHPV9RO 1R -DQXDU\ SS± >@ 5&(EHUKDUW<6KL³3DUWLFOH6ZDUP2SWLPL]DWLRQ'HYHORSPHQWV $SSOLFDWLRQVDQG5HVRXUFHV´,Q3URFHHGLQJVRIWKH&RQJUHVVRQ (YROXWLRQDU\&RPSXWDWLRQYROSS 4304 2007 IEEE Congress on Evolutionary Computation (CEC 2007)