D. Anand, K.K. Bharadwaj / Expert Systems with Applications 38 (2011) 5101–5109
$UMS = w_1\,OS + w_2\,USS + w_3\,LGR + w_4\,UIS_1 + w_5\,UIS_2 \quad (13)$

where $\sum_{i=1}^{5} w_i = 1$.

Genetic algorithms can be effectively employed to learn the weights of each of the five proposed α-estimates for every user in an offline learning process. Such a set of weights can then be used to compute the value of α during prediction for the active user. The next subsection discusses the genetic operators and the fitness function utilized to generate the set of weights.

3.2.1. Automatic weighting of sparsity measures using real-valued genetic algorithm

Genetic algorithms base their operation on the Darwinian principle of "survival of the fittest" and utilize artificial evolution to obtain enhanced solutions with each iteration. The GA process starts with a population of candidate solutions known as chromosomes or genotypes. Each chromosome in the population has an associated fitness, and these scores are used in a competition to determine which chromosomes are used to form new ones (De Jong, 1988). As the population evolves from one generation of chromosomes to a new population by using a kind of natural selection together with the genetics-inspired operations of crossover, mutation and inversion (Mitchell, 1998), the solutions tend to approach the global optimum regardless of the topography of the fitness landscape.
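As a concrete illustration of Eq. (13), the weighted combination of the five sparsity measures can be sketched as follows; the function and variable names are illustrative choices, not from the paper.

```python
def unified_sparsity_measure(weights, measures):
    """Combine the five sparsity measures (OS, USS, LGR, UIS1, UIS2)
    into a single score via Eq. (13): UMS = sum_i w_i * m_i.
    The weights are normalized so that they sum to 1, as the
    constraint on w_1..w_5 requires."""
    total = sum(weights)
    w = [wi / total for wi in weights]          # enforce sum(w) = 1
    return sum(wi * mi for wi, mi in zip(w, measures))
```

In the full system, the GA described below searches over such weight vectors, one per user, rather than fixing them by hand.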
The traditional chromosome representation technique of using bit strings suffers from slow convergence due to the large chromosome structure, especially in optimization problems involving several parameters. Practical experience favors the most natural representation of solutions rather than enforcing a binary vector representation for many problems (Bäck, Fogel, & Michalewicz, 1997). Since the weights for each of the sparsity measures are real numbers, a floating-point representation of chromosomes is best suited for our purpose. Thus each chromosome consists of a vector of five real numbers representing the weights corresponding to the five different sparsity measures. The population then evolves over several generations to arrive at the weight vector which is the fittest.

3.2.2. Genetic operators

Genetic operators drive the search process in GA. Crossover and mutation are the two basic genetic operators. While crossover allows the creation of two new individuals by letting two parent chromosomes exchange meaningful information, mutation is used to maintain the genetic diversity of the population by introducing a completely new member into the population. Crossover and mutation operators for real-valued GA were introduced by Michalewicz (1992). Let a chromosome be represented by n real numbers (genes), where the ith gene lies in the range $[a_i, b_i]$. Given two parent chromosomes $X = \langle x_1, x_2, \ldots, x_n \rangle$ and $Y = \langle y_1, y_2, \ldots, y_n \rangle$, some of the different types of crossover and mutation for real-valued GA are discussed below (Houck, Joines, & Kay, 1995; Michalewicz, 1992).

Uniform mutation. A gene i is randomly selected and its value is modified to a uniform random number between $a_i$ and $b_i$:

$x'_j = \begin{cases} U(a_i, b_i), & \text{if } i = j \\ x_j, & \text{otherwise} \end{cases} \quad (14)$

Arithmetic crossover. Arithmetic crossover generates two new complementary offspring as a linear combination of the parent chromosomes:

$X' = sX + (1-s)Y, \qquad Y' = (1-s)X + sY \quad (15)$

where $s = U(0,1)$ and $X'$ and $Y'$ are the generated offspring.
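The two operators above can be sketched directly from Eqs. (14) and (15); this is a minimal illustration, assuming chromosomes are plain Python lists of five weights.

```python
import random

def uniform_mutation(x, bounds):
    """Uniform mutation (Eq. 14): pick one gene at random and replace
    it with a uniform random value drawn from that gene's range
    [a_i, b_i]; all other genes are copied unchanged."""
    i = random.randrange(len(x))
    child = list(x)
    a_i, b_i = bounds[i]
    child[i] = random.uniform(a_i, b_i)
    return child

def arithmetic_crossover(x, y):
    """Arithmetic crossover (Eq. 15): produce two complementary
    offspring as linear combinations of the parents, with the mixing
    coefficient s drawn from U(0, 1)."""
    s = random.random()
    child1 = [s * xi + (1 - s) * yi for xi, yi in zip(x, y)]
    child2 = [(1 - s) * xi + s * yi for xi, yi in zip(x, y)]
    return child1, child2
```

Note that the two offspring are complementary: gene by gene, their values sum to the sum of the corresponding parent genes, so the crossover stays inside the convex hull of the parents.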
Uniform mutation and arithmetic crossover are used as the basic mutation and crossover operators, respectively, in our experiments.

3.2.3. Fitness function

The fitness function is the factor which drives the GA process towards convergence to the optimal solution. Chromosomes which are more optimal are allowed to breed and mix their datasets by any of several techniques, producing a new generation that will be even better. Choosing a fitness function for finding the set of weights for the sparsity measures requires that each weight vector be evaluated via the error it produces while predicting each rating in the user's training set. An individual whose error is lower is a fitter individual. The weights are learnt for the active user by trying to predict each rating in the user's training set using the particular weight vector to combine the five sparsity measures, and determining how well the unified sparsity measure balances local and global predictions. The fitness function is the average error among all such predictions for the active user:

$\text{fitness} = \frac{1}{|T|} \sum_{i \in T} |r_{a,i} - pr_{a,i}| \quad (16)$

where T is the set of all ratings in the training set of the user, $r_{a,i}$ is the actual rating for item i by the active user, and $pr_{a,i}$ is the predicted score computed using the combined framework. The weights to be applied to each of the sparsity measures $(w_1, w_2, \ldots, w_5)$ to arrive at a unified sparsity measure (Eq. (13)) can be computed by employing the real-valued GA described in this section.

4. Proposed recommender system framework

The proposed estimates of α can be utilized to integrate predictions from locally and globally similar users and arrive at the final predicted vote for the active user. The main steps of the proposed recommender system framework are given below:

Step 1: Compute local user similarities.
Compute the SVSS similarity between all pairs of users, using formula (6).
Let simL(x, y) refer to the local similarity between users x and y as computed in this step.

Step 2: Compute the global user similarities.
Compute the global similarities between all pairs of users by first constructing the user graph based on local similarities, and then finding the maximin distance between users, as described in Section 2. Let simG(x, y) refer to the global similarity between users x and y as computed in this step.

Step 3: Obtain predicted ratings using local and global neighbors.
The predicted rating for an item k for active user i is based on Resnick's prediction formula (Resnick et al., 1994):

$pr_{i,k} = \bar{r}_i + \frac{\sum_{j \in N(i)} sim_{i,j} \cdot (r_{j,k} - \bar{r}_j)}{\sum_{j \in N(i)} |sim_{i,j}|} \quad (17)$

where $\bar{r}_i$ is the mean rating for user i, $sim_{i,j}$ is the similarity between users i and j, and N(i) is the neighborhood of user i. The above formula can be used to arrive at predictions from the local neighborhood by setting $sim_{i,j}$ to simL(i, j) and N(i) to the local neighborhood of user i. A similar method can be adopted to arrive at predictions from the global neighborhood. Let $pr^L_{i,k}$ and $pr^G_{i,k}$ be the predicted ratings using local and global similarities, respectively.

Step 4: Merge the local and global ratings.
The predicted ratings from local and global neighborhoods are combined using the following formula:
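The prediction rule of Step 3 (Eq. (17)) and the GA fitness of Eq. (16) can be sketched together as follows. This is a minimal illustration with hypothetical data structures (dicts keyed by user and item), not the paper's implementation; the same `predict` function serves for both local and global neighborhoods by passing the corresponding similarity table.

```python
def resnick_predict(user, item, mean, ratings, sim, neighbors):
    """Eq. (17): mean rating of `user` plus the similarity-weighted,
    mean-centered deviations of neighbors who rated `item`.

    mean[u]      -- mean rating of user u
    ratings[u]   -- dict mapping item -> rating for user u
    sim[(u, v)]  -- similarity between users u and v (simL or simG)
    neighbors[u] -- neighborhood N(u)
    """
    num = den = 0.0
    for j in neighbors[user]:
        if item in ratings[j]:
            num += sim[(user, j)] * (ratings[j][item] - mean[j])
            den += abs(sim[(user, j)])
    # Fall back to the user's mean if no neighbor rated the item.
    return mean[user] if den == 0 else mean[user] + num / den

def fitness(actual, predicted):
    """Eq. (16): mean absolute error over the active user's training
    ratings; a lower error means a fitter weight vector."""
    errors = [abs(actual[i] - predicted[i]) for i in actual]
    return sum(errors) / len(errors)
```

In the GA loop, each candidate weight vector would be used to merge the local and global predictions before computing this fitness, so that the search directly minimizes the active user's training error.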