main tasks: profile selection, profile matching and best profile collection.

2.2.1. Profile Selection

In an ideal world, the entire database of profiles would be used to select the best possible profiles. However, this is not always feasible, especially when the dataset is very large or resources are limited. As a result, most systems opt for random sampling, and this process is the responsibility of the profile selection part of the algorithm.

2.2.2. Profile Matching

After profile selection, the profile matching process computes the distance or similarity between the selected profiles and the active user's profile using a distance function. From the analysis of Breese et al. [3], it seems that most current recommender systems use standard algorithms that consider only "voting information" as the feature on which the comparison between two profiles is made. However, in real life, two people are not judged similar solely on whether they have complementary opinions on a specific subject, e.g., movie ratings, but also on other factors, such as their background and personal details. Applied to the profile matcher, this means that demographic and lifestyle information, including the user's age, gender and preferences of movie genres, must also be taken into account. Every user places a different importance or priority on each feature. Our approach shows how weights defining a user's priorities can be evolved by a genetic algorithm.

A potential solution to the problem of evolving feature weights, w(A), for the active user A is represented as a set of weights, as shown below in Figure 2.

w1 w2 w3 … w22

Figure 2: Phenotype of an individual in the population.

Here wf is the weight associated with feature f, whose genotype is a string of binary values. Each individual contains 22 genes, which are evolved by an elitist genetic algorithm (described in section 2.3).
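The genotype-to-phenotype step for the 22 feature weights can be illustrated with a minimal sketch. This section does not state the gene length or how the decoded values are scaled, so `BITS_PER_GENE = 8` and the normalisation to a sum of 1 are assumptions made only for illustration, and `decode_genotype` is a hypothetical helper name.

```python
from typing import List

NUM_FEATURES = 22   # from the text: each individual contains 22 genes
BITS_PER_GENE = 8   # ASSUMPTION: gene length is not given in this section


def decode_genotype(bits: str) -> List[float]:
    """Map a binary genotype string to 22 feature weights (the phenotype)."""
    expected = NUM_FEATURES * BITS_PER_GENE
    if len(bits) != expected or set(bits) - {"0", "1"}:
        raise ValueError(f"genotype must be {expected} characters of 0/1")
    # Read each fixed-width gene as an unsigned integer.
    raw = [int(bits[k:k + BITS_PER_GENE], 2)
           for k in range(0, expected, BITS_PER_GENE)]
    total = sum(raw)
    if total == 0:
        # All genes are zero: every feature would be ignored.
        return [0.0] * NUM_FEATURES
    # ASSUMPTION: weights are scaled so that they sum to 1.
    return [value / total for value in raw]
```

For example, `decode_genotype("00000001" * 22)` gives 22 equal weights, while a genotype whose only non-zero gene is the first one puts all the weight on feature 1 and leaves every other feature at zero.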
The comparison between two profiles can now be conducted using a modified Euclidean distance function, which takes into account multiple features. euclidean(A,j) is the similarity between active user A and user j:

$$\mathrm{euclidean}(A,j) = \sqrt{\sum_{i=1}^{z} \sum_{f=1}^{22} w_f \cdot \mathrm{diff}_{i,f}^{2}(A,j)}$$

where:
A is the active user;
j is a user provided by the profile selection process, where j ≠ A;
z is the number of common movies that users A and j have rated;
wf is the active user's weight for feature f;
i is a common movie item, where profile(A,i) and profile(j,i) exist;
diffi,f(A,j) is the difference in profile value for feature f between users A and j on movie item i.

Note that before this calculation is made, the profile values are normalised to ensure they lie between 0 and 1. When the weight for any feature is zero, that feature is ignored. This way we enable feature selection to be adaptive to each user's preferences. The difference in profile values for occupation is either 0, if the two users have the same occupation, or 1 otherwise.

2.2.3. Best Profile Collection

Once the Euclidean distances, euclidean(A,j), have been found between profile(A) and profile(j) for all values of j picked by the profile selection process, the "best profile collection" algorithm is called. This ranks every profile(j) according to its similarity to profile(A). The system then simply selects the users whose Euclidean distance is above a certain threshold value (considered most similar to the active user) as the neighbourhood of A. This value is a system constant that can be changed. To make a recommendation, given an active user A and a neighbourhood set of similar profiles to A, it is necessary to find movie items seen (and liked) by the users in the neighbourhood set that the active user has not seen. These are then presented to the active user through a user interface.

2.3. Genetic Algorithm

An elitist genetic algorithm was chosen for this task, where a quarter of the best individuals in the population are kept for the next generation. When creating a new generation, individuals are selected randomly out of the top 40% of the whole population to be parents. Two offspring are produced from every pair of parents, using single-point

Figure 3: Calculating the similarity between A and j. (Diagram elements: DB, profile selection, profile(A,i), profile(j,i), genetic algorithm, genotype to phenotype mapping, weights(A), euclidean(A,j) = similarity(A,j).)
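The weighted distance of section 2.2.2 and the threshold-based neighbourhood of section 2.2.3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the profile layout (a dict from movie id to a list of feature values), the function names and the `categorical` and `keep_above` parameters are all hypothetical, and the square root is the one conventional for a Euclidean distance. The text selects users whose value is above the threshold; because a smaller distance usually indicates closer profiles, the direction is left as a parameter here.

```python
import math
from typing import Dict, FrozenSet, List, Sequence

# ASSUMED layout: movie id -> feature values, numeric ones already in [0, 1].
Profile = Dict[str, Sequence[object]]


def feature_diff(a: object, b: object, is_categorical: bool) -> float:
    """Difference for one feature; categorical ones (e.g. occupation) are 0 or 1."""
    if is_categorical:
        return 0.0 if a == b else 1.0
    return float(a) - float(b)  # numeric features, assumed normalised


def weighted_distance(weights: Sequence[float], profile_a: Profile,
                      profile_j: Profile,
                      categorical: FrozenSet[int] = frozenset()) -> float:
    """Weighted Euclidean distance over the movies both users have rated."""
    total = 0.0
    for movie in profile_a.keys() & profile_j.keys():  # common movie items i
        pairs = zip(profile_a[movie], profile_j[movie])
        for f, (a, b) in enumerate(pairs):
            if weights[f] == 0:
                continue  # a zero weight means the feature is ignored
            total += weights[f] * feature_diff(a, b, f in categorical) ** 2
    return math.sqrt(total)


def select_neighbourhood(distances: Dict[str, float], threshold: float,
                         keep_above: bool = True) -> List[str]:
    """Rank users by their value and keep those on one side of the threshold."""
    ranked = sorted(distances, key=lambda u: distances[u], reverse=keep_above)
    if keep_above:
        return [u for u in ranked if distances[u] > threshold]
    return [u for u in ranked if distances[u] < threshold]
```

With weights `[0.5, 0.5]` and a single common movie whose feature values are `[1.0, 0.2]` for A and `[0.0, 0.2]` for j, only the first feature differs, so the distance is the square root of 0.5. Setting the first weight to zero removes that feature from the comparison, and the distance drops to 0.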