Journal of Mathematical Modelling and Algorithms (2005)4: 181-198 Springer 2005 DOI:10.1007/s10852-0045336-7 A Recommender System based on Idiotypic Artificial immune Networks STEVE CAYZER and UWE AICKELIN,* THewlett-Packard Laboratories, Filton Road, Bristol BS12 60Z, UK. e-mail: steve cayzer@ hp. cor School of Computer Science, University of Nottingham, NG& IBB, UK. e- mail: ira@cs. nott. ac uk (Received: 10 February 2003: in final form: 3 March 2004) Abstract. The immune system is a complex biological system with a highly distributed, adaptive d self-organising nature. This paper presents an Artificial Immune System(AlS)that exploits some of these characteristics and is applied to the task of film recommendation by Collaborative Filtering (CR). Natural evolution and in particular the immune system have not been designed for classical optimisation. However, for this problem, we are not interested in finding a single optimum. Rather we intend to identify a sub-set of good matches on which recommendations can be based. It is our hypothesis that an Als built on two central aspects of the biological immune system will be an ideal candidate to achieve this: Antigen-antibody interaction for matching and idiotypic antibody antibody interaction for diversity. Computational results are presented in support of this conjecture and compared to those found by other CF techniques Mathematics Subject Classifications(2000): 68RXx, 68TXX, 90Bxx Key words: artificial immune systems, idiotypic networks. 1. Introductio Over the last few years, a novel computational intelligence technique, inspired by biology, has emerged: AIs. This section introduces AIS and shows how it can be used for solving computational problems. In essence, the immune system used here as inspiration to create an unsupervised machine-learning algorithm. The immune system metaphor will be explored, involving a brief overview of the basic immunological theories that are relevant to our work. We also introduce the basic concepts of CF. 1. 1. OVERVIEW OF THE IMMUNE SYSTEM a detailed overview of the immune system can be found in many textbooks, for instance [19]. Briefly, the purpose of the immune system is to protect the body against infection and includes a set of mechanisms collectively termed humoral
Journal of Mathematical Modelling and Algorithms (2005) 4: 181–198 © Springer 2005 DOI: 10.1007/s10852-004-5336-7 A Recommender System based on Idiotypic Artificial Immune Networks STEVE CAYZER1 and UWE AICKELIN2, 1Hewlett-Packard Laboratories, Filton Road, Bristol BS12 6QZ, UK. e-mail: steve.cayzer@hp.com 2School of Computer Science, University of Nottingham, NG8 1BB, UK. e-mail: uxa@cs.nott.ac.uk (Received: 10 February 2003; in final form: 3 March 2004) Abstract. The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an Artificial Immune System (AIS) that exploits some of these characteristics and is applied to the task of film recommendation by Collaborative Filtering (CF). Natural evolution and in particular the immune system have not been designed for classical optimisation. However, for this problem, we are not interested in finding a single optimum. Rather we intend to identify a sub-set of good matches on which recommendations can be based. It is our hypothesis that an AIS built on two central aspects of the biological immune system will be an ideal candidate to achieve this: Antigen–antibody interaction for matching and idiotypic antibody– antibody interaction for diversity. Computational results are presented in support of this conjecture and compared to those found by other CF techniques. Mathematics Subject Classifications (2000): 68Rxx, 68Txx, 90Bxx. Key words: artificial immune systems, idiotypic networks. 1. Introduction Over the last few years, a novel computational intelligence technique, inspired by biology, has emerged: AIS. This section introduces AIS and shows how it can be used for solving computational problems. In essence, the immune system is used here as inspiration to create an unsupervised machine-learning algorithm. The immune system metaphor will be explored, involving a brief overview of the basic immunological theories that are relevant to our work. We also introduce the basic concepts of CF. 1.1. OVERVIEW OF THE IMMUNE SYSTEM A detailed overview of the immune system can be found in many textbooks, for instance [19]. Briefly, the purpose of the immune system is to protect the body against infection and includes a set of mechanisms collectively termed humoral Corresponding author
182 immunity. This refers to a population of circulating white blood cells called B lymphocytes, and the antibodies they create The features that are particularly relevant to our research are matching, diversity and distributed control. Matching refers to the binding between antibodies and ntigens. Diversity refers to the fact that, in order to achieve optimal antigen space coverage, antibody diversity must be encouraged [15]. Distributed control means that there is no central controller, rather, the immune system is governed by local interactions between cells and antibodies The idiotypic effect builds on the premise that antibodies can match other anti- bodies as well as antigens. It was first proposed by Jerne [17] and formalised into a model by Farmer et al. [ll]. The theory is currently debated by immunologists, with no clear consensus yet on its effects in the humoral immune system [14] The idiotypic network hypothesis builds on the recognition that antibodies can match other antibodies as well as antigens. Hence, an antibody may be matched by other antibodies, which in turn may be matched by yet other antibodies. This activation can continue to spread through the population and potentially has much explanatory power. It could, for example, help explain how the memory of past infections is maintained. Furthermore, it could result in the suppression of similar antibodies thus encouraging diversity in the antibody pool. The idiotypic network has been formalised by a number of theoretical immunologists [21] There are many more features of the immune system, including adaptation immunological memory and protection against auto-immune attack. Since these are not directly relevant to this work, they will not be reviewed here 1.2.OERⅤ IEW OF CE In this paper, we are using an AIS as a CF technique extending earlier work ( 5-7D CF is the term for a broad range of algorithms that use similarity measures to obtain recommendations. The best-known example is probably the "people whe bought this also bought" feature of the internet company Amazon [2]. However, any problem domain where users are required to rate items is amenable to CF tech- niques. Commercial applications are usually called recommender systems [22] A canonical example is movie recommendation In traditional ce. the items to be recommended are treated as 'black boxes That is, your recommendations are based purely on the votes of your neighbours, and not on the content of the item. The preferences of a user, usually a set of votes on an item, comprise a user profile, and these profiles are compared in order to build a neighbourhood. The key decisions to be made are Data encoding: Perhaps the most obvious representation for a user profile is a string of numbers, where the length is the number of items, and the position is the item identifier. Each number represents thevote' for an item. Votes are sometimes binary(e.g, Did you visit this web page? ) but can also be integers in a range(say [0, 5])
182 STEVE CAYZER AND UWE AICKELIN immunity. This refers to a population of circulating white blood cells called Blymphocytes, and the antibodies they create. The features that are particularly relevant to our research are matching, diversity and distributed control. Matching refers to the binding between antibodies and antigens. Diversity refers to the fact that, in order to achieve optimal antigen space coverage, antibody diversity must be encouraged [15]. Distributed control means that there is no central controller, rather, the immune system is governed by local interactions between cells and antibodies. The idiotypic effect builds on the premise that antibodies can match other antibodies as well as antigens. It was first proposed by Jerne [17] and formalised into a model by Farmer et al. [11]. The theory is currently debated by immunologists, with no clear consensus yet on its effects in the humoral immune system [14]. The idiotypic network hypothesis builds on the recognition that antibodies can match other antibodies as well as antigens. Hence, an antibody may be matched by other antibodies, which in turn may be matched by yet other antibodies. This activation can continue to spread through the population and potentially has much explanatory power. It could, for example, help explain how the memory of past infections is maintained. Furthermore, it could result in the suppression of similar antibodies thus encouraging diversity in the antibody pool. The idiotypic network has been formalised by a number of theoretical immunologists [21]. There are many more features of the immune system, including adaptation, immunological memory and protection against auto-immune attack. Since these are not directly relevant to this work, they will not be reviewed here. 1.2. OVERVIEW OF CF In this paper, we are using an AIS as a CF technique extending earlier work ([5–7]). CF is the term for a broad range of algorithms that use similarity measures to obtain recommendations. The best-known example is probably the “people who bought this also bought” feature of the internet company Amazon [2]. However, any problem domain where users are required to rate items is amenable to CF techniques. Commercial applications are usually called recommender systems [22]. A canonical example is movie recommendation. In traditional CF, the items to be recommended are treated as ‘black boxes’. That is, your recommendations are based purely on the votes of your neighbours, and not on the content of the item. The preferences of a user, usually a set of votes on an item, comprise a user profile, and these profiles are compared in order to build a neighbourhood. The key decisions to be made are: • Data encoding: Perhaps the most obvious representation for a user profile is a string of numbers, where the length is the number of items, and the position is the item identifier. Each number represents the ‘vote’ for an item. Votes are sometimes binary (e.g., Did you visit this web page?), but can also be integers in a range (say [0, 5])
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 183 Similarity Measure: The most common method to compare two users is a correlation-based measure like Pearson or Spearman, which gives two neigh bours a matching score between-l and 1. Vector based, e. g, cosine of the angle between vectors, and probabilistic methods are alternative approaches The canonical example is the k-Nearest-Neighbour algorithm, which uses a match ing method to select k reviewers with high similarity measures. The votes from these reviewers, suitably weighted, are used to make predictions and recommenda Many improvements on this method are possible [13]. For example, the user profiles are usually extremely sparse because many items are not rated. This means that similarity measurements are both inefficient(the so-called 'curse of dimen ionality)and difficult to calculate due to the small overlap. Default votes are sometimes used for items a user has not explicitly voted on, and these can increase e overlap size [4]. Dimensionality reduction methods, such as Single Value De composition, both improve efficiency and increase overlap [3]. Other pre-proces- sing methods are often used, e.g., clustering [1]. Content-based information can be used to enhance the pure CF approach [13, 9]. Finally, the weighting of each neighbour can be adjusted by training, and there are many learning algorithm available for this [10]. All these improvements could in principle be applied to our Als but in the interests of a clear and uncluttered comparison we have kept the Ce algorithm as simple as possible The evaluation of a CF algorithm usually centres on its accuracy. There is a difference between prediction(given a movie, predict a given user's rating of that a more natural fit to the movie domain. We present results evaluating both these 1. 3. USING AN AIS FOR CF To us, the attraction of the immune system is that if an adaptive pool of antibodies can produce 'intelligent behaviour, can we harness the power of this computation to tackle the problem of preference matching and recommendation? Thus, in the first instance we intend to build a model where known user preferences are our pool of antibodies and the new preferences to be matched is the antigen in question Our conjecture is that if the concentrations of those antibodies that provide a better match are allowed to increase over time, we should end up with a subset of good matches. However, we are not interested in optimising, 1. e in finding the one best match. Instead, we require a set of antibodies that are a close match but which are at the same time distinct from each other for successful recommendation Thi is where we propose to harness the idiotypic effects of binding antibodies to similar antibodies to encourage diversity
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 183 • Similarity Measure: The most common method to compare two users is a correlation-based measure like Pearson or Spearman, which gives two neighbours a matching score between −1 and 1. Vector based, e.g., cosine of the angle between vectors, and probabilistic methods are alternative approaches. The canonical example is the k-Nearest-Neighbour algorithm, which uses a matching method to select k reviewers with high similarity measures. The votes from these reviewers, suitably weighted, are used to make predictions and recommendations. Many improvements on this method are possible [13]. For example, the user profiles are usually extremely sparse because many items are not rated. This means that similarity measurements are both inefficient (the so-called ‘curse of dimensionality’) and difficult to calculate due to the small overlap. Default votes are sometimes used for items a user has not explicitly voted on, and these can increase the overlap size [4]. Dimensionality reduction methods, such as Single Value Decomposition, both improve efficiency and increase overlap [3]. Other pre-processing methods are often used, e.g., clustering [1]. Content-based information can be used to enhance the pure CF approach [13, 9]. Finally, the weighting of each neighbour can be adjusted by training, and there are many learning algorithms available for this [10]. All these improvements could in principle be applied to our AIS but in the interests of a clear and uncluttered comparison we have kept the CF algorithm as simple as possible. The evaluation of a CF algorithm usually centres on its accuracy. There is a difference between prediction (given a movie, predict a given user’s rating of that movie) and recommendation (given a user, suggest movies that are likely to attract a high rating). Prediction is easier to assess quantitatively but recommendation is a more natural fit to the movie domain. We present results evaluating both these behaviours. 1.3. USING AN AIS FOR CF To us, the attraction of the immune system is that if an adaptive pool of antibodies can produce ‘intelligent’ behaviour, can we harness the power of this computation to tackle the problem of preference matching and recommendation? Thus, in the first instance we intend to build a model where known user preferences are our pool of antibodies and the new preferences to be matched is the antigen in question. Our conjecture is that if the concentrations of those antibodies that provide a better match are allowed to increase over time, we should end up with a subset of good matches. However, we are not interested in optimising, i.e. in finding the one best match. Instead, we require a set of antibodies that are a close match but which are at the same time distinct from each other for successful recommendation. This is where we propose to harness the idiotypic effects of binding antibodies to similar antibodies to encourage diversity
184 STEVE CAYZER AND UWE AICKELIN The next section presents more details of our problem and explains the ais nodel we intend to use. We then describe the experimental set-up, present and review results and discuss some possibilities for future work 2.1. APPLICATION OF THE AIS TO THE EACHMOVIE TASKS The eachmovie database [8] is a public database, which records explicit votes of users for movies. It holds 2.811.983 votes taken from 72.916 users on 1. 628 films The task is to use this data to make predictions and recommendations. In the former ide an estima In the latter we present a ranked list of movies that the user might like The basic approach of CF, is to use information from a neighbourhood to make useful predictions and recommendations. The central task we set ourselves is to identify a suitable neighbourhood. The S WAMI (Shared wisdom through the amal amation of Many Interpretations) framework [12]is a publicly accessible software for CF experiments. Its central algorithm is as follows Select a set of test users randomly from the database FOR each test user t Reserve a vote of this user, i. e Hide from predictor From remaining votes create a new training user t Select neighbourhood of k reviewers based on t Use neighbourhood to predict vote Compare this with actual vote and collect statistics NEXT t The code shown in bold indicates a place where S WAMI allows an implementation dependent choice of algorithm We use an Als to perform selection and prediction as below 2.2. ALGORITHM CHOICES e use the s wami data encoding: User idl, score, id (idn, scorenll, where id corresponds to the unique identifier of the movie being rated and score to this user's score for that movie. This captures the essential features of the data available Eachmovie vote data links a person with a movie signs a score(taken from the set{0.,0.2,0.4,0.6,0.8,1 User demographic information(e. g, Age and gender)is provided but this is not used in our encoding ontent information about movies(e. g, Category) is similarly not used
184 STEVE CAYZER AND UWE AICKELIN The next section presents more details of our problem and explains the AIS model we intend to use. We then describe the experimental set-up, present and review results and discuss some possibilities for future work. 2. Algorithms 2.1. APPLICATION OF THE AIS TO THE EACHMOVIE TASKS The eachmovie database [8] is a public database, which records explicit votes of users for movies. It holds 2,811,983 votes taken from 72,916 users on 1,628 films. The task is to use this data to make predictions and recommendations. In the former case, we provide an estimated vote for a previously unseen movie. In the latter case, we present a ranked list of movies that the user might like. The basic approach of CF, is to use information from a neighbourhood to make useful predictions and recommendations. The central task we set ourselves is to identify a suitable neighbourhood. The SWAMI (Shared Wisdom through the Amalgamation of Many Interpretations) framework [12] is a publicly accessible software for CF experiments. Its central algorithm is as follows: Select a set of test users randomly from the database FOR each test user t Reserve a vote of this user, i.e. Hide from predictor From remaining votes create a new training user t Select neighbourhood of k reviewers based on t Use neighbourhood to predict vote Compare this with actual vote and collect statistics NEXT t The code shown in bold indicates a place where SWAMI allows an implementationdependent choice of algorithm. We use an AIS to perform selection and prediction as below. 2.2. ALGORITHM CHOICES We use the SWAMI data encoding: User = {{id1,score1},{id2,score2},..., {idn,scoren}}, where id corresponds to the unique identifier of the movie being rated and score to this user’s score for that movie. This captures the essential features of the data available. Eachmovie vote data links a person with a movie and assigns a score (taken from the set {0, 0.2, 0.4, 0.6, 0.8, 1.0} where 0 is the worst). User demographic information (e.g., Age and gender) is provided but this is not used in our encoding. Content information about movies (e.g., Category) is similarly not used
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 185 23. SIMILARITY MEASURE The Pearson measure is used to compare two users u and u )(u-b) 1(u1-i)2∑1(U2-5) where u and v are users, n is the number of overlapping votes (i.e. Movies for which both u and u have voted), ui is the vote of user u for movie i and is the average vote of user u over all films(not just the overlapping votes). The measure is amended as follows if n=0. r= NoOverlapDefault if∑(u1-l)2∑(u-b)2=0.r= Zero VarianceDefault (2) ifn <P, r=p(where P =overlap penalty) The two default values are required because it is impossible to calculate a Pearson measure in such cases. Both were set to 0. Some experimentation showed that an overlap penalty P was beneficial(this lowers the absolute correlation for users with only a small overlap) but that the exact value was not critical. We chose a value of 100 because this is the maximum overlap expected 2. 4. NEIGHBOURHOOD SELECTION For a Simple Pearson(SP) predictor, neighbourhood selection means choosing the best k(absolute)correlation scores, where k is the neighbourhood size. Not every potential neighbour will have rated the film to be predicted. Reviewers who did not vote on the film are not added to the neighbourhood We have chosen the sp as a benchmark for our ais recommender because it is the de facto standard for recom- mender algorithms and also the usual starting point for more complex neighbour hood selection schemes. Furthermore, the AIs recommender is both sufficiently different and of similar complexity to warrant a fair comparison For the aIs predictor, a more involved procedure is required Initialise AIs Encode user for whom to make predictions as antigen A WHILE (AIS not stabilised )&(Reviewers available)dO Add next user as an antibody ab Calculate matching scores between ab and ag Calculate matching scores between ab and other antibodies WHILE (AIS at full size)&(AIS not stable)DO Iterate AIs OD
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 185 2.3. SIMILARITY MEASURE The Pearson measure is used to compare two users u and v: r = n i=1(ui − ¯u)(vi − ¯v) n i=1(ui − ¯u)2 n i=1(vi − ¯v)2 , (1) where u and v are users, n is the number of overlapping votes (i.e. Movies for which both u and v have voted), ui is the vote of user u for movie i and • is the average vote of user u over all films (not just the overlapping votes). The measure is amended as follows: if n = 0, r = NoOverlapDefault if n i=1 (ui − ¯u)2n i=1 (vi − ¯v)2 = 0, r = ZeroVarianceDefault (2) if n<P, r = n P r (where P = overlap penalty) The two default values are required because it is impossible to calculate a Pearson measure in such cases. Both were set to 0. Some experimentation showed that an overlap penalty P was beneficial (this lowers the absolute correlation for users with only a small overlap) but that the exact value was not critical. We chose a value of 100 because this is the maximum overlap expected. 2.4. NEIGHBOURHOOD SELECTION For a Simple Pearson (SP) predictor, neighbourhood selection means choosing the best k (absolute) correlation scores, where k is the neighbourhood size. Not every potential neighbour will have rated the film to be predicted. Reviewers who did not vote on the film are not added to the neighbourhood. We have chosen the SP as a benchmark for our AIS recommender because it is the de facto standard for recommender algorithms and also the usual starting point for more complex neighbourhood selection schemes. Furthermore, the AIS recommender is both sufficiently different and of similar complexity to warrant a fair comparison. For the AIS predictor, a more involved procedure is required: Initialise AIS Encode user for whom to make predictions as antigen Ag WHILE (AIS not stabilised) & (Reviewers available) DO Add next user as an antibody Ab Calculate matching scores between Ab and Ag Calculate matching scores between Ab and other antibodies WHILE (AIS at full size) & (AIS not stable) DO Iterate AIS OD OD
186 Our Ais behaves as follows: At each step(iteration) an antibodys concentration is increased by an amount dependent on its matching to the antigen and decreased by an amount which depends on its matching to other antibodies. In absence of either, an antibodys concentration will slowly decrease over time. Antibodies with a suf- ficiently low concentration are removed from the system, whereas antibodies with a high concentration may saturate An Als iteration is governed by the following ue to Farmer et al. [11] antibodies I death recognised recognised kI liji- where N is the number of antibodies and n is the number of antigens, xi(or xi)is the concentration of antibody i (or j), yi is the concentration of antigen j, c is a rate constant, kI is a suppressive effect and k2 is the death rate, m ii is the matching function between antibody i and antibody (or antigen)j As can be seen from the above equation, the nature of an idiotypic interaction an be either positive or neg reover, if the matching function is symmetric, then the balance between"I am recognised"and"Antibodies recognised"(para meters c and kI in the equation) wholly determines whether the idiotypic effect is positive or negative, and we can simplify the equation. We can simplify the equa tion still further if we only allow one antigen in the AIs. The simplified equation looks like thi dxi=k> n= k mjxx一k3x, (4) where ki is stimulation, k2 suppression and k3 death rate, m; is the correlation be- tween antibody i and the(sole)antigen, x; (or xj)is the concentration of antibody i(or j), y is the concentration of the( sole)antigen, mij is the correlation between antibodies i and j, n is the number of antibodies. In the new equation, the first term is simplified as we only have one antigen, and the suppression term is normalised to allow a" like for like'comparison between the different rate constants. ki and k, were varied as described in Section 3. k3 was fixed at 0.1, while the concentration range was set at 0-100(initially 10). We fixed n at 100. The matching function is the absolute value of the Pearson cor relation measure. This allows us to have both positively and negatively correlated users in our neighbourhood, which increases the pool of neighbours available to The AIs is considered stable after iterating for ten iterations without changing in size. Stabilisation thus means that a sufficient number of 'good neighbours have been identified and therefore a prediction can be made. 'Poor'neighbours would be expected to drop out of the als after a few iterations
186 STEVE CAYZER AND UWE AICKELIN Our AIS behaves as follows: At each step (iteration) an antibody’s concentration is increased by an amount dependent on its matching to the antigen and decreased by an amount which depends on its matching to other antibodies. In absence of either, an antibody’s concentration will slowly decrease over time. Antibodies with a suf- ficiently low concentration are removed from the system, whereas antibodies with a high concentration may saturate. An AIS iteration is governed by the following equation, due to Farmer et al. [11]: dxi dt = c antibodies recognised − I am recognised + antigens recognised − death rate = c N j=1 mjixixj − k1 N j=1 mijxixj +n j=1 mjixiyj − k2xi, (3) where N is the number of antibodies and n is the number of antigens, xi (or xj ) is the concentration of antibody i (or j ), yi is the concentration of antigen j , c is a rate constant, k1 is a suppressive effect and k2 is the death rate, mji is the matching function between antibody i and antibody (or antigen) j . As can be seen from the above equation, the nature of an idiotypic interaction can be either positive or negative. Moreover, if the matching function is symmetric, then the balance between “I am recognised” and “Antibodies recognised” (parameters c and k1 in the equation) wholly determines whether the idiotypic effect is positive or negative, and we can simplify the equation. We can simplify the equation still further if we only allow one antigen in the AIS. The simplified equation looks like this: dxi dt = k1mixiy − k2 n n j=1 mijxixj − k3xi, (4) where k1 is stimulation, k2 suppression and k3 death rate, mi is the correlation between antibody i and the (sole) antigen, xi (or xj ) is the concentration of antibody i (or j ), y is the concentration of the (sole) antigen, mij is the correlation between antibodies i and j , n is the number of antibodies. In the new equation, the first term is simplified as we only have one antigen, and the suppression term is normalised to allow a ‘like for like’ comparison between the different rate constants. k1 and k2 were varied as described in Section 3. k3 was fixed at 0.1, while the concentration range was set at 0–100 (initially 10). We fixed n at 100. The matching function is the absolute value of the Pearson correlation measure. This allows us to have both positively and negatively correlated users in our neighbourhood, which increases the pool of neighbours available to us. The AIS is considered stable after iterating for ten iterations without changing in size. Stabilisation thus means that a sufficient number of ‘good’ neighbours have been identified and therefore a prediction can be made. ‘Poor’ neighbours would be expected to drop out of the AIS after a few iterations.
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 187 Once the ais has stabilised using the above algorithm, we use the antibody concentration to weigh the neighbours. However, early experiments showed that the most recently added antibodies were at a disadvantage compared to earlier antibodies. This is because they have had no time to mature (i.e. increase in con- centration). Likewise, the earliest antibodies had saturated. To overcome this, we reset the concentrations and allow a limited run of the ais to differentiate the Reset als(set all antibodies to initial concentrations) WHILE (No antibody at maximum concentration) DO Iterate AIs OD 2.5. PREDICtION We predict a rating pi by using a weighted average over N, the neighbourhood of u. which was taken as the entire aIs u ∑neNl Wwy= ruoxu (NB relative not absolute) where wu is the weight between users u and u, ruy is the correlation score between u and u, and xu is the concentration of the antibody corresponding to user u 2,6. EVALUATION Prediction Accuracy: We take the mean absolute error, where n, is the number of predictions actual -predicted MAE= Mean number of recommendations: This is the total number of unique films rated by the neighbours Mean overlap size: This is the number of recommendations that the user has also seen Mean accuracy of recommendations: Each overlapped film has an actual vote (from the antigen) and a predicted vote( from the neighbours). The overlapped films were ranked on both actual and predicted vote, breaking ties by movie ID. The two ranked lists were compared using Kendalls Tau t. This measure eflects the level of concordance in the lists by counting the number of dis- cordant pairs. To do this we order the films by vote and apply the following formulae
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 187 Once the AIS has stabilised using the above algorithm, we use the antibody concentration to weigh the neighbours. However, early experiments showed that the most recently added antibodies were at a disadvantage compared to earlier antibodies. This is because they have had no time to mature (i.e. increase in concentration). Likewise, the earliest antibodies had saturated. To overcome this, we reset the concentrations and allow a limited run of the AIS to differentiate the concentrations: Reset AIS (set all antibodies to initial concentrations) WHILE (No antibody at maximum concentration) DO Iterate AIS OD 2.5. PREDICTION We predict a rating pi by using a weighted average over N, the neighbourhood of u, which was taken as the entire AIS. pi = ¯u + v∈N wuv(vi − ¯v) v∈N wuv , (5) wuv = ruvxv (NB relative not absolute), where wuv is the weight between users u and v, ruv is the correlation score between u and v, and xv is the concentration of the antibody corresponding to user v. 2.6. EVALUATION • Prediction Accuracy: We take the mean absolute error, where np is the number of predictions: MAE = |actual − predicted| np . (6) • Mean number of recommendations: This is the total number of unique films rated by the neighbours. • Mean overlap size: This is the number of recommendations that the user has also seen. • Mean accuracy of recommendations: Each overlapped film has an actual vote (from the antigen) and a predicted vote (from the neighbours). The overlapped films were ranked on both actual and predicted vote, breaking ties by movie ID. The two ranked lists were compared using Kendall’s Tau τ . This measure reflects the level of concordance in the lists by counting the number of discordant pairs. To do this we order the films by vote and apply the following formulae:
188 STEVE CAYZER AND UWE AICKELIN 4N D(ri, rj i=1j=i+1 ri> D(i,ri) 0 otherwise where n is the overlap size and ri is the rank of film i as recommended by the neighbourhood. Note that i here refers to the antigen rank of the film, not the film ID. N is the number of discordant pairs, or, equivalently, the expected cost of a bubble sort to reconcile the two lists D is set to one if the rankings are discordant Mean number of reviewers this is the number of reviewers looked at before the ais stabilised Mean number of neighbours: This is the final number of neighbours in the stabilised aIs Experiments were carried out on a Pentium 700 with 256 MB RAM, running Win- dows 2000. The ais was coded in JavaTM jDK13 Each run involved looking at up to 15,000 reviewers(a random sample of up to 20% of the EachMovie data set) to provide predictions and recommendations for 100 users. Averaged statistics are then taken for each run Runtimes ranged from 5 to 60 minutes, largely dependent on the number of reviewer 3. 1. EXPERIMENTS ON SIMPLE AIS Initial experiments concentrated on a simple Als, with no idiotypic effects. The goal was to find a good stimulation rate, but also to ensure that the" baseline system operates similarly to a SP predictor. Therefore, we set the suppression rate to zero, and varied only the stimulation rate, i.e. The weighting given to antigen binding Other parameters had been fixed by preliminary experiments to values that worked well with both aIs and SP The graphs show averaged results over five runs at each stimulation rate. The bars show standard deviations. In order to have a fair comparison, the sP para- meters(neighbourhood and number of reviewers looked at)match the als values for each rate. In Figure 2, we show the prediction error, number of recommenda- tions, number of overlaps and recommendation accuracy for each algorithm. Note that low prediction error values are better, whereas for the other measures we are looking for high values
188 STEVE CAYZER AND UWE AICKELIN τ = 1 − 4ND n(n − 1) , ND = n i=1 n j=i+1 D(ri, rj ), (7) D(ri, rj ) = 1 if ri > rj , 0 otherwise, where n is the overlap size and ri is the rank of film i as recommended by the neighbourhood. Note that i here refers to the antigen rank of the film, not the film ID. ND is the number of discordant pairs, or, equivalently, the expected cost of a bubble sort to reconcile the two lists. D is set to one if the rankings are discordant. • Mean number of reviewers: This is the number of reviewers looked at before the AIS stabilised. • Mean number of neighbours: This is the final number of neighbours in the stabilised AIS. 3. Experiments Experiments were carried out on a Pentium 700 with 256 MB RAM, running Windows 2000. The AIS was coded in Java™ JDK1.3. Each run involved looking at up to 15,000 reviewers (a random sample of up to 20% of the EachMovie data set) to provide predictions and recommendations for 100 users. Averaged statistics are then taken for each run. Runtimes ranged from 5 to 60 minutes, largely dependent on the number of reviewers. 3.1. EXPERIMENTS ON SIMPLE AIS Initial experiments concentrated on a simple AIS, with no idiotypic effects. The goal was to find a good stimulation rate, but also to ensure that the ‘baseline’ system operates similarly to a SP predictor. Therefore, we set the suppression rate to zero, and varied only the stimulation rate, i.e. The weighting given to antigen binding. Other parameters had been fixed by preliminary experiments to values that worked well with both AIS and SP. The graphs show averaged results over five runs at each stimulation rate. The bars show standard deviations. In order to have a fair comparison, the SP parameters (neighbourhood and number of reviewers looked at) match the AIS values for each rate. In Figure 2, we show the prediction error, number of recommendations, number of overlaps and recommendation accuracy for each algorithm. Note that low prediction error values are better, whereas for the other measures we are looking for high values.
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 189 Figure 1. Effect of stimulation rate on neighbourhood and reviewers E相eat候 ulation on Figure 2. Effect of stimulation rate on prediction and recommendatio It can be seen that the simple als gives broadly similar prediction performance to the SP. The Mean Absolute Error(MAE) measurements from different runs are not normally distributed, so a non-parametric statistic is appropriate. We performed a wilcoxon analysis, which showed no significant difference between prediction errors of SP and Als (at a 95% confidence level). In addition the choice of an appropriate stimulation rate did make a significant difference(comparing a rate of 0.2 with 0.02 at the 95% level) For recommendation, the Als performs better than the SP at stimulation rates above O 1. Again, we performed a positive 95% wilcoxon analysis to assess signif- icance. We excluded cases where a recommendation score was unavailable(due to an insufficient number of overlaps). The number of recommendations and overlaps show similar trends though the AIs gives a more constant value. Again, some stimulation was beneficial In later experiments, the stimulation rate was fixed at one of the better values (0. 2, 0.3 or 0.5), in order to give us a good base to work on. These values give us
A RECOMMENDER SYSTEM BASED ON IDIOTYPIC ARTIFICIAL IMMUNE NETWORKS 189 Figure 1. Effect of stimulation rate on neighbourhood and reviewers. Figure 2. Effect of stimulation rate on prediction and recommendation. It can be seen that the simple AIS gives broadly similar prediction performance to the SP. The Mean Absolute Error (MAE) measurements from different runs are not normally distributed, so a non-parametric statistic is appropriate. We performed a Wilcoxon analysis, which showed no significant difference between prediction errors of SP and AIS (at a 95% confidence level). In addition, the choice of an appropriate stimulation rate did make a significant difference (comparing a rate of 0.2 with 0.02 at the 95% level). For recommendation, the AIS performs better than the SP at stimulation rates above 0.1. Again, we performed a positive 95% Wilcoxon analysis to assess significance. We excluded cases where a recommendation score was unavailable (due to an insufficient number of overlaps). The number of recommendations and overlaps show similar trends though the AIS gives a more constant value. Again, some stimulation was beneficial. In later experiments, the stimulation rate was fixed at one of the better values (0.2, 0.3 or 0.5), in order to give us a good base to work on. These values give us
TEVE CAYZER AND UWE AICKELIN generally good performance, while keeping a good neighbourhood size and still evaluating a reasonable number of reviewers 3.2. EXPERIMENTS ON THE IDIOTYPIC AIS Having fixed all the simple parameters, we tested the effect of suppression for stimulation rates of 0.2, 0.3 and 0.5. Not surprisingly we found that suppression changed the number of reviewers looked at and the number of neighbours(ig ure 3) ye then tested the effect of suppression on the als performance. Here we fixed the baseline rate at stimulation only (no suppression), and took measurements rela tive to this baseline(Figure 4). Again, it should be noted that the first graph shows prediction error(hence, a good result is low) Again, the graphs show averaged results over five runs at each suppression e. The bars show standard deviations(similar size bars for rates 0.2 and 0.5 Figure 3. Effect of suppression rate on neighbourhood size and reviewers. pe error Figure 4. Effect of suppression rate on prediction and recommendation
190 STEVE CAYZER AND UWE AICKELIN generally good performance, while keeping a good neighbourhood size and still evaluating a reasonable number of reviewers. 3.2. EXPERIMENTS ON THE IDIOTYPIC AIS Having fixed all the simple parameters, we tested the effect of suppression for stimulation rates of 0.2, 0.3 and 0.5. Not surprisingly we found that suppression changed the number of reviewers looked at and the number of neighbours (Figure 3). We then tested the effect of suppression on the AIS performance. Here we fixed the baseline rate at stimulation only (no suppression), and took measurements relative to this baseline (Figure 4). Again, it should be noted that the first graph shows prediction error (hence, a good result is low). Again, the graphs show averaged results over five runs at each suppression rate. The bars show standard deviations (similar size bars for rates 0.2 and 0.5 Figure 3. Effect of suppression rate on neighbourhood size and reviewers. Figure 4. Effect of suppression rate on prediction and recommendation