The task of guiding in adaptive recommender systems

Félix Hernández del Olmo *, Elena Gaudioso, Eduardo H. Martin

UNED, Artificial Intelligence Department, C/Juan del Rosal, 16, 28040 Madrid, Spain

Abstract

In this paper, we study the recommendation problem as formed by two tasks: (i) to filter useful/interesting items, and (ii) to guide the user to good recommendations. The first task has been widely studied in the field of recommender systems. In fact, the most common characterization of these systems is based on the algorithms that select (filter) the items to be recommended (e.g. collaborative filtering, content-based, etc.). Throughout this paper, we focus on the second task: the task of guiding the user. We claim that this task needs closer attention, and we report an experiment that provides evidence for this claim. The experiment shows that machine learning algorithms commonly applied to the first task become useless when applied to the task of guiding.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Recommender systems; Adaptive web based systems; Interactive web based systems; User profiles; User models

1. Introduction

Recommender systems have the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options (Burke, 2002). This implies a clear main objective: to guide the user to useful/interesting objects. However, in its common formulation, the recommendation problem concentrates on estimating ratings for the items that have not been seen by the user (Adomavicius & Tuzhilin, 2005; Herlocker, Konstan, Terveen, & Riedl, 2004). In other words, this area has focused on the task of filtering useful/interesting items, while frequently losing sight of the other part of the whole problem: to guide the user. In fact, the usual classification of recommender systems places them into categories related to the way the items are filtered (e.g.
collaborative filtering, content-based, hybrids, etc.) (Adomavicius & Tuzhilin, 2005; Balabanovic & Shoham, 1997; Burke, 2002). In this paper, we center on the second part of the recommendation problem. In Section 2, we introduce some of the implications of this idea, together with some previous work. In Section 3, we illustrate the idea by means of a recommender system biased towards the task of guiding the user. The adaptation techniques for this experiment are reported in Section 4. The results are discussed in Section 5. We finish with some conclusions and future work in Sections 6 and 7, respectively.

0957-4174/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2007.12.070
* Corresponding author. E-mail address: felixh@dia.uned.es (F. Hernández del Olmo).
Expert Systems with Applications 36 (2009) 1972-1977. Available online at www.sciencedirect.com (www.elsevier.com/locate/eswa).

2. The task of guiding the user

Whenever a recommender system displays a new recommendation, it faces two problems. On the one hand, it must choose a useful item to be recommended. The difficulty of this problem may be gauged from the numerous literature about it (for a survey, see Adomavicius & Tuzhilin, 2005). We will refer to this labor as the task of filtering useful items.

On the other hand, the recommender system must display recommendations that will be followed. It must be noticed that the usefulness of a recommended item does not imply its follow-ability. For instance, a recommendation of a very useful book such as "Romeo and Juliet" (Shakespeare) could be less follow-able than one which suggests "The Da Vinci Code" (Dan Brown). However, it must be highlighted that a suggestion that is not followed is, at least, as
useless as a recommendation which leads to a useless item. Thus, another task must be considered. We define the task of guiding the user as the job of displaying follow-able recommendations to the user. Bearing this task in mind, the recommender system must adapt the display of each recommendation according to its user model or user profile. Traditionally, machine learning is a common technique for user modeling (Webb, Pazzani, & Billsus, 2001). In particular, in the field of recommender systems, Bayesian classifiers, decision trees and instance-based classifiers have commonly been employed to improve user models in the task of filtering useful items. For instance, Bayesian classifiers and decision trees can be found in significant recommender systems (Breese, Heckerman, & Kadie, 1998; Pazzani, Muramatsu, & Billsus, 1997). Instance-based classifiers (i.e. nearest neighbor or neighborhood methods) are common in collaborative filtering approaches (Herlocker, Konstan, Borchers, & Riedl, 1999; Linden, Smith, & York, 2003; Resnick, Iacovou, Suchak, Bergstrom, & Riedl, 1994).

However, the suitability of these algorithms for the task of guiding users remains to be studied. To this end, we have developed a recommender system centered on the task of guiding, which is described in the next section.

3. Experimental settings

In order to implement and evaluate our approach, we have chosen a web-based learning environment. As usual, it is based on a web server, a set of scripts and a backend database. Both the contents to be presented and the users' (personal and interaction) data are stored in the database, so the state of the whole application is stored in the database. The scripts are the means of showing and changing the state of the whole application.

For the purposes of the experiment, we let students run free in a course (about data mining) built in this environment. To this end, users (students) had to fill in a form prior to being registered into the course.
As a result, each user who navigates through the course always holds a (static) user model filled in by herself. Notice the problems related to prompting or asking for these data (an obtrusive way of getting labels; e.g. see Schwab & Kobsa, 2002). However, they are mitigated by the fact that the users get a course back for free. Moreover, we can trust what the users filled in for two reasons: (1) there were only a few questions, very easy to fill in (just one selection among a few options), and (2) the users were aware of the fact that the personalization of the course depended strongly on the way they filled in the forms.

The interface of the whole course is built by means of four frames (see Fig. 1): header, index, content, and recommendation (the recommender's interface). By interacting with the header frame, users can rate[1] the current content or leave the course. The index frame is used for jumping to each piece of content, also called a content item. Within the index frame, themes (see "Introduccion", "Evolucion Historica", etc.) are used to organize all single content items semantically. Nonetheless, we neither numbered nor ordered them, in order to leave as much freedom as possible in their access. Finally, the content frame renders the current content item within its limits (always scarce, see Fig. 1).

Fig. 1. Screen of the user interface once a student has entered after registering (by filling out the necessary forms).

[1] We are not using these ratings for this paper.

Led by the objectives of the experiment, the recommendation frame shows a hyperlink pointing to the next
suggested content item. The recommendation is inferred (by forward chaining) by means of some static rules along with the user model previously filled in (see above). We must point out that the task of filtering useful items is solved by these static rules. We consider a recommendation useful/interesting (even when not followed) if, and only if, the tutor affirms it about the item present in each recommendation. Since these static rules were previously introduced by the (human) tutor, we will consider throughout the experiment that the displayed content in the recommendation is always useful/interesting. Thanks to this fact, we are able to focus only on the task of guiding the user, which we are about to study.

As presented in Section 2, in the task of guiding, the duty of each recommendation consists of being followed. In this experiment, a recommendation is considered followed whenever the user clicks on the "Ir" ("go" in Spanish) button of the recommendation frame (see Fig. 1). However, note that the act of recommending is not free for the user. In fact, it takes space from the content and index frames. In addition, each recommendation expects (or at least tries) to be attended to by the user, who must leave another task unattended (i.e. reading the content item). Consequently, we must take into account and measure the intrusion cost of recommending (Hernandez del Olmo, Gaudioso, & Boticario, 2005).

We permitted the recommender to make decisions about when to recommend. To this end, we left open the possibility of maximizing or minimizing the recommendation frame, so that the recommender could display or hide the recommendation, respectively. Therefore, when the recommender system finds a given recommendation follow-able enough for a given user, it maximizes the frame (see Fig. 1). Otherwise, the recommender system tries not to bother the user, so it minimizes (hides) the recommendation frame.
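The maximize/minimize decision just described can be sketched as a simple threshold on an estimated follow-ability (a minimal sketch; the probability estimate, its name `p_follow`, and the 0.5 threshold are illustrative assumptions, not values given in the paper):

```python
def frame_state(p_follow, threshold=0.5):
    """Maximize the recommendation frame only when the current
    recommendation looks follow-able enough for this user (assumed
    threshold); otherwise keep it minimized to avoid bothering the user."""
    return "maximized" if p_follow >= threshold else "minimized"

print(frame_state(0.8))  # prints maximized
print(frame_state(0.1))  # prints minimized
```

In this sketch, `p_follow` would come from whatever model of the user the recommender maintains; the sections below describe how such models are trained.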
To measure the recommender's job in the task of guiding numerically, we use a metric that we have applied in previous works related to the present one. We call it NARG, Eq. (1), and its goal is to measure the intrusion cost of recommending in recommender systems (Hernandez del Olmo et al., 2005). In this measure, $a$ is the total number of followed recommendations, $b$ is the total number of not-followed recommendations, $c$ is the number of times the user got a minimized frame, and $N = a + b + c$ is the total number of visits (of all users) to every single content item. In addition, we consider that the act of maximizing the recommendation window has a cost ($r^- = -1$) if the recommendation is not followed. In any case, the recommender system has to take the risk of maximizing the window, because each followed recommendation is much better rewarded ($r^+ = +10$) than the fact of keeping the window minimized ($r^0 = 0$):

$$\mathrm{NARG} = \frac{r^{+}\,a + r^{-}\,b + r^{0}\,c}{N\,r^{+}} \qquad (1)$$

In the next section, we develop the adaptation techniques required for this recommender system.

4. Machine learning for the task of guiding

To solve an applied problem of user modeling for adaptive interfaces by means of established inductive learning methods, developers must (Langley, 1999): (i) reformulate the problem in a form to which machine learning induction can be applied, (ii) engineer a set of features that describe the training cases adequately, and (iii) devise some approach to collecting and preparing the training instances. We implemented these steps as follows.

Firstly, the established objective for the adaptive recommenders of this experiment consists of learning to whom and when the recommender should display a recommendation (see Section 3). To this end, each training case is built up by means of each user's interaction with the recommendation frame. However, notice that this interaction occurs only when the recommender has previously maximized the frame (displayed the recommendation).
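The NARG measure of Eq. (1) can be computed in a few lines (a sketch; the function name is ours, while the reward values r+ = +10, r- = -1, r0 = 0 are those stated in the text):

```python
def narg(a, b, c, r_plus=10, r_minus=-1, r_zero=0):
    """NARG, Eq. (1): reward-weighted count of followed (a), not-followed (b)
    and hidden (c) cases, normalized by the best achievable score N * r_plus."""
    n = a + b + c  # total number of visits to content items
    return (r_plus * a + r_minus * b + r_zero * c) / (n * r_plus)

# Extremes: every recommendation followed -> 1.0; frame always hidden -> 0.0.
print(narg(a=100, b=0, c=0))  # prints 1.0
print(narg(a=0, b=0, c=100))  # prints 0.0
```

Note that NARG is bounded above by 1 (every visit followed) and can go negative when not-followed recommendations dominate, which is exactly the intrusion cost the metric is meant to capture.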
As a result, we have as many opportunities to get users' interactions with the recommendation frame as users' visits to content items (N, see Section 3). Therefore, the recommender system will obtain as many training cases as displayed recommendations (a + b). Obviously, this number will always be less than or equal to the real number of opportunities. To put it another way, N ≥ a + b.

Secondly, the set of user model features has to do with the personal data each user has previously filled out (see Section 3) and with the usage data which describe the content item currently visited. Concretely, in our experiments, each training example is composed of the features summarized in Table 1.

Finally, in order to finish the preparation of each example, it must be labeled according to each user's response. Therefore, for each recommendation displayed, if the user clicks on the recommendation (the "Ir"/"go" button), the training example is labeled as "+". Otherwise, it is labeled as "-".

The experiment focuses on the examination of several adaptive recommender systems under the same external conditions. Therefore, we were compelled to extract a

Table 1
User model features for training the machine learning algorithms

Personal data:
  bk_1dm: Is it your first course about data mining? (boolean)
  bk_1ia: Is it your first course about artificial intelligence? (boolean)
  bk_db_bk: Your theoretical database background (1-5)
  bk_db_exp: Your practical database background (1-5)
  bk_ml_bk: Your theoretical machine learning background (1-5)
  ls: User's learning style (Inductive or Deductive)
Usage data:
  theme: Theme associated to the currently visited content item
  session_visits: Visits to content items performed during the current user's session

Each one of the training examples has all these attributes filled with a determined value. This is due to the fact that only students who had filled out the forms could enter the course.
Usage data tell the recommender where users are currently placed in the course, so this value is always determined.
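Building one training case as described above can be sketched as follows (the feature names come from Table 1; the dictionary representation and the `make_example` helper are illustrative assumptions, not the authors' implementation):

```python
def make_example(profile, theme, session_visits, followed):
    """One training case: static personal data from the registration form
    (Table 1) plus usage data for the currently visited content item,
    labeled '+' if the user clicked the 'Ir'/'go' button, '-' otherwise."""
    case = dict(profile)       # bk_1dm, bk_1ia, bk_db_bk, bk_db_exp, bk_ml_bk, ls
    case["theme"] = theme      # theme of the current content item
    case["session_visits"] = session_visits  # visits in the current session
    case["label"] = "+" if followed else "-"
    return case

# Hypothetical student profile, as filled in at registration time:
profile = {"bk_1dm": True, "bk_1ia": False, "bk_db_bk": 3,
           "bk_db_exp": 2, "bk_ml_bk": 1, "ls": "Inductive"}
case = make_example(profile, theme="Introduccion", session_visits=4, followed=True)
print(case["label"])  # prints +
```

Because the form is mandatory at registration and the usage data are logged on every visit, every attribute of every case is determined, as noted above.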
unique and common dataset for all of them. To this end, we divided the experiment into two stages. In the first stage, we presented the users with a recommender that always had the recommendation frame maximized. In other words, during the first stage each user's visit produced a new training example, hence N = a + b. A brief description of the collected dataset is summarized in Table 2.

After this, in the second stage of the experiment (once we had the users' interaction data in our hands), we ran five recommender systems over these data (see below). In order to observe their behavior, now without any kind of intervention, we gave the recommenders total freedom to decide whether to maximize or minimize the frame for each user visit previously stored (during the first stage). If the recommender system decides to recommend (maximize), it gets the training example, retrains its user model, and uses the new model for its next decision. If the recommender decides not to recommend, the training case is discarded. Also, whenever the recommender decides to display the recommendation, it updates the value of the NARG measure: positively if the label is "+", negatively if the label is "-". Otherwise (if it does not recommend), NARG is not updated (r0 = 0).

The behavior of each recommender system is parameterized by the algorithm employed for its decisions. Three out of the five recommenders are adaptive, and their decisions are made by means of machine learning algorithms. The chosen machine learning algorithms are common in the field, and they belong to the families cited in Section 2: Naive Bayes (Bayesian classifier), C4.5 (decision tree), and Nearest Neighbor (instance-based). In addition, as a baseline, we compare the adaptive behavior of these recommenders with two non-adaptive ones. To this end, we introduced the following two extreme recommenders: "never recommend" and "always recommend".

The results may be visualized in Fig. 2. They will be discussed in the next section.

5.
Discussion This section is devoted to reporting some of the most significant facts involved in the graph of Fig. 2. Firstly, notice the negative behavior of every recommender (except "never recommend") along the first visits. The fact is that all recommenders (except "never recommend") were set to behave as "always recommend" for the first 50 visits. This setting permitted the adaptive recommenders to start making decisions with a reasonable first model (of 50 examples). However, as may be inferred from Table 2, the likelihood of finding a user who follows a recommendation within only 50 visits is very low. Consequently, the negative values of NARG along the first visits are to be expected. Related to the latter, after the first hundred visits, notice that the adaptive recommender curves tend to split off from the "always recommend" curve. However, instead of adapting their behavior to improve their decisions, they are biased towards a shy conduct. In fact, they tend to prefer not recommending over taking the risk of making a bad decision. Singularly, this behavior is so degenerate that the adaptive recommenders even degrade their behavior below that of the annoying "always recommend". The behavior of "never recommend" also deserves some explanation. First, note that it never reaches even a single negative value. However, notice how this extremely prudent behavior is, in the long run, the worst among its more annoying partners. Finally, notice the asymptotic behavior towards 0 of all the curves. This fact contributes to our main conclusion: the most common machine learning techniques used to build adaptive recommender systems have to be extended in order to manage the task of guiding. We develop this idea in more detail in the next section. [Figure: NARG versus visit number (0–2500) for the five recommenders: never recommend, Naive Bayes, Nearest Neighbor, C4.5, always recommend.] Fig. 2.
Evolution of the NARG measure for each recommender system through the same collected dataset.

Table 2
Some characteristics of the collected dataset
Registered users: 86
Users who followed at least one recommendation: 34
Users who followed at least one recommendation per 10 visits to content items: 22
Total number of visits (N): 2493

F. Hernández del Olmo et al. / Expert Systems with Applications 36 (2009) 1972–1977
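The second-stage replay protocol described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the `AlwaysRecommend` baseline, the reward values (+1 for a followed recommendation, −1 for an ignored one, 0 when nothing is displayed), and taking NARG as the running mean reward are simplifying assumptions.

```python
def replay(visits, recommender):
    """Replay the stored first-stage visits against a recommender.

    visits: list of (features, followed) pairs collected in stage one,
    where followed is True if the user followed the recommendation.
    Returns the NARG curve (here: running mean reward per visit).
    """
    rewards = []
    narg_curve = []
    for features, followed in visits:
        if recommender.decides_to_recommend(features):
            rewards.append(1 if followed else -1)  # NARG updated +/-
            recommender.train(features, followed)  # training case is kept
        else:
            rewards.append(0)                      # case discarded, r = 0
        narg_curve.append(sum(rewards) / len(rewards))
    return narg_curve


class AlwaysRecommend:
    """Non-adaptive baseline: always maximizes the frame, never learns."""

    def decides_to_recommend(self, features):
        return True

    def train(self, features, followed):
        pass  # the baseline's model never changes
```

The adaptive recommenders differ only in `decides_to_recommend` and `train`, which would wrap a Naive Bayes, C4.5, or Nearest Neighbor model.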
6. Conclusions In this paper, we have considered the recommendation problem as formed by two tasks. Also, we have focused on the one that, in our opinion, needs closer attention in the field: the task of guiding. Also, notice the general terms in which the experiment has been developed (as a summary, see Table 2). They allow us to generalize the results and think purely in terms of the task (of guiding) involved. In addition, in an experiment centered on this task, we have observed a really bad behavior of the machine learning techniques commonly employed in this area. As a final conclusion, we claim that this behavior has to do with the following distinguishing characteristics of the task of guiding: The task of guiding pursues two objectives at the same time: (i) it tries to recommend as much as possible, but (ii) it tries to recommend (only) follow-able recommendations. In contrast, the task of filtering has commonly pursued accurate recommendations in terms of usefulness/interest. In other words, the task of guiding pursues balancing the cost of displaying a recommendation against the possibility of annoying the user with it. Therefore, the task of guiding is a cost-oriented duty. The task of filtering, in contrast, fits better as an accuracy-maximization duty. The latter is one of the keys to explaining why the behavior of conventional algorithms (for the task of filtering) is so degraded when applied to the task of guiding. In fact, the task of filtering employs classifiers that maximize their accuracy, whereas the task of guiding needs classifiers that maximize benefits (minimize costs). This partially explains the shy behavior of the adaptive recommenders in the experiment: once they are left alone (after the first 50 visits), they tend to prefer the accurate model "never recommend"² over a more profitable one. Evidently, there is a need for new training cases to improve the recommendation task as the work moves along.
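The cost-oriented view of guiding can be made concrete with a minimal decision rule. This is a sketch under assumed unit benefit and annoyance costs, not the paper's formulation: recommend only when the expected reward of displaying the recommendation is positive.

```python
def should_recommend(p_follow, benefit=1.0, annoyance_cost=1.0):
    """Cost-sensitive decision: display the recommendation only if the
    expected reward is positive. p_follow is the model's estimated
    probability that the user will follow the recommendation."""
    expected_reward = p_follow * benefit - (1.0 - p_follow) * annoyance_cost
    return expected_reward > 0.0
```

An accuracy-maximizing classifier would instead threshold at p_follow > 0.5; with asymmetric costs (say, a small annoyance cost of 0.2) the cost-sensitive rule recommends far more often, which is exactly the difference between the two duties described above.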
In the task of filtering, the training cases come from the users' activity on items (by rating them). Thus, an increase in cases is expected as time goes along. However, in the task of guiding, a new training case comes only when there is an act of recommendation. In fact, the number of training cases depends entirely on the recommender's activity. For instance, an extreme adaptive recommender whose initial behavior was similar to "never recommend" would never improve its conduct, because it would never get a single training case. Again, the latter partially explains the experiment. As the recommenders tend to recommend less frequently (tend towards "never recommend"), they get fewer training cases. Therefore, their models remain bad and constant over the experiment. However, as time runs, constantly bad behavior is judged ever more poorly. In the next section, we introduce some ideas which look promising for facing the above problems. 7. Future work To overcome the difficulties mentioned in the last section, we present some proposals, which are part of our future work. On the one hand, one of the research paths in the task of guiding consists of developing recommender systems based on cost-oriented machine learning algorithms. This kind of algorithm is not new. In fact, there are early works warning about the overuse of the accuracy metric and proposing a more cost-oriented approach, e.g. (Provost, Fawcett, & Kohavi, 1998). On the other hand, we are facing the problem of shyness in the task of guiding. As mentioned, this problem is mostly due to the overly calm behavior of the recommenders present in the experiment. Actually, they "expect" to get training cases "without any risk". As a result, there is a need to add aggressiveness to their behaviors so that they might explore more hazardous possibilities.
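One standard way to inject such aggressiveness, borrowed from reinforcement learning, is an epsilon-greedy policy: with small probability the recommender ignores its (possibly shy) model and recommends anyway, guaranteeing a steady trickle of training cases. This wrapper around an arbitrary base decision is our own illustration, not a mechanism from the paper.

```python
import random


def epsilon_greedy_decision(base_decision, epsilon=0.1, rng=random):
    """With probability epsilon, explore: recommend regardless of the model.
    Otherwise exploit the base policy's (possibly shy) decision."""
    if rng.random() < epsilon:
        return True          # exploration: risk a recommendation
    return base_decision     # exploitation: trust current knowledge
```

Even a recommender whose base policy has collapsed to "never recommend" would, under this wrapper, keep receiving roughly epsilon training cases per visit and so retain a chance to improve its model.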
In other words, they must try to balance the (aggressive) exploration of new possibilities against the (calm) exploitation of old knowledge. A good source of such techniques can be found in the reinforcement learning field (Sutton & Barto, 1998). We find this field very promising for the area of recommender systems, at least for the task of guiding.

References

Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), 734–749.
Balabanovic, M., & Shoham, Y. (1997). Fab: Content-based, collaborative recommendation. Communications of the ACM, 40(3), 66–72.
Breese, J., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Uncertainty in artificial intelligence: Proceedings of the 14th conference (pp. 43–52). Morgan Kaufmann.
Burke, R. (2002). Hybrid recommender systems. User Modeling and User-Adapted Interaction, 12(4), 331–370.
Herlocker, J. L., Konstan, J. A., Borchers, A., & Riedl, J. (1999). An algorithmic framework for performing collaborative filtering. In SIGIR'99: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval (pp. 230–237). New York, NY, USA: ACM Press.
Herlocker, J. L., Konstan, J. A., Terveen, L. G., & Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5–53.
Hernández del Olmo, F., Gaudioso, E., & Boticario, J. G. (2005). Evaluating the intrusion cost of recommending in recommender systems. In User modeling 2005: 10th international conference (pp. 342–346).

² In the experiment, not many users decided to follow a recommendation; see Table 2.
F. Hernandez del olmo et al Expert Systems with Applications 36 (2009)1972-1977 Langley, P(1999 ). User modeling in adaptive interfaces. In Proceedings of 15th international conference on machine learning(pp. 445-453). San the seventh international conference on user modeling(pp. 357-370) francisco, CA: Morgan Kaufmann. Banff, Alberta: Spring Resnick, P, lacovou, N, Suchak, M, Bergstrom, P, Riedl, J., 1994 LindensGreg,Smith,Brent&YorkjeRemy(2003).amazon.com Grouplens: An open architecture for collaborative filtering of netnews. In recommendations: Item-to-item collaborative filtering. IEEE Internet Proceedings of the conference on computer supported cooperatiy omputing, 7(1), 76-80 Schwab, L,& Kobsa, A.(2002). Adaptivity through unobstrusive Pazzani, M.J., Muramatsu, J.,& Billsus. D. (1997). Learning and revising arning. K.到(5-9) user profiles: The identification of interesting web sites. Machine Sutton, Richard S,& Barto, Andrew G (1998). Reinforcement learning: earning,27,313-331 An introduction. MIT Press Provost, F, Fawcett, T,& Kohavi, R(1998). The case against accuracy Webb, G, Pazzani, M. J,& Billsus, D (2001). Machine learning for user estimation for comparing induction algorithms. In Proceedings of the deling. User Modeling and User-Adapted Interaction, 11, 19-29
Langley, P. (1999). User modeling in adaptive interfaces. In Proceedings of the seventh international conference on user modeling (pp. 357–370). Banff, Alberta: Springer.
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76–80.
Pazzani, M. J., Muramatsu, J., & Billsus, D. (1997). Learning and revising user profiles: The identification of interesting web sites. Machine Learning, 27, 313–331.
Provost, F., Fawcett, T., & Kohavi, R. (1998). The case against accuracy estimation for comparing induction algorithms. In Proceedings of the 15th international conference on machine learning (pp. 445–453). San Francisco, CA: Morgan Kaufmann.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of the conference on computer supported cooperative work.
Schwab, I., & Kobsa, A. (2002). Adaptivity through unobtrusive learning. KI, 3(5–9).
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. MIT Press.
Webb, G., Pazzani, M. J., & Billsus, D. (2001). Machine learning for user modeling. User Modeling and User-Adapted Interaction, 11, 19–29.