Artificial Intelligence Review (2005) 24: 179–197
DOI 10.1007/s10462-005-4612-x
© Springer 2005

Explanation in Recommender Systems

DAVID MCSHERRY
School of Computing and Information Engineering, University of Ulster, Coleraine BT52 1SA, Northern Ireland, UK (e-mail: dmg.mcsherry@ulster.ac.uk)

Abstract. There is increasing awareness in recommender systems research of the need to make the recommendation process more transparent to users. Following a brief review of existing approaches to explanation in recommender systems, we focus in this paper on a case-based reasoning (CBR) approach to product recommendation that offers important benefits in terms of the ease with which the recommendation process can be explained and the system's recommendations can be justified. For example, recommendations based on incomplete queries can be justified on the grounds that the user's preferences with respect to attributes not mentioned in her query cannot affect the outcome. We also show how the relevance of any question the user is asked can be explained in terms of its ability to discriminate between competing cases, thus giving users a unique insight into the recommendation process.

Keywords: attribute-selection strategy, case-based reasoning, explanation, recommender systems

1. Introduction

The importance of intelligent systems having the ability to explain their reasoning is well recognised in domains such as medical decision making and intelligent tutoring (e.g. Armengol et al., 2001; Sørmo and Aamodt, 2002; Evans-Romaine and Marling, 2003). In an intelligent tutoring system, for example, communicating the reasoning process to students may be as important as finding the right solution. Until recently, explanation in recommender systems appears to have been a relatively neglected issue.
However, recent research has highlighted the importance of making the recommendation process more transparent to users and the potential role of explanation in achieving this objective (Herlocker et al., 2000; Shimazu, 2002; McSherry, 2002b, 2003b; Reilly et al., 2005). Herlocker et al. (2000) suggest that the black box image of recommender systems may be one of the reasons why they have gained much less acceptance in high-risk domains such as holiday packages or investment portfolios than in low-risk domains such as CDs or movies. They
argue that extracting meaningful explanations from the computational models on which recommendations are based is a challenge that must be addressed to enable the development of recommender systems that are more understandable, more effective, and more acceptable. It is an argument that seems equally compelling in collaborative and content-based approaches to product recommendation.

McSherry (2003a) proposes a case-based reasoning (CBR) approach to product recommendation that combines an effective strategy for reducing the length of recommendation dialogues with a mechanism for ensuring that the dialogue is terminated only when it is certain that the recommendation will be the same no matter how the user chooses to extend her query. Referring to the approach as incremental nearest neighbour (iNN), we focus here on the benefits it offers in terms of making the recommendation process more transparent to users. One advantage is that recommendations based on incomplete queries can be justified on the grounds that the user's preferences with respect to attributes not mentioned in her query cannot affect the outcome. We also show how the relevance of any question the user is asked can be explained in terms of its ability to discriminate between competing cases, thus giving users a unique insight into the recommendation process.

In Section 2, we examine existing approaches to explanation in recommender systems and some of the lessons learned from this research. In Section 3, we present a detailed account of the recommendation process in iNN and the important role played by the concept of case dominance in the approach. In Section 4, we present an approach to explanation in which there is no requirement for domain knowledge other than the similarity knowledge and cases already available to the system.
We demonstrate the approach in a mixed-initiative recommender system called Top Case, which can explain the relevance of any question the user is asked in strategic terms, recognise when the dialogue can be safely terminated, and justify its recommendations on the grounds that any un-elicited preferences of the user cannot affect the outcome. Related work is discussed in Section 5 and our conclusions are presented in Section 6.

2. Existing Approaches

Herlocker et al. (2000) evaluated several approaches to explanation in the collaborative movie recommender MovieLens in terms of their
effects on user acceptance of the system's recommendations. The most convincing explanation of why a movie was recommended was one in which users were shown a histogram of the ratings of the same movie by similar users. Moreover, grouping together of good ratings (4 or 5) and bad ratings (1 or 2) and separation of ambivalent ratings (3) was found to increase the effectiveness of the histogram approach. Interestingly, the second most convincing explanation was a simple statement of the system's performance in the past, e.g.:

  MovieLens has predicted correctly for you 80% of the time in the past.

Another important finding was that some of the explanations evaluated had a negative impact on acceptance, the goal of explanation in this instance, showing that no explanation may be better than one that is poorly designed.

CBR recommender systems that can explain their recommendations include Shimazu's (2002) ExpertClerk and McSherry's (2003b) First Case. ExpertClerk can explain why it is proposing two contrasting products in terms of the trade-offs between their positive and negative features, e.g.:

  This blouse is more expensive but the material is silk. That one is cheaper but the material is polyester.

Its explanations are based on assumed preferences with respect to attributes not mentioned in the user's query. For example, a blouse made of silk is assumed to be preferred to one made of polyester.

In a similar way, First Case can explain why one case is more highly recommended than another by highlighting the benefits it offers (McSherry, 2003b). As the following example illustrates, it can also explain why a given product, such as a personal computer, is recommended in terms of the compromises it involves with respect to the user's preferences.

  Case 38 differs from your query only in processor speed and monitor size. It is better than Case 50 in terms of memory and price.
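The style of explanation above, contrasting a case's compromises with respect to the query against its benefits relative to a rival case, can be sketched roughly as follows. This is a hypothetical illustration only, not the published First Case implementation: attribute comparison is simplified to exact matches plus an assumed "more is better" or "less is better" direction for each numeric attribute.

```python
def explain_recommendation(query, case, rival, prefer_higher, prefer_lower):
    """Sketch of a First Case-style explanation (hypothetical illustration).

    query: {attribute: preferred value}, possibly incomplete
    case:  the recommended case, {attribute: value}
    rival: a competing case, {attribute: value}
    prefer_higher / prefer_lower: numeric attributes with an assumed
    preference direction (e.g. memory vs. price)
    """
    # Attributes on which the recommended case compromises the query
    compromises = [a for a in query if case.get(a) != query[a]]

    # Attributes on which the recommended case beats the rival,
    # according to the assumed preference directions
    benefits = [a for a in prefer_higher if case[a] > rival[a]]
    benefits += [a for a in prefer_lower if case[a] < rival[a]]

    parts = []
    if compromises:
        parts.append("This case differs from your query only in "
                     + " and ".join(compromises) + ".")
    if benefits:
        parts.append("It is better than the alternative in terms of "
                     + " and ".join(benefits) + ".")
    return " ".join(parts)
```

Applied to a query for a 2.0 GHz desktop, a recommended case that compromises only on speed but beats a rival on memory and price yields an explanation of the same shape as the Case 38 example above.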
However, the potential role of explanation in recommender systems is not limited to explaining why a particular item is recommended. In this paper, we present a CBR recommender system that can also explain the relevance of any question the user is asked in terms of its ability to discriminate between competing cases. Reilly et al.’s (2005) dynamic approach to critiquing in recommender systems differs from traditional critiquing approaches (e.g. Burke, 2002)
in that critiques are dynamically generated by the system and may involve compromises as well as improvements relative to the currently recommended case. In this way, the user is informed in advance of trade-offs associated with desired improvements. Before selecting a suggested critique, the user can ask to see an explanation of the trade-offs involved.

In recommender systems that treat some or all of the user's requirements as constraints that must be satisfied, explanation can also play an important role in recovery from the retrieval failures that occur when there is no exact match for the user's requirements (Hammond et al., 1996; McSherry, 2004). Hammond et al.'s (1996) Car Navigator is a recommender system for cars that uses declarative knowledge to explain trade-offs that are known to be common causes of retrieval failure in the domain, such as that between fuel economy and horsepower. For example, if the user asks for good fuel economy and high horsepower, she is shown a movie explaining the trade-off between these features. The user is also advised that she will need to revise her query if she hopes to find a car that meets her requirements.

In recent work, we combined a knowledge-light approach to explanation of retrieval failure with a mixed-initiative approach to recovery from retrieval failure in a CBR recommender system called ShowMe (McSherry, 2004). Failure to retrieve a case that exactly matches the user's query triggers an explanation that draws the user's attention to combinations of features in her query for which there are no matching cases, e.g.:

  Sorry, there are no products that match these combinations of features in your query: (price ≤ 500, type = laptop), (type = laptop, screen size = 19)

As well as highlighting areas of the product space in which the case library is lacking in coverage, the explanation may reveal misconceptions on the part of the user such as the price she expects to pay for the product she is seeking.
Showing the user only the minimally failing sub-queries of her query, a technique we have adapted from research on co-operative responses to failing database queries (Gaasterland et al., 1992), helps to minimise cognitive load in the approach. Explanation of the retrieval failure is followed in ShowMe by a mixed-initiative recovery process in which the user is guided in the selection of one or more constraints to be eliminated from her query (McSherry, 2004).
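The computation of minimally failing sub-queries can be sketched as follows. This is an exhaustive illustration for equality constraints only; ShowMe's actual query language also admits constraints such as price ≤ 500, which this sketch does not handle.

```python
from itertools import combinations

def minimally_failing_subqueries(query, cases):
    """Minimally failing sub-queries of a query that has no exact match.

    query: {attribute: required value} equality constraints
    cases: list of {attribute: value} product descriptions
    A sub-query fails if no case satisfies all of its constraints; it is
    minimally failing if it fails but none of its proper sub-queries does.
    """
    def fails(constraints):
        return not any(all(case.get(a) == v for a, v in constraints)
                       for case in cases)

    items = sorted(query.items())
    minimal = []
    # Consider sub-queries in order of increasing size, so any failing
    # sub-query that survives the subset check below is minimal
    for size in range(1, len(items) + 1):
        for sub in combinations(items, size):
            if any(set(m) <= set(sub) for m in minimal):
                continue  # contains a smaller failing sub-query already found
            if fails(sub):
                minimal.append(sub)
    return [dict(sub) for sub in minimal]
```

For a query seeking a laptop priced at 400 with a 19-inch screen, against a library containing only a 400-priced 19-inch desktop and a 900-priced 15-inch laptop, the sketch reports the two minimally failing combinations (price, type) and (screen, type), mirroring the shape of the ShowMe explanation quoted above.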
3. Incremental Nearest Neighbour

In this section, a brief overview of conversational CBR (Aha et al., 2001) in the context of product recommendation is followed by a detailed account of the recommendation process in iNN and the important role played by the concept of case dominance in the approach. One distinguishing feature of our approach is a goal-driven attribute-selection strategy that has been shown to be very effective in reducing the length of recommendation dialogues (McSherry, 2003a). Another is a simple mechanism for ensuring that the dialogue is terminated only when it is certain that a more similar case will not be found if the dialogue is allowed to continue.

3.1. Conversational CBR

In CBR approaches to product recommendation, descriptions of the available products are stored in a case library and retrieved in response to a query representing the preferences of the user. In conversational CBR (CCBR) approaches like iNN, a query is incrementally (and often incompletely) elicited in an interactive dialogue with the user. We focus here on approaches in which the retrieval of recommended cases is based on their similarity to the elicited query, rather than relying on exact matching as in most decision-tree approaches (e.g. Doyle and Cunningham, 2000; McSherry, 2001b).

Given a query Q over a subset A_Q of the case attributes A, the similarity of any case C to Q is typically defined to be:

  sim(C, Q) = Σ_{a ∈ A_Q} w_a sim_a(C, Q),

where for each a ∈ A, w_a is the importance weight assigned to a and sim_a(C, Q) is a local measure of the similarity of π_a(C), the value of a in C, to π_a(Q), the preferred value of a. As usual in practice, we assume that 0 ≤ sim_a(x, y) ≤ 1 for all a ∈ A and that sim_a(x, y) = 1 if and only if x = y. We also assume that for each a ∈ A, the distance measure 1 − sim_a satisfies the triangle inequality.

A generic algorithm for CCBR in product recommendation (CCBR-PR) is shown in Figure 1.
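As an illustration, the weighted-sum similarity measure defined above can be sketched in a few lines. This is a minimal sketch; the attribute weights and local similarity measures in the usage example are purely illustrative.

```python
def sim(case, query, weights, local_sims):
    """Weighted-sum similarity of a case to a (possibly partial) query.

    case: {attribute: value} product description
    query: {attribute: preferred value}; may mention only some attributes
    weights: {attribute: importance weight w_a}
    local_sims: {attribute: function(case value, preferred value) -> [0, 1]}

    Only attributes mentioned in the query contribute to the sum, so
    preferences the user has not expressed cannot affect the score.
    """
    return sum(weights[a] * local_sims[a](case[a], v)
               for a, v in query.items())
```

For example, with w_price = 2, w_type = 1, an exact-match local measure for type and the distance-based local measure sim_price(x, y) = 1 − |x − y|/1000, a laptop priced at 500 scores 2 × 0.8 + 1 × 1 = 2.6 against the query (price = 700, type = laptop).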
At each stage of the recommendation dialogue, the system selects the next most useful attribute, asks the user for the preferred value, and retrieves the case (or product)
that is most similar to the query that has been elicited so far. The dialogue continues until some predefined termination criteria are satisfied, or until no further attributes remain.

  algorithm CCBR-PR(Q)
    repeat
      select the most useful attribute a
      ask the user for the preferred value v of a
      Q ← Q ∪ {a = v}
      retrieve the case C that is most similar to Q
    until termination criteria are satisfied

  Figure 1. Conversational CBR in product recommendation.

The case recommended on each cycle is usually the one that is most similar to the current query. However, it is not unusual for more than one case to be maximally similar to a given query, in which case we assume that all such cases are equally recommended. That is, we define the recommendation for a given query Q to be:

  r(Q) = {C : sim(C, Q) ≥ sim(C°, Q) for all cases C°}.

Cases other than those that are maximally similar to the current query may also be presented as alternatives that the user may wish to consider, though the number of cases that can be presented to the user may be limited by the available screen space. Of course, cognitive load is another important consideration.

The defining components of a CCBR-PR algorithm are the strategy used to select the most useful attribute on each recommendation cycle and the criteria used to decide when the dialogue should be terminated. Possible approaches to attribute selection include giving priority to the most important of the remaining attributes (McSherry, 2003a) and the similarity-based approach proposed by Kohlmaier et al. (2001). Various approaches to termination of the recommendation dialogue are also possible. For example, the dialogue could be terminated when the current query Q is such that |r(Q)| = 1 or when the similarity of any case reaches a predefined threshold. As we shall see in Section 3.4, the criteria for termination of the
recommendation dialogue in iNN are closely linked to the attribute-selection strategy that characterises the approach.

3.2. Identifying dominated cases

In iNN, an important role in the recommendation process is played by the concept of case dominance that we now define.

Definition 1: A given case C2 is dominated by another case C1 with respect to a query Q if sim(C2, Q) < sim(C1, Q) and sim(C2, Q*) < sim(C1, Q*) for all extensions Q* of Q.

One reason for the importance of case dominance in product recommendation is that if a given case C2 is dominated by another case C1 then the product represented by C2 can be eliminated. Of course, the number of ways in which a given query can be extended may be very large. So given an incomplete query Q and cases C1, C2 such that sim(C2, Q) < sim(C1, Q), how can we tell if C2 is dominated by C1 without resorting to exhaustive search?

One situation in which C2 is clearly dominated by C1 is when both cases have the same values for all the remaining attributes. Another is when sim(C1, Q) − sim(C2, Q) is greater than the sum of the importance weights of all the remaining attributes. In situations where dominance is less obvious, account must be taken of the similarity between the two cases as well as their similarities to the current query (McSherry, 2003a). The criterion used to identify dominated cases in iNN is presented in the following theorem.

Theorem 1: A given case C2 is dominated by another case C1 with respect to a query Q if and only if:

  sim(C2, Q) + Σ_{a ∈ A − A_Q} w_a (1 − sim_a(C1, C2)) < sim(C1, Q).

Proof: See Appendix A.

3.3. Attribute selection strategy

The attribute selected by iNN on each cycle of the recommendation process is the one that is most useful for confirming the case selected as the target case. The target case is first selected at random from the cases that are maximally similar to an initial query entered by the
user, and continually revised as the query is extended. No change is needed as long as the target case remains one of the cases that are maximally similar to the current query.

As Figure 2 illustrates, attribute selection in iNN aims to maximise the number of cases dominated by the target case. The cases currently dominated by the target case are shown in the lower half of the diagram. As indicated by the dashed arrows, there may be many dominance relationships with respect to the current query, but iNN considers only cases that are dominated by the target case. For each of the remaining attributes, it uses the dominance criterion from Theorem 1 to determine the number of cases that will be dominated by the target case if the preferred value of the attribute is the same as in the target case. It then selects the attribute that maximises the number of cases potentially dominated by the target case. If two or more attributes are equally promising according to this criterion, iNN uses the importance weights assigned to the case attributes as a secondary selection criterion. That is, it chooses the most important of the equally promising attributes.

  Figure 2. Attribute selection in iNN aims to maximise the number of cases dominated by the target case.

3.4. Terminating the recommendation dialogue

As we have shown in previous work, naïve approaches to termination of recommendation dialogues such as stopping when the similarity of any case reaches a predefined threshold cannot guarantee that a better solution will not be found if the dialogue is allowed to continue (McSherry, 2003a). In fact, the only way to ensure that a more similar case (or another equally similar case) will not be found is to insist
that the recommendation dialogue is terminated only when the current query Q is such that r(Q*) = r(Q) for all possible extensions Q* of Q. That is, the recommendation dialogue can be safely terminated only when it is certain that the recommendation will be the same no matter how the user chooses to extend her query.

It may seem at first sight that testing the above condition for safe termination of the recommendation dialogue requires an exhaustive search over all possible extensions of the current query. However, McSherry (2003a) shows that it can be tested without relying on exhaustive search, and with a computational cost that increases only linearly with the size of the case library. The criteria used in iNN to recognise when the recommendation dialogue can be safely terminated are stated in the following theorem.

Theorem 2: The recommendation dialogue in iNN can be safely terminated if and only if the following conditions hold:

1. any case whose similarity to the current query equals that of the target case has the same values as the target case for all remaining attributes,
2. all cases that are less similar than the target case are dominated by the target case.

Proof: See Appendix A.

Although expressed in terms of the target case in iNN's goal-driven approach to attribute selection, the criteria identified in Theorem 2 are equivalent to the criteria we have shown to be essential to ensure that the recommendation dialogue can be safely terminated in any approach to attribute selection (McSherry, 2003a).

3.5. Recommendation efficiency

In previous work, we evaluated iNN in comparison with CCBR-PR algorithms based on a variety of different attribute-selection strategies (McSherry, 2003a). The performance measure of interest was recommendation efficiency, as measured by the average number of questions the user is asked before the final recommendation is made.
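The Theorem 2 criteria stated above can be turned into a single pass over the case library, consistent with the paper's claim of cost linear in library size. The sketch below is self-contained and rests on illustrative assumptions: dict-based cases, a weighted exact-match similarity measure, and a simplified dominance test standing in for Theorem 1.

```python
# Sketch of the Theorem 2 safe-termination test.
# Assumptions (not from the paper): dict-based cases, weighted exact-match
# similarity, and a simplified dominance test standing in for Theorem 1.

def sim(case, query, weights):
    """Weighted exact-match similarity over the attributes in the query."""
    return sum(weights[a] for a, v in query.items() if case.get(a) == v)

def dominates(c1, c2, query, weights, attrs):
    # c1's current lead over c2 must cover c2's best possible gain on the
    # remaining attributes where the two cases disagree.
    lead = sim(c1, query, weights) - sim(c2, query, weights)
    risk = sum(weights[a] for a in attrs
               if a not in query and c1.get(a) != c2.get(a))
    return lead >= risk

def can_terminate(cases, target, query, attrs, weights):
    """Theorem 2, sketched: terminate iff (1) every case as similar as the
    target agrees with it on all remaining attributes, and (2) every less
    similar case is dominated by the target.  One pass over the library."""
    s_target = sim(target, query, weights)
    remaining = [a for a in attrs if a not in query]
    for c in cases:
        if c is target:
            continue
        s = sim(c, query, weights)
        if s > s_target:                      # target is not maximally similar
            return False
        if s == s_target:                     # condition 1
            if any(c.get(a) != target.get(a) for a in remaining):
                return False
        elif not dominates(target, c, query, weights, attrs):
            return False                      # condition 2
    return True

# Toy case library (hypothetical data for illustration only).
weights = {"price": 3, "type": 2, "region": 1}
attrs = list(weights)
cases = [
    {"price": "low",  "type": "ski",   "region": "alps"},   # target
    {"price": "low",  "type": "beach", "region": "alps"},
    {"price": "high", "type": "ski",   "region": "coast"},
]
```

On this toy data, after only `{"price": "low"}` the dialogue cannot be safely terminated: the second case is equally similar but differs on the remaining attributes, so an extension of the query could still change the outcome. Once all three preferences are known, both conditions hold.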
The algorithms compared differed only in their attribute-selection strategies, with termination of the recommendation dialogue based
on the criteria we have shown to be essential to ensure that the recommendation will remain unchanged in any approach to attribute selection.

Our evaluation was based on the Travel case library (www.ai-cbr.org), a standard benchmark containing more than 1,000 cases, and the PC case library (McGinty and Smyth, 2002). The results showed iNN to be more effective in reducing dialogue length than any of the other attribute-selection strategies. Its performance on the PC case library was close to optimal, reducing the number of questions asked by up to 63% and by 35% on average relative to a full-length query. It also gave the best performance on the Travel case library, reducing dialogue length by up to 63% and by 25% on average.

4. Explanation in Top Case

We now present an approach to explanation of the recommendation process in iNN in which explanations are automatically generated, with no requirement for domain knowledge other than the similarity knowledge and cases already available to the system. We demonstrate the approach in a mixed-initiative recommender system called Top Case, which can explain the relevance of any question the user is asked in strategic terms, recognise when the dialogue can be safely terminated, and justify its recommendations on the grounds that any remaining attributes cannot affect the outcome. An example recommendation dialogue based on a well-known case library in the travel domain is used to illustrate the approach.

4.1. Explanation engineering

An initial query entered by the user is incrementally extended in Top Case by asking the user to specify preferred values for attributes not mentioned in her initial query. On each recommendation cycle, the user is asked for the preferred value of the most useful attribute for confirming the target case and shown the competing cases that are now most similar to her query.
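The recommendation cycle just described can be sketched as a simple loop. This self-contained toy is not Top Case itself: questions are asked in a fixed attribute order rather than by iNN's goal-driven selection, a `preferences` dict stands in for the user's interactive answers, similarity is a weighted exact-match measure, and the settledness test is a simplified stand-in for the safe-termination criteria.

```python
# Toy sketch of an incremental query-elicitation loop in the style of the
# Top Case recommendation cycle.  All representations and criteria here are
# simplifying assumptions, not the paper's actual system.

def sim(case, query, weights):
    """Weighted exact-match similarity over the attributes in the query."""
    return sum(weights[a] for a, v in query.items() if case.get(a) == v)

def outcome_is_settled(cases, target, query, weights, attrs):
    # The target must be unbeatable (and untied by any differing case) no
    # matter how the user answers the remaining questions.
    remaining = [a for a in attrs if a not in query]
    s_t = sim(target, query, weights)
    for c in cases:
        if c is target:
            continue
        gain = sum(weights[a] for a in remaining if c.get(a) != target.get(a))
        best = sim(c, query, weights) + gain
        if best > s_t or (best == s_t and gain > 0):
            return False
    return True

def dialogue(cases, preferences, attrs, weights):
    """Extend the query one question at a time until the recommendation can
    no longer change; return the recommended case and the final query."""
    query = {}
    while True:
        target = max(cases, key=lambda c: sim(c, query, weights))  # ties: first
        if outcome_is_settled(cases, target, query, weights, attrs):
            return target, query
        attr = next(a for a in attrs if a not in query)  # next question
        query[attr] = preferences[attr]                  # the user's answer

# Toy case library (hypothetical data for illustration only).
weights = {"price": 3, "type": 2, "region": 1}
attrs = list(weights)
cases = [
    {"price": "low",  "type": "ski",   "region": "alps"},
    {"price": "low",  "type": "beach", "region": "alps"},
    {"price": "high", "type": "ski",   "region": "coast"},
]
```

With preferences matching the first case, the loop stops after asking about price and type: whatever region the user prefers, it can no longer affect which case is recommended, mirroring the justification Top Case offers for early termination.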
The user can terminate the recommendation dialogue at any stage by selecting one of the cases she is shown as the product she prefers. Otherwise, query elicitation continues until Top Case has determined that its recommendation will be the same no matter how the user chooses to extend her query. At this point, the dialogue is terminated and the user is informed that the target case