LEARNING TO SURF MULTIAGENT SYSTEMS FOR ADAPTIVE WEB PAGE RECOMMENDATION A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEG REE OF DOCTOR OF PHILOSOPHY Marko balabanovic March 1998
LEARNING TO SURF: MULTIAGENT SYSTEMS FOR ADAPTIVE WEB PAGE RECOMMENDATION a dissertation submitted to the department of computer science and the committee on graduate studies of stanford university in partial fulfillment of the requirements for the degree of doctor of philosophy Marko Balabanovic March 1998
O Co pyright 1998 by M arko b al All Rights reserved
c Copyright 1998 by Marko Balabanovic All Rights Reserved ii
I CEFIIEY THAT I HAVE READ THIS DISSERLATION AND THAT IN MY OPINION IT IS FUITY ADEQUATE, IN SCOPE AND IN QUAIITY, AS A DISSEFIATION FOR THE DEGREE OF DOCIOR OF PHIIOSOPHY YOAV SHOHAM (PHINCIPAL ADVISER I CEFIIEY THAT I HAVE HEAD THIS DISSERIATION AND THAT IN MY OPINION IT IS FUITY ADEQUATE, IN SCOPE AND IN QUAITY, AS A DISSEFIATION FOR THE DEGREE OF DOCIOR OF PHIIOSOPHY TEREY WINOGRAD I CEFIIEY THAT I HAVE HEAD THIS DISSERIATION AND THAT IN MY OPINION IT IS FUITY ADEQUATE, IN SCOPE AND IN QUAlITY, As A DISSEFIATION FOR THE DEGHEE OF DOCIOR O ILOSOPHY ROBERT BARRETT APPROWED FOR THE UNIVERSITY COMMITIEE ON GEADUATE STUDIES
I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy. Yoav Shoham (Principal Adviser) I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy. Terry Winograd I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy. Robert Barrett Approved for the University Committee on Graduate Studies: iii
Iv
iv
A bstract Imagine a newspaper per Sonalized for your tastes. Instead of a selection of articles hosen for a general audience by a human editor, a Software agent picksitemsjust for you, covering your particular topicS of interest. Since there are no journ alists at its disposal, the agent searches the Web for appropriate articles. Over time, it uses yo feedback on recommen ded articles to build a mo del of your interests. This thesis investigates the design of "recommen der systems" which create Such per Son alized newspapers Two research issu es motivate this work and distinguish it from approach es usually taken by inform ation retrieval or m achinelearning researchers. First, a recommender rstem will have many userS, with overl apping interests. How can this be exploited nalized new sp sts of a small set of ar ticles Techniques for deciding on the relevance of individu al ar ticles are well known, but how is the com dosition of the set determined? One of the primary contributions of this research is an implemented architecture linking populations of adaptive Software agents. Common interests among its userS ed both to increa recommendations. A novel interface infers document preferences by monitoring user crag and drop actions, and affords control over the com position of sets of recommen dationS. Results are presented from a variety of experiments: user tests meaSuring learning performance, simulation stu Cies isolating particular tradeoffs, and uSability tests investigating inter action designs
Abstract Imagine a newspaper personalized for your tastes. Instead of a selection of articles chosen for a general audience byahuman editor, a software agent picks items just for you, covering your particular topics of interest. Since there are no journalists at its disposal, the agent searches the Web for appropriate articles. Over time, it uses your feedback on recommended articles to build a model of your interests. This thesis investigates the design of \recommender systems" which create such personalized newspapers. Two research issues motivate this work and distinguish it from approaches usually taken by information retrieval or machine learning researchers. First, a recommender system will have many users, with overlapping interests. How can this be exploited? Second, each edition of a personalized newspaper consists of a small set of articles. Techniques for deciding on the relevance of individual articles are well known, but how is the composition of the set determined? One of the primary contributions of this research is an implemented architecture linking populations of adaptive software agents. Common interests among its users are used both to increase eciency and scalability, and to improve the quality of recommendations. A novel interface infers document preferences by monitoring user drag-and-drop actions, and aords control over the composition of sets of recommendations. Results are presented from a variety of experiments: user tests measuring learning performance, simulation studies isolating particular tradeos, and usability tests investigating interaction designs. v
VI
vi
knoW LE D GEM TS This re search was con ducted under the supervision of Yoav Shoham, whose intellec- tual rigor and insight have been a con st ant source of en couragement and inspiration Valuable perspectives and gui dance h ave al so been provided by Terry Winograd and Rob Barrett, who both induced greater clarity of thinking and present ation Daphne Koller and Cliff Nass supplied helpful pointers to related areas of re search as members of my orals committee. A thorough groun ding in AI systems and archi lectures was provided by earlier advisors Carole Klein at Cambridge University and both barbara haves-Roth and Nils Nilsson at St an ford Universit ddition, Andy Hopper at Cambridge University provided the initial en couragement to un dertake doctoral rese arch Over the ne arly six years I have been at the Depart ment of Computer Scien ce at St anford University. t here have been const ant valuable interactions with members of the various research groups focussed on creating AI agents(or 'bots) which reason or learn(Alot s, Bots, Nobotics and Nobot s), as well as the People, Computers and Design group focus sed on interaction design, and, most recently, the Digit al Libraries Project, working on many different technologies in support of information seeking tasks. Outside of St anford, I have gained a wider perspective on interaction de sign and prototyping from Mik Lamming and Mike Flynn at the Xerox Research Center in Cambridge, and a better underst anding of real-world technology design and development from James Rucker and Marcos Polan co at Imana in San francisco
Acknowledgements This research was conducted under the supervision of Yoav Shoham, whose intellectual rigor and insight have been a constant source of encouragement and inspiration. Valuable perspectives and guidance have also been provided by Terry Winograd and Rob Barrett, who both induced greater clarity of thinking and presentation. Daphne Koller and Cli Nass supplied helpful pointers to related areas of research as members of my orals committee. A thorough grounding in AI systems and architectures was provided by earlier advisors Carole Klein at Cambridge University and both Barbara Hayes-Roth and Nils Nilsson at Stanford University; in addition, Andy Hopper at Cambridge University provided the initial encouragement to undertake doctoral research. Over the nearly six years I have been at the Department of Computer Science at Stanford University, there have been constant valuable interactions with members of the various research groups focussed on creating AI agents (or `bots) which reason or learn (AIbots, Bots, Nobotics and Nobots), as well as the People, Computers and Design group focussed on interaction design, and, most recently, the Digital Libraries Pro ject, working on many dierent technologies in support of information seeking tasks. Outside of Stanford, I have gained a wider perspective on interaction design and prototyping from Mik Lamming and Mike Flynn at the Xerox Research Center in Cambridge, and a better understanding of real-world technology design and development from James Rucker and Marcos Polanco at Imana in San Francisco. vii
Funding was provided by nsF grant IRI-9503109, the NSF/DARPA/NASA Dig ital Libr ary project(NSF IRI-9411306, NSF grant IRI-9220645, and ArPa grant F49620-94-1-0900. The implementations to be described were con structed with the help of a lot of free Software. Thanks. therefore are also cue to the authors and maintainerS of Apache, Sub Arctic [Hudson and Smith, 1996, Berkeley DB[Seltzer andYigit, 1991, mSQL [Hughes, 1993], CH ACO [Hendrick Son and Leland, 1994]and above all the Python programming langu age [van Rossum, 1995 A number of user tests were performed as part of this work. I am grateful to all of the people who took the time to partici pate, either on the web or in perSon here at stanford rrentlydocumentedathttp://www.apache.org
Funding was provided by NSF grant IRI-9503109, the NSF/DARPA/NASA Digital Library pro ject (NSF IRI-9411306), NSF grant IRI-9220645, and ARPA grant F49620-94-1-0900. The implementations to be described were constructed with the help of a lot of free software. Thanks, therefore, are also due to the authors and maintainers of Apache1 , SubArctic [Hudson and Smith, 1996], Berkeley DB [Seltzer and Yigit, 1991], mSQL [Hughes, 1993], CHACO [Hendrickson and Leland, 1994] and above all the Python programming language [van Rossum, 1995]. A number of user tests were performed as part of this work. I am grateful to all of the people who took the time to participate, either on the Web or in person here at Stanford. 1Currently documented at http://www.apache.org viii
Istrat ACknowledgements Introductio 1.1 The recommen dation tas esearch issues 1.2.1 How to exploit overlaps between userS'interests 1.2.2 How to decide upon the composition of recommendation sets 1.3 Summary of contributions 1.4 Overview of the remainder of this thesis 10 2 BcG lound ad definitions 2. 1 Representing text 2.1.1 Vector space model 2. 1. 2 Special con siderations when gath ering Web pages 2.13 Alternative features 2.2 Repr eventing users 2.2.1 Prefer ence rankin gs: Definition and notation 21 2.2 Information need and relevance
Contents Abstract v Acknowledgements vii 1 Introduction 1 1.1 The recommendation task ........................ 3 1.2 Research issues .............................. 6 1.2.1 How to exploit overlaps between users' interests . . . . . . . . 6 1.2.2 How to decide upon the composition of recommendation sets . 7 1.3 Summary of contributions ........................ 8 1.4 Overview of the remainder of this thesis ................ 10 2 Background and Denitions 13 2.1 Representing text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.1.1 Vector space model . . . . . . . . . . . . . . . . . . . . . . . . 14 2.1.2 Special considerations when gathering Web pages ....... 17 2.1.3 Alternative features . . . . . . . . . . . . . . . . . . . . . . . . 19 2.2 Representing users ............................ 21 2.2.1 Preference rankings: Denition and notation . . . . . . . . . . 21 2.2.2 Information need and relevance ................. 23 ix
2.2.3 USERS'jUDGM ENIS OF DOCUM ENIS 2.2.4 USERPeCfIES 2.2.5 SIM UIATING USERS' DOCUM ENT FANKNG BEHAVIOR 3 EVALUATIONS 2.3.1 RELEVANCE BASED EVALUATION 222333 2.3.2 EVAIUATION USING DISIANCE BETWEN FANK 2.3.3 EXPERIM ENIALM EIHODOIOGY 2. 4 INIEFACIIONS 2.4.1 INP UIS: USER FFEDBAC k 2.4.2 OUIP UIS: RECOM M ENDATION SEIS 42 2.5 SUM MARY OF NCIATION 3 Single Agent Content-B ased recommendation 3.1 DESIGN 3.1.1 colLeCtION 3.12 SELECtiON 3.1.3 LEARNING 3. 2 IMPIEM ENIATION 3. 3 SIM ULATION 3. 4 EXPERIMENIS 3.4.1 CS227 CIASS PRCIECIS 5556 3. 4.2 TESIS OF LIRA 3.43 SIM ULATION HESUIIS 3.5 PHOBIEMS WIH THE PURE CONIENF BASED APPROACH 4 Collaborative recommendation 4.1 DESI 74 4. 2 A BRIEF SUFVEY
2.2.3 Users' judgments of documents ................. 25 2.2.4 User proles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.2.5 Simulating users' document ranking behavior ......... 29 2.3 Evaluations ................................ 32 2.3.1 Relevance-based evaluation . . . . . . . . . . . . . . . . . . . . 32 2.3.2 Evaluation using distance between rankings .......... 33 2.3.3 Experimental methodology . . . . . . . . . . . . . . . . . . . . 35 2.4 Interactions ................................ 38 2.4.1 Inputs: user feedback. . . . . . . . . . . . . . . . . . . . . . . 39 2.4.2 Outputs: recommendation sets ................. 42 2.5 Summary of notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3 Single Agent Content-Based Recommendation 45 3.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.1.1 Collection ............................. 48 3.1.2 Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.1.3 Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3.3 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.4.1 CS227 Class Pro jects . . . . . . . . . . . . . . . . . . . . . . . 58 3.4.2 Tests of LIRA . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3.4.3 Simulation results . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.5 Problems with the pure content-based approach ............ 71 4 Collaborative Recommendation 73 4.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 4.2 A brief survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 x