[Figure 4: Modelling and Filtering Processes. Panels: user modelling phase (SiteIF considers the documents visited by the user in a navigation session and updates the user model); filtering phase (SiteIF compares any site document with the user model, yielding the relevance of the document).]

See Figure 4 for a summary sketch of the filtering process.

4 Evaluation

We wanted to estimate how much the new version of SiteIF (synset based) actually improves performance with respect to the previous version of the system (word based). However, setting up a comparative test among user models that goes beyond generic user satisfaction is not straightforward. To evaluate whether and how the exploitation of the synset representation improves the accuracy of the semantic network modelling and filtering, we arranged an experiment whose goal was to compare the output of the two systems against the judgements of a human advisor.

We proceeded in the following way. First, a test set of about one hundred English news items from the AdnKronos corpus was selected homogeneously with respect to the overall distribution in categories (e.g. culture, motors, etc.). The test set was made available as a Web site, and then 12 ITC-irst researchers were asked to browse the site, simulating a user visiting the news site.
Users were instructed to select a news item according to their personal interests, to read it completely, and then to select another news item, again according to their interests. This process was repeated until ten news items had been picked out.

After this phase, a human advisor, who was acquainted with the test corpus, was asked to analyze the documents chosen by the users and to propose new, potentially interesting documents from the corpus. The advisor was requested to follow the same procedure for each document set: documents were first grouped according to their AdnKronos category, and a new document was sought in the test corpus within that category. If a relevant document was found, it was added to the advisor proposals; otherwise no document was proposed for that category. Optionally, an additional document, outside the categories browsed by the user, could be added by the advisor. On average, the advisor proposed 3 documents for a user document set.

At this point we compared the advisor proposals with the results of the two systems. To simulate the advisor behavior (i.e. for a given category it is allowed that no proposal is selected), all the system documents whose relevance fell short of that of the best document by more than a fixed difference (20%) were eliminated. After this selection, on average, the system proposed 10 documents for a user document set.

Standard figures for precision and recall were calculated considering the matches between the advisor's and the systems' documents. Precision is the ratio of recommended documents that are relevant, while recall is the ratio of relevant documents that are recommended. In terms of our experiment we have $\mathrm{precision} = \frac{|H \cap S|}{|S|}$ and $\mathrm{recall} = \frac{|H \cap S|}{|H|}$, where $H$ is the set of the human advisor proposals and $S$ is the set of the system proposals.

Table 1 shows the result of the evaluation. The first column takes into account the document news, the second only the AdnKronos categories.
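The selection step and the two measures above can be illustrated with a minimal sketch. It is not the SiteIF implementation: the function names, the document identifiers and the scores are hypothetical, and it assumes that the "fixed difference (20%)" is measured relative to the relevance of the best document.

```python
def select_proposals(scores, max_gap=0.20):
    """Keep documents whose relevance is within max_gap of the best score.

    Assumption: the gap is relative to the best score, so the cutoff is
    best * (1 - max_gap). The source only states "a fixed difference (20%)".
    """
    if not scores:
        return set()
    best = max(scores.values())
    cutoff = best * (1.0 - max_gap)
    return {doc for doc, score in scores.items() if score >= cutoff}


def precision_recall(human, system):
    """Return (|H ∩ S| / |S|, |H ∩ S| / |H|) for two sets of document ids."""
    matches = len(human & system)
    precision = matches / len(system) if system else 0.0
    recall = matches / len(human) if human else 0.0
    return precision, recall


# Hypothetical relevance scores produced by a system for one user document set.
scores = {"d1": 5.0, "d2": 4.5, "d3": 4.2, "d4": 2.0}
system = select_proposals(scores)   # d4 falls more than 20% below the best
human = {"d1", "d7"}                # hypothetical advisor proposals
precision, recall = precision_recall(human, system)
```

In this toy case the system keeps three documents, of which one matches the advisor's two proposals, so precision is 1/3 and recall is 1/2. Because the system proposes more documents on average than the advisor (10 against 3), precision is the measure most affected by spurious proposals.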
We can note that precision considerably increases (34%) with the synset-based user model. This confirms the working hypothesis that substituting words with senses both in the modelling and in the filtering phase produces a more accurate output. The main reason, as expected, is that a synset-based retrieval makes it possible to prefer documents with a high degree of se-