3.4. Feedback phase

In this phase the recommender system recalculates and updates the recommendations of the accessed resources. This feedback activity is developed in the following steps:

(1) The system recommends a resource R to the user U, and then it asks the user for his/her opinion or evaluation judgement about it.
(2) The user communicates the linguistic evaluation judgement rcy ∈ S2.
(3) This evaluation is registered in the system for future recommendations. The system recalculates the linguistic recommendations of R by aggregating the opinions provided by other users together with the rcy provided by U. This can be done using the 2-tuple aggregation operator xe given in Definition 3.

4. Experiment and evaluation

In this section we present the evaluation of the proposed system. The main focus in evaluating the system is to determine whether it fulfills the proposed objectives, that is, whether the recommended information is useful and interesting for the users. At the moment, we have implemented a trial version, in which the system works only with a few researchers. In a later version we will include the system in a UDL. To evaluate this trial version we have designed experiments in which the system is used to recommend research resources that best satisfy the preferences of 10 users.

4.1. Evaluation metrics

For the evaluation of recommender systems, precision, recall and F1 are measures widely used to evaluate the quality of the recommendations (Cao & Li, 2007; Cleverdon & Keen, 1966; Sarwar, Karypis, Konstan, & Riedl, 2000). To calculate these metrics we need a contingency table to categorize the items with respect to the information needs. The items are classified as either relevant or irrelevant, and as selected (recommended to the user) or not selected. The contingency table (Table 5.1) is created using these four categories.
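The four categories just described can be counted directly from set operations. The following is a minimal sketch of building the contingency counts of Table 5.1 for one user; the item sets here are illustrative assumptions, not data from the paper.

```python
# Build the four contingency counts of Table 5.1 for one user, given
# which items are relevant (per the library staff) and which were
# selected (recommended by the system). Example sets are illustrative.

def contingency(relevant, selected, all_items):
    n_rs = len(relevant & selected)              # relevant and selected
    n_rn = len(relevant - selected)              # relevant, not selected
    n_is = len(selected - relevant)              # irrelevant but selected
    n_in = len(all_items - relevant - selected)  # irrelevant, not selected
    return n_rs, n_rn, n_is, n_in

items = set(range(20))                 # e.g. the 20 test resources
relevant = {0, 1, 2, 3, 4, 5, 6, 7}    # hypothetical staff judgements
selected = {0, 1, 2, 3, 4, 8, 9}       # hypothetical system recommendations

print(contingency(relevant, selected, items))  # -> (5, 3, 2, 10)
```

With these example sets the counts reproduce User1's row of Table 5.2 (Nrs = 5, Nrn = 3, Nis = 2, hence Nr = 8 and Ns = 7).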
Precision is defined as the ratio of the selected relevant items to the selected items, that is, it measures the probability of a selected item being relevant:

P = Nrs / Ns

Recall is calculated as the ratio of the selected relevant items to the relevant items, that is, it represents the probability of a relevant item being selected:

R = Nrs / Nr

F1 is a combination metric that gives equal weight to both precision and recall (Cao & Li, 2007; Sarwar et al., 2000):

F1 = (2 × R × P) / (R + P)

4.2. Experimental results

The purpose of the experiments is to test the performance of the proposed recommender system, so we compared the recommendations made by the system with the recommendations made by the library staff.

We considered a data set with 50 research resources of different areas, collected by the library staff from different information sources. These resources were included in the system following the indications described above. We limited these experiments to 10 users; all of them completed the registration process and evaluated 15 resources. The resources and the provided evaluations constituted our training data set. After this, we took into account another 20 resources that constituted the test data set. The system filtered these 20 resources and recommended them to the suitable users. Then, we compared the recommendations provided by the system with the recommendations provided by the library staff; the resulting contingency table for all users is shown in Table 5.2.

From this contingency table, the corresponding precision, recall and F1 values are shown in Table 5.3. The averages of the precision, recall and F1 metrics are 63.52%, 67.94% and 65.05%, respectively. Fig. 5 shows a graph with the precision, recall and F1 values for each user. These values reveal a good performance of the proposed system and therefore a high level of user satisfaction.

Table 5.2
Experimental contingency table.
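The three metric definitions above can be sketched directly in code; the function names are illustrative, and the check uses User1's counts from Table 5.2 (Nrs = 5, Ns = 7, Nr = 8).

```python
# Precision, recall and F1 from the contingency counts of Table 5.1.

def precision(n_rs, n_s):
    """Fraction of selected items that are relevant: P = Nrs / Ns."""
    return n_rs / n_s

def recall(n_rs, n_r):
    """Fraction of relevant items that were selected: R = Nrs / Nr."""
    return n_rs / n_r

def f1(p, r):
    """Harmonic mean of precision and recall: F1 = 2RP / (R + P)."""
    return 2 * r * p / (r + p)

# User1 from Table 5.2: Nrs = 5, Ns = 7, Nr = 8
p = precision(5, 7)
r = recall(5, 8)
print(round(100 * p, 2), round(100 * r, 2), round(100 * f1(p, r), 2))
# -> 71.43 62.5 66.67, matching User1's row of Table 5.3
```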
            User1  User2  User3  User4  User5  User6  User7  User8  User9  User10
Nrs           5      4      3      1      5      5      6      4      4      4
Nrn           3      3      1      1      1      2      3      2      1      2
Nis           2      2      2      2      3      3      2      1      2      3
Nr            8      7      4      2      6      7      9      6      5      6
Ns            7      6      5      3      8      8      8      5      6      7

Table 5.3
Detailed experimental result.

            Precision (%)  Recall (%)  F1 (%)
User1           71.43        62.50     66.67
User2           66.67        57.14     61.54
User3           60.00        75.00     66.67
User4           33.33        50.00     40.00
User5           62.50        83.33     71.43
User6           62.50        71.43     66.67
User7           75.00        66.67     70.59
User8           80.00        66.67     72.73
User9           66.67        80.00     72.73
User10          57.14        66.67     61.54
Average         63.52        67.94     65.05

Table 5.1
Contingency table.

            Selected  Not selected  Total
Relevant      Nrs         Nrn        Nr
Irrelevant    Nis         Nin        Ni
Total         Ns          Nn         N

12526 C. Porcel et al. / Expert Systems with Applications 36 (2009) 12520–12528
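As a sanity check, the reported averages can be recomputed from the per-user counts in Table 5.2 alone; the digit columns below are transcribed from that table.

```python
# Recompute Table 5.3 from the per-user counts of Table 5.2 and check
# that the reported averages follow from the contingency data.

Nrs = [5, 4, 3, 1, 5, 5, 6, 4, 4, 4]   # relevant and selected, per user
Nr  = [8, 7, 4, 2, 6, 7, 9, 6, 5, 6]   # relevant, per user
Ns  = [7, 6, 5, 3, 8, 8, 8, 5, 6, 7]   # selected, per user

P  = [100 * rs / s for rs, s in zip(Nrs, Ns)]          # precision (%)
R  = [100 * rs / r for rs, r in zip(Nrs, Nr)]          # recall (%)
F1 = [2 * r * p / (r + p) for p, r in zip(P, R)]       # F1 (%)

avg = lambda xs: sum(xs) / len(xs)
print(round(avg(P), 2), round(avg(R), 2), round(avg(F1), 2))
# -> 63.52 67.94 65.05, matching the averages reported in Table 5.3
```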