5. CONCLUSIONS AND FUTURE WORK

In this paper we have investigated the use of Latent Dirichlet Allocation for collective tag recommendation. Compared to association rules, LDA achieves better accuracy, and in particular recommends more specific tags, which are more useful for search. In general, our LDA-based approach is able to elicit a shared topical structure from the collaborative tagging effort of multiple users, whereas association rules are more focused on simple terminology expansion.
However, both approaches succeed to some degree in overcoming the idiosyncrasies of individual tagging practices.

For future work we are interested to see whether it is beneficial to combine association rules and LDA. As we showed in Section 3.2, the tags recommended by the two algorithms differ significantly from each other. Our hypothesis is that accuracy can be improved by combining the more general tags recommended by association rules with the more specific tags recommended by LDA. Along similar lines, we also plan to investigate combining language models derived from the actual tags annotated to a resource with the latent topic models.

The main contribution of latent topic models is to reduce sparsity of the tag space. This gives rise to several interesting lines of research we will investigate. Mapping resources to their latent topics may result in more robust resource recommendation. Eliciting latent topics from the tagging practices of individual users and combining them with the latent topics for resources is a promising direction for personalized tag recommendation. Finally, we will experiment with using the probability of tags derived from topic models for visualizing tag recommendations in the form of tag clouds.

Regarding data, we also want to experiment with datasets from different domains, to check whether photo, video, or music tagging sites show different system behavior that influences our algorithms.

6. ACKNOWLEDGMENTS

This work was supported in part by the EU project IST 45035 - Platform for searcH of Audiovisual Resources across Online Spaces (PHAROS).
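
The hypothesis stated in the future work discussion, that accuracy may improve when the more general tags from association rules are combined with the more specific tags from LDA, could be prototyped as a linear interpolation of the two recommenders' scores. The Python sketch below is illustrative only: the function name `combine_recommendations`, the `weight` parameter, and the example scores are assumptions introduced here, not the paper's method, which leaves the combination strategy open.

```python
from collections import defaultdict


def combine_recommendations(rule_scores, lda_scores, weight=0.5, top_n=5):
    """Rank tags by a weighted mix of two normalised score dictionaries.

    rule_scores / lda_scores map tag -> non-negative score (hypothetical
    outputs of an association-rule recommender and an LDA recommender).
    weight is the share given to the association-rule scores.
    """
    def normalise(scores):
        # Scale scores so they sum to 1, making the two sources comparable.
        total = sum(scores.values())
        return {t: s / total for t, s in scores.items()} if total > 0 else {}

    combined = defaultdict(float)
    for tag, score in normalise(rule_scores).items():
        combined[tag] += weight * score
    for tag, score in normalise(lda_scores).items():
        combined[tag] += (1.0 - weight) * score

    # Highest combined score first; ties are broken alphabetically.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Made-up example: rules suggest general tags, LDA adds specific ones.
rule_scores = {"photo": 0.6, "travel": 0.4}
lda_scores = {"travel": 0.5, "iceland": 0.3, "glacier": 0.2}
print(combine_recommendations(rule_scores, lda_scores, weight=0.5, top_n=3))
```

Normalising each score dictionary before mixing keeps either recommender from dominating only because its raw scores are on a larger scale. Whether a fixed `weight` is adequate, or whether it should be tuned per resource, is exactly the kind of question the proposed experiments would need to answer.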