Finding Experts By Semantic Matching of User Profiles

Rajesh Thiagarajan¹, Geetha Manjunath², and Markus Stumptner¹

¹ Advanced Computing Research Centre, University of South Australia
{cisrkt,mst}@cs.unisa.edu.au
² Hewlett-Packard Laboratories, India
geetha.manjunath@hp.com

Abstract. Extracting interest profiles of users based on their personal documents is one of the key topics of IR research.
However, when these extracted profiles are used in expert finding applications, only naive text-matching techniques are used to rank experts for a given requirement. In this paper, we address this gap and describe multiple techniques to match user profiles for better ranking of experts. We propose new metrics for computing semantic similarity of user profiles using spreading activation networks derived from ontologies. Our pilot evaluation shows that matching algorithms based on bipartite graphs over semantic user profiles provide the best results. We show that using these techniques, we can find an expert more accurately than other approaches, in particular within the top ranked results. In applications where a group of candidate users needs to be short-listed (say, for a job interview), we get very good precision and recall as well.

1 Introduction

The problem of finding experts on a given set of topics is important for many lines of business, e.g., consulting, recruitment, and e-business. In these applications, one common way to model a user is with a user profile, which is a set of topics with weights determining the user's level of interest. When used for personalization, these user profiles are matched with a retrieved document (possibly a search result) to check its relevance to the user. A similar matching technique can be used for expert finding as well: we first formulate the requirement (query) as an expected profile of the expert who is sought after. Expert finding is then carried out by matching the query profile with the available/extracted expert user profiles.

In the above context, automatic extraction of topics of expertise (interest) of a person based on the documents authored (accessed) by the person through information extraction techniques is well known.
However, when these extracted profiles are used for expert finding, the profile matching is often carried out by applying traditional content matching techniques, which miss most potential candidates if the query is only an approximate description of the expert (as is usually the case). In this paper, we propose and evaluate multiple approaches for semantic matching of user profiles to enable better expert finding in such cases.

Let us briefly look at the challenges in comparing user profiles. User profiles are generally represented in the bag-of-words (BOW) format: a set of weighted terms that describe the interest or the expertise of a user. The most commonly used content matching technique is cosine similarity, i.e., the cosine of the angle between the BOW vector representing the user profile and that of the document to match. Although this simple matching technique suffices in a number of content matching applications, it is well known that considering just the words leads to problems due to the lack of semantics in the representation. Problems due to polysemy (terms such as apple and jaguar having two different meanings) and synonymy (two words meaning almost the same thing, such as glad and happy) can be solved if profiles are described using semantic concepts instead of words. Once again simple
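
The word-level cosine matching described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names (`cosine_similarity`, `rank_experts`), the expert names, and all terms and weights are hypothetical, chosen only to show how a BOW profile match ranks experts and why synonymy hurts it.

```python
import math


def cosine_similarity(profile_a, profile_b):
    """Cosine between two BOW profiles given as {term: weight} dicts."""
    shared_terms = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_a * norm_b)


def rank_experts(query_profile, expert_profiles):
    """Return (expert, score) pairs sorted by decreasing similarity to the query."""
    scored = [
        (name, cosine_similarity(query_profile, profile))
        for name, profile in expert_profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and expert profiles (illustrative weights only).
query = {"semantic": 0.8, "ontology": 0.6}
experts = {
    "expert_a": {"semantic": 0.8, "ontology": 0.6},   # same words as the query
    "expert_b": {"meaning": 0.8, "taxonomy": 0.6},    # related words, no overlap
    "expert_c": {"semantic": 0.5, "database": 0.5},   # partial word overlap
}

ranking = rank_experts(query, experts)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data, `expert_b` scores zero because the profile shares no literal term with the query, even though its terms are closely related in meaning. This is the synonymy problem that motivates describing profiles with semantic concepts instead of words.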