C. DeLong, P. Desikan, and J. Srivastava

• A set of concepts C = {C1, C2, C3, …}, where each document Di can be labeled by a subset of concepts from C
• A set of Web pages, S, generated by the content management system, which have a one-to-one mapping with each document D
• Web server usage logs

The views of an expert are captured explicitly by building a graph of Web pages connected by concepts that are derived from a Web graph generated by a set of experts using a content management system.
While our earlier work [25] concentrated on deriving expert knowledge from the content management system itself, in our current work we present a more generally applicable case of dealing with a Web graph, thus removing the dependence on the content management system. The expert views are captured using this graph, and the expert opinion of the relative importance of any given Web page is captured using an ExpertRank. The navigational importance is captured from the Web graph by the StructureRank. The queries of the learner are captured from the UsageRank. By defining these three different kinds of rank we are able to capture what the learner is intending to learn and what the expert thinks the learner needs to know.

The most popular kind of information infrastructure for representing documents and their interconnections is a link graph. Such an information infrastructure can be modeled as a graph G(V, E), where V is a set of vertices that represent units of information and E is a set of edges that represent the interactions between them. For the scope of this paper, a vertex represents a single Web page and an edge represents a hyperlink between pages or a relation between two pages. As can be seen in Figure 2, two different graphs are constructed, each corresponding to a different type of edge relationship. In order to generate relevance ranks of nodes on these graphs, Google's PageRank [15] is used as a foundation due to its stability and success in the Web search domain. However, it should be noted that the best way to capture an expert's advice on the whole set of documents automatically, without an expert involved, is an open issue of research.

ExpertRank: A concept graph is constructed from the Web graph. In this graph, a vertex is a single Web page and its edges correspond to its conceptual links with other Web pages. In the prototype recommender system, the concepts of a Web page are derived from the anchor text of the Web pages pointing to it.
The concepts are represented by individual anchor text words and the information-retaining phrases that can be grown out of them. If two Web pages share a concept but are not already explicitly linked, then an implicit link is introduced between them. Two pages are determined to share a concept if the intersection of the sets of concepts they represent is not empty. The set of concepts of a page is itself determined by the anchor text of the pages pointing to it. On this constructed graph, PageRank is applied to obtain the importance ranking of these documents. Thus the rank of a document d is defined as:

$$ER(d) = \frac{\alpha}{N_D} + (1 - \alpha) \cdot \sum_{d',\, d \in G} \frac{ER(d')}{OutDeg(d')} \qquad (1)$$

where d is a given Web page, d' ranges over the set of all Web pages that point to d, either by an explicit or an implicit link, OutDeg(d') is the number of outgoing links of d', N_D is the number of documents in the Web graph, and α is the dampening factor.
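
As a rough illustration of the ExpertRank construction described above, the following Python sketch adds implicit links between pages whose concept sets intersect and then iterates Equation (1) to a fixed point. It is not the authors' implementation. The function names (`build_concept_graph`, `expert_rank`), the toy pages and concepts, the choice to add implicit links in both directions, the uniform handling of pages without outgoing links, and the value α = 0.15 are all assumptions made for the example.

```python
from itertools import combinations


def build_concept_graph(explicit_links, concepts):
    """Return a dict mapping each page to the set of pages it links to.

    explicit_links: iterable of (source, target) hyperlinks.
    concepts: dict page -> set of concept strings (hypothetical input,
              standing in for words taken from incoming anchor text).
    """
    pages = set(concepts)
    for src, dst in explicit_links:
        pages.update((src, dst))
    out_links = {page: set() for page in pages}
    for src, dst in explicit_links:
        out_links[src].add(dst)
    # Assumption: an implicit link is added in both directions between two
    # pages that share a concept and are not already explicitly linked.
    for a, b in combinations(sorted(pages), 2):
        already_linked = b in out_links[a] or a in out_links[b]
        shared = concepts.get(a, set()) & concepts.get(b, set())
        if shared and not already_linked:
            out_links[a].add(b)
            out_links[b].add(a)
    return out_links


def expert_rank(out_links, alpha=0.15, iterations=100):
    """Iterate ER(d) = alpha/N_D + (1 - alpha) * sum(ER(d') / OutDeg(d'))."""
    pages = sorted(out_links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: alpha / n for page in pages}
        for src in pages:
            targets = out_links[src]
            if targets:
                share = (1 - alpha) * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank uniformly over all pages.
                for page in pages:
                    new_rank[page] += (1 - alpha) * rank[src] / n
        rank = new_rank
    return rank


# Toy data: four pages, two explicit hyperlinks, made-up concepts.
concepts = {
    "A": {"ranking"},
    "B": {"ranking", "graph"},
    "C": {"graph"},
    "D": {"usage"},
}
explicit_links = [("A", "B"), ("D", "A")]

graph = build_concept_graph(explicit_links, concepts)
ranks = expert_rank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph, pages B and C share the concept "graph" and have no hyperlink between them, so they receive an implicit link. Pages A and B also share a concept, but they are already explicitly linked, so nothing is added for that pair.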