Jie Tang and Jing Zhang

Citation context:
    "We are considering the extraction-based text summarization. … As for the models, we can adopt many existing probabilistic retrieval models such as the classic probabilistic retrieval models and the Kullback-Leibler (KL) divergence retrieval model."

Paper collection:
    Document language models, ..

Citation recommendation results

Discovered topics:
    Topic 1: text summarization
    Topic 2: information retrieval

Suggested references:
    [1] Lafferty, J. and Zhai, C. Document language models, query models, and risk minimization for information retrieval. In SIGIR'01, 111-119.
    [2] Luhn, H. P. 1958. The automatic creation of literature abstracts. IBM Journal of Research and Development 2, 2, 159-165.
    [3] McKeown, K. and Radev, D. R. 1995. Generating summaries of multiple news articles. In SIGIR'95, 74-82.
    [4] Robertson, S. E. 1977. The probability ranking principle in IR. Journal of Documentation 33, 4, 294-304.
    [5] Zhai, C. and Lafferty, J. A study of smoothing methods for language models applied to ad hoc information retrieval. In SIGIR'01, 334-342.

Matching references with citation sentences:
    "We are considering the extraction-based text summarization [2] [3]. … As for the models, we can adopt many existing probabilistic retrieval models such as the classic probabilistic retrieval models [4] and the Kullback-Leibler (KL) divergence retrieval model [1] [5]."

Fig. 1: Example of citation recommendation.

In this paper, we formalize citation recommendation as that of topic discovery, topic-based recommendation, and matching citation sentences with the recommended papers. We propose a unified and discriminative approach to citation recommendation.
This approach can automatically discover topical aspects of each paper and recommend papers based on the discovered topic distribution. Experimental results show that the proposed approach significantly outperforms the baseline methods.

2 Problem Formulation

We define the notation used throughout this paper. Assume that a paper d contains a vector w_d of N_d words, in which each word w_di is chosen from a vocabulary of size V, and a list l_d of L_d references. A collection of D papers can then be represented as D = {(w_1, l_1), ..., (w_D, l_D)}. We only consider references that appear in the paper collection D, so the size L of the vocabulary of references is D. Further, we assume that each paper is associated with a distribution over T topics, and so is each citation.

Definition 1. (Citation Context and Citation Sentence) A citation context is defined by the context words occurring in, for instance, a user-written proposal. For example, the words "... We use Cosine computation [x] to evaluate the similarity ..." would be a citation context. One reference paper is expected to be cited at the position "[x]". We use c to denote a citation context. Each sentence in the citation context is called a citation sentence. The position "[x]" at which the reference paper is to be cited is called a citation position.

Figure 1 shows an example of citation recommendation. The left part of Figure 1 includes a citation context provided by the user and a paper collection. The right part shows the result that we expect a citation recommendation algorithm to output. For instance, two topics, "text summarization" and "information retrieval", have been extracted from the citation context. For the first topic, "text summarization", two papers have been recommended, and for the second topic, "information retrieval", three papers have been recommended. Further, the recommended papers are matched with the citation sentences, and the corresponding citation positions have been identified.
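
The notation of Section 2 and Definition 1 can be pictured as plain data structures. The Python sketch below is illustrative only and is not the proposed approach: the names Paper, CitationContext, and restrict_to_collection are hypothetical, references are assumed to be stored as integer indices into the collection, and citation sentences are obtained with a naive split on sentence-final punctuation.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Paper:
    """One paper d: its word vector w_d and its reference list l_d."""
    words: list[str]                                      # w_d, of length N_d
    references: list[int] = field(default_factory=list)  # l_d, of length L_d


@dataclass
class CitationContext:
    """A citation context c: user-written text with "[x]" placeholders."""
    text: str

    def citation_sentences(self) -> list[str]:
        # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", self.text.strip())
        return [p for p in parts if p]

    def citation_positions(self) -> list[tuple[int, int]]:
        # Each position is (sentence index, character offset of "[x]").
        return [
            (i, match.start())
            for i, sentence in enumerate(self.citation_sentences())
            for match in re.finditer(r"\[x\]", sentence)
        ]


def restrict_to_collection(papers: list[Paper]) -> list[Paper]:
    """Drop references that point outside the collection, so that L = D."""
    size = len(papers)
    return [
        Paper(p.words, [r for r in p.references if 0 <= r < size])
        for p in papers
    ]


# Usage: the example context from Definition 1, followed by a second sentence.
example = CitationContext(
    "We use Cosine computation [x] to evaluate the similarity. "
    "The results are promising."
)
print(example.citation_sentences())
print(example.citation_positions())
```

Storing references as indices into the collection mirrors the restriction that only references appearing in the collection are considered, which is why the reference vocabulary has the same size as the collection.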