A Discriminative Approach to Topic-based Citation Recommendation

Table 2: Performance of citation recommendation on the two data sets.

Data      Method  P@1     P@2     P@3     P@5     P@10    Rprec   MAP     Bpref   MRR
NIPS      LM      0.0195  0.0164  0.0132  0.0125  0.0148  0.0161  0.0445  0.0108  0.0132
NIPS      RBM     0.0289  0.0313  0.0263  0.0224  0.0164  0.0245  0.0652  0.0176  0.0162
NIPS      RBM-CS  0.2402  0.2628  0.2349  0.1792  0.1170  0.1676  0.3499  0.1626  0.1082
Citeseer  LM      0.0496  0.0492  0.0454  0.0439  0.0274  0.0259  0.1103  0.0311  0.0243
Citeseer  RBM     0.1684  0.1884  0.1780  0.1519  0.0776  0.1510  0.2804  0.1189  0.0639
Citeseer  RBM-CS  0.3337  0.3791  0.3501  0.2800  0.1768  0.2375  0.4237  0.2501  0.1564

Table 3: Performance of sentence-level citation recommendation on the NIPS data set.

Model   P@1     P@2     P@3     P@5     P@10     Rprec   MAP     Bpref   MRR
LM      0.0783  0.0642  0.0582  0.0629  0.00503  0.0607  0.1178  0.0483  0.0502
RBM     0.1081  0.1061  0.1061  0.1000  0.0727   0.0914  0.2089  0.0761  0.0851
RBM-CS  0.2005  0.2136  0.2010  0.1788  0.1561   0.1782  0.2854  0.1565  0.1657

Table 3 shows the performance of citation recommendation by RBM and RBM-CS in terms of sentence-level evaluation. (Because the Citeseer data contains many OCR errors, which make it difficult to extract citation positions accurately, we conducted the sentence-level evaluation on the NIPS data only.)
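For readers unfamiliar with the column headers in Tables 2 and 3, the following is a minimal sketch of how several of these ranking metrics are conventionally defined. It assumes the standard information-retrieval definitions; this page does not state how the scores were actually computed, so the function names and the toy data (`ranked`, `relevant`) are hypothetical. Bpref is omitted for brevity.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """P@k: fraction of the top-k recommendations that are true citations."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Rprec: precision at rank R, where R is the number of true citations."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank holding a true citation.

    MAP is the mean of this quantity over all test queries.
    """
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first true citation; MRR is its mean over queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical example: a ranked recommendation list for one test paper
# and the set of papers it actually cites.
ranked = ["p3", "p1", "p7", "p2", "p9"]
relevant = {"p1", "p2"}

print(precision_at_k(ranked, relevant, 2))  # true citations in the top 2
print(r_precision(ranked, relevant))
print(average_precision(ranked, relevant))
print(reciprocal_rank(ranked, relevant))
```

Each table entry would then be the mean of the corresponding per-paper (or per-sentence) score over the test set, under the same assumption about the definitions.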
We can again see that our proposed model significantly outperforms both the LM method and the RBM method.

5 Related Work

We review the scientific literature on citation analysis and related topic models.

Citation analysis usually employs a graphical model to represent papers and their relationships, for example the Science Citation Index [3]. This index links authors and their corresponding papers. Bibliographic coupling (BC) [6] and co-citation analysis have been proposed for citation analysis, for example to measure the quality of an academic paper [3].

Recommending citations for scientific papers is a task that has not been studied exhaustively. Strohman et al. [9] investigated this task using a graphical framework: each paper is represented by a node, and each citation relationship by a link between nodes. A new paper is a node without incoming or outgoing links, so citation recommendation is cast as link prediction. McNee et al. [7] employed collaborative filtering in the citation network to recommend citations for papers. Both approaches use the graphical framework. We look at citation recommendation from a different perspective: we take advantage of the dependencies between paper contents and citation relationships by using a hidden topic layer to model them jointly.

Restricted Boltzmann Machines (RBMs) [8] are generative models that use latent (usually binary) variables to model an input distribution, and they have been applied to a wide variety of problems in the past few years. Many extensions of the RBM model have been proposed, for example the dual-wing RBM [12] and models for various types of input distribution [5][11]. In this paper, we propose a two-layer Restricted Boltzmann Ma-