have been utilized to evaluate similarities. To achieve this, queries have to be folded in, which is done in PLSA by fixing the P(w|z) parameters and calculating the weights P(z|q) by TEM. One advantage of using statistical models vs. SVD techniques is that it allows us to systematically combine different models. While this should optimally be done according to a Bayesian model combination scheme, we have utilized a much simpler approach in our experiments which has nevertheless shown excellent performance and robustness. Namely, we have simply combined the cosine scores of all models with a uniform weight. The resulting method is referred to as PLSI*. Empirically we have found the performance to be very robust w.r.t. different (non-uniform) weights and also w.r.t. the λ-weight used in combination with the original cosine score. This is due to the noise-reducing benefits of (model) averaging. Notice that LSA representations for different K form a nested sequence, which is not true for the statistical models, which are expected to capture a larger variety of reasonable decompositions.

[Figure 6: Precision-recall curves for term matching (cos), LSI, and PLSI* on the 4 test collections. Panels: MED, CRAN, CACM, CISI; horizontal axis: recall [%]; vertical axis: precision [%].]
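The folding-in step and the uniform score combination described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: the function names `fold_in`, `combine_scores`, and `rank` are hypothetical, the tempered E-step is written in one common form (the posterior raised to the power β), and the mixing with the original cosine score is assumed to be a linear interpolation with weight `lam`, which the text does not spell out.

```python
from typing import List, Sequence


def fold_in(query_counts: Sequence[float],
            p_w_given_z: Sequence[Sequence[float]],
            beta: float = 1.0,
            n_iter: int = 50) -> List[float]:
    """Estimate P(z|q) for a query while keeping P(w|z) fixed.

    query_counts[w]   -- count n(q, w) of word w in the query
    p_w_given_z[z][w] -- trained P(w|z); never updated here
    beta              -- inverse temperature (beta = 1 gives plain EM)
    """
    n_topics = len(p_w_given_z)
    p_z = [1.0 / n_topics] * n_topics  # uniform initialisation (assumption)
    for _ in range(n_iter):
        new_p_z = [0.0] * n_topics
        for w, count in enumerate(query_counts):
            if count == 0:
                continue
            # Tempered E-step (assumed form): P(z|q,w) ~ [P(z|q) P(w|z)]^beta
            post = [(p_z[z] * p_w_given_z[z][w]) ** beta
                    for z in range(n_topics)]
            norm = sum(post)
            if norm == 0.0:
                continue  # word has zero probability under every aspect
            for z in range(n_topics):
                new_p_z[z] += count * post[z] / norm
        total = sum(new_p_z)
        if total == 0.0:
            break  # nothing to re-estimate from; keep the current weights
        # M-step restricted to the query weights P(z|q)
        p_z = [v / total for v in new_p_z]
    return p_z


def combine_scores(baseline: Sequence[float],
                   model_scores: Sequence[Sequence[float]],
                   lam: float = 0.5) -> List[float]:
    """Average per-model cosine scores uniformly, then mix with the baseline.

    baseline[d]        -- term-matching cosine score of document d
    model_scores[m][d] -- latent-space cosine score of document d, model m
    lam                -- assumed interpolation weight on the baseline score
    """
    n_models = len(model_scores)
    combined = []
    for d, base in enumerate(baseline):
        # Uniform weight 1/n_models for every model
        latent = sum(scores[d] for scores in model_scores) / n_models
        # Assumed linear interpolation with the original cosine score
        combined.append(lam * base + (1.0 - lam) * latent)
    return combined


def rank(scores: Sequence[float]) -> List[int]:
    """Return document indices sorted by decreasing score."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
```

In this sketch, `model_scores` would hold one score list per trained model (for example, one per value of K), so averaging over models happens before the interpolation with the term-matching score.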
We have utilized the following four medium-sized standard document collections with relevance assessments: (i) MED (1033 document abstracts from the National Library of Medicine), (ii) CRAN (1400 document abstracts on aeronautics from the Cranfield Institute of Technology), (iii) CACM (3204 abstracts from the CACM Journal), and (iv) CISI (1460 abstracts in library science from the Institute for Scientific Information). The condensed results in terms of average precision (at the 9 recall levels 10%-90%) are summarized in Table 1, while the corresponding precision-recall curves can be found in Figure 6.

[Figure 7: Perplexity and average precision as a function of the inverse temperature β for an aspect model with K = 48 (left) and K = 128 (right).]

Here are some additional details of the experimental setup: PLSA models at K = 32, 48, 64, 80, 128 have been trained by TEM for each data set with 10% held-out data. For PLSI we report the best result obtained by any of these models; for LSI we report the best result obtained for the optimal dimension (exploring 32-512 dimensions at a step size of 8). The combination weight λ with the
