Index Compression
Collection Statistics: Recall Reuters RCV1

▪ symbol   statistic                                       value
▪ N        documents                                       800,000
▪ L        avg. # tokens per doc                           200
▪ M        terms (= word types)                            ~400,000
▪          avg. # bytes per token (incl. spaces/punct.)    6
▪          avg. # bytes per token (without spaces/punct.)  4.5
▪          avg. # bytes per term                           7.5
▪          non-positional postings                         100,000,000
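The statistics above imply a few derived sizes that are useful when reasoning about compression. The following is a minimal sketch of that arithmetic, not part of the original slide: the function name `derived_sizes` and the dictionary keys are hypothetical, and the figures are rough estimates obtained by multiplying the slide's averages.

```python
# Values copied from the Reuters RCV1 statistics on the slide.
N_DOCS = 800_000                  # N: number of documents
AVG_TOKENS_PER_DOC = 200          # L: average tokens per document
N_TERMS = 400_000                 # M: number of terms (approximate)
BYTES_PER_TOKEN_WITH_SPACES = 6   # incl. spaces/punctuation
BYTES_PER_TOKEN_NO_SPACES = 4.5   # without spaces/punctuation
BYTES_PER_TERM = 7.5              # average term length in bytes
N_POSTINGS = 100_000_000          # non-positional postings


def derived_sizes():
    """Return rough derived quantities (hypothetical helper, estimates only)."""
    total_tokens = N_DOCS * AVG_TOKENS_PER_DOC
    return {
        # Total number of tokens in the collection: N * L.
        "total_tokens": total_tokens,
        # Approximate raw text size, counting spaces and punctuation.
        "text_bytes_with_spaces": total_tokens * BYTES_PER_TOKEN_WITH_SPACES,
        # Approximate text size if spaces and punctuation are dropped.
        "text_bytes_no_spaces": total_tokens * BYTES_PER_TOKEN_NO_SPACES,
        # Space needed just for the term strings of the dictionary: M * 7.5.
        "term_string_bytes": N_TERMS * BYTES_PER_TERM,
        # Average length of a non-positional postings list.
        "postings_per_term": N_POSTINGS / N_TERMS,
    }


if __name__ == "__main__":
    for name, value in derived_sizes().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the collection holds about 160 million tokens (roughly 960 MB of text with spaces and punctuation), the term strings alone take about 3 MB, and an average term appears in about 250 documents.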
©2008-present cucdc.com Higher Education Information Network. All rights reserved.