Index Compression: Heaps' Law

Web Search and Mining (网络搜索和挖掘关键技术) course teaching resources (PPT lecture notes), Lecture 06: Index Compression

Index Compression: Collection Statistics

Heaps' Law (Fig 5.1, p81)

For RCV1, the dashed line $\log_{10} M = 0.49 \log_{10} T + 1.64$ is the best least-squares fit. Thus $M = 10^{1.64} T^{0.49}$, so $k = 10^{1.64} \approx 44$ and $b = 0.49$.

Good empirical fit for Reuters RCV1: for the first 1,000,020 tokens, the law predicts 38,323 terms; the actual count is 38,365 terms.
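The prediction quoted on the slide can be checked with a few lines of Python. This is only a sketch: the function name `heaps_vocabulary_size` is made up for illustration, and it assumes the rounded constants k = 44 and b = 0.49 rather than the unrounded fit.

```python
def heaps_vocabulary_size(num_tokens: int, k: float = 44.0, b: float = 0.49) -> float:
    """Estimate the number of distinct terms M = k * T**b (Heaps' law).

    k and b default to the rounded RCV1 values from the slide (assumption:
    the rounded k = 44 is used, not 10**1.64).
    """
    return k * num_tokens ** b


num_tokens = 1_000_020   # token count quoted on the slide
actual_terms = 38_365    # observed number of terms quoted on the slide

predicted = heaps_vocabulary_size(num_tokens)
relative_error = abs(actual_terms - predicted) / actual_terms

print(f"predicted terms: {predicted:.0f}")
print(f"relative error:  {relative_error:.2%}")
```

With the rounded k = 44 the estimate lands on the 38,323 terms given on the slide, within about 0.1% of the observed count. Using the unrounded 10^1.64 (about 43.65) instead would give a somewhat lower estimate, so the slide's figure appears to rely on the rounded constant.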
