Index Compression
Collection Statistics

Index parameters vs. what we index (details: IIR Table 5.1, p. 80)

Size of:
               word types (terms)         non-positional postings    positional postings
               dictionary                 non-positional index       positional index
                Size (K)    ∆%  cumul %    Size (K)    ∆%  cumul %    Size (K)    ∆%  cumul %
Unfiltered           484                     109,971                    197,879
No numbers           474    -2       -2      100,680    -8       -8     179,158    -9       -9
Case folding         392   -17      -19       96,969    -3      -12     179,158     0       -9
30 stopwords         391    -0      -19       83,390   -14      -24     121,858   -31      -38
150 stopwords        391    -0      -19       67,002   -30      -39      94,517   -47      -52
Stemming             322   -17      -33       63,812    -4      -42      94,517     0      -52

Exercise: give intuitions for all the "0" entries. Why do some zero entries correspond to big deltas in other columns?
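To make the ∆% and cumul % columns concrete, the following is a minimal sketch that recomputes them from the Size (K) figures in the table. The names `SIZES`, `REFERENCE`, `pct_change`, and `reductions` are hypothetical and introduced only for this illustration. The reference row used for each ∆% is an assumption inferred from the numbers: both stopword rows appear to be measured against the case folding row, and every other row against the row directly above it. Because the sizes are already rounded to thousands, recomputed values can differ from the printed ones by about a point.

```python
# Sizes in thousands, copied from the table:
# (terms, non-positional postings, positional postings)
SIZES = {
    "unfiltered":    (484, 109_971, 197_879),
    "no numbers":    (474, 100_680, 179_158),
    "case folding":  (392,  96_969, 179_158),
    "30 stopwords":  (391,  83_390, 121_858),
    "150 stopwords": (391,  67_002,  94_517),
    "stemming":      (322,  63_812,  94_517),
}

# Assumed reference row for each delta (inferred from the numbers, not stated
# on the slide): both stopword rows are compared against case folding.
REFERENCE = {
    "no numbers":    "unfiltered",
    "case folding":  "no numbers",
    "30 stopwords":  "case folding",
    "150 stopwords": "case folding",
    "stemming":      "150 stopwords",
}

COLUMNS = ("terms", "non-positional postings", "positional postings")


def pct_change(new, old):
    """Percentage change from old to new (negative means a reduction)."""
    return 100.0 * (new - old) / old


def reductions(column):
    """Map each preprocessing step to (delta %, cumulative %) for one column.

    column is an index into COLUMNS: 0, 1, or 2.
    """
    baseline = SIZES["unfiltered"][column]
    result = {}
    for step, ref in REFERENCE.items():
        size = SIZES[step][column]
        delta = pct_change(size, SIZES[ref][column])   # versus reference row
        cumulative = pct_change(size, baseline)        # versus unfiltered
        result[step] = (delta, cumulative)
    return result


if __name__ == "__main__":
    for index, name in enumerate(COLUMNS):
        print(name)
        for step, (delta, cumulative) in reductions(index).items():
            print(f"  {step:<14} delta {delta:7.2f}%   cumul {cumulative:7.2f}%")
```

Running the script prints one block per column. The rows where the size is identical to its reference row (for example, positional postings after case folding) come out as a delta of exactly zero, which is a convenient starting point for the exercise.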