Index Compression: Collection Statistics

Vocabulary vs. collection size

▪ How big is the term vocabulary?
▪ That is, how many distinct words are there?
▪ Can we assume an upper bound?
▪ Not really: there are at least 70^20 ≈ 10^37 different words of length 20
▪ In practice, the vocabulary will keep growing with the collection size
▪ Especially with Unicode ☺
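The order of magnitude on the slide can be checked directly, and the claim that the vocabulary keeps growing can be observed on any token stream. The following is a minimal sketch, not part of the original slides: the function names `count_strings` and `vocabulary_growth` are hypothetical, and the alphabet size of 70 is simply the figure the slide uses.

```python
import math

ALPHABET_SIZE = 70  # figure taken from the slide
WORD_LENGTH = 20    # word length considered on the slide


def count_strings(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length


def vocabulary_growth(tokens):
    """Return the vocabulary size observed after each token (illustrative helper)."""
    seen = set()
    sizes = []
    for token in tokens:
        seen.add(token)
        sizes.append(len(seen))
    return sizes


n_words = count_strings(ALPHABET_SIZE, WORD_LENGTH)
print(f"70^20 = {n_words:.2e}")            # roughly 8 x 10^36
print(f"log10 = {math.log10(n_words):.1f}")  # about 36.9, i.e. on the order of 10^37

# Toy token stream: the vocabulary size never shrinks as more tokens arrive.
print(vocabulary_growth(["to", "be", "or", "not", "to", "be"]))
```

Running `vocabulary_growth` over the tokens of a real collection, in order, gives the curve of distinct terms against tokens seen, which is one way to see that no fixed upper bound is reached in practice.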