Index Compression: Collection Statistics

Zipf's law

▪ Heaps' law gives the vocabulary size in collections.
▪ We also study the relative frequencies of terms.
▪ In natural language, there are a few very frequent terms and very many very rare terms.
▪ Zipf's law: The i-th most frequent term has frequency proportional to 1/i.
▪ cf_i ∝ 1/i, i.e. cf_i = K/i, where K is a normalizing constant.
▪ cf_i is the collection frequency: the number of occurrences of the term t_i in the collection.
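The relation cf_i = K/i can be checked on a small example. The sketch below is illustrative only: the toy token list, the function names `rank_frequency` and `zipf_prediction`, and the choice K = cf_1 (the frequency of the most frequent term) are assumptions made for this example and are not part of the slide.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return collection frequencies sorted from most to least frequent.

    Position i-1 in the result holds cf_i, the frequency of the
    i-th most frequent term.
    """
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)


def zipf_prediction(k, num_ranks):
    """Return the Zipf prediction cf_i = K / i for ranks i = 1..num_ranks."""
    return [k / i for i in range(1, num_ranks + 1)]


# Toy collection (assumption: far too small to show real Zipfian behaviour).
tokens = "the cat and the dog and the bird saw the cat".split()

observed = rank_frequency(tokens)
# Assumption: take K to be the frequency of the top-ranked term.
predicted = zipf_prediction(observed[0], len(observed))

for rank, (obs, pred) in enumerate(zip(observed, predicted), start=1):
    print(f"rank {rank}: observed cf = {obs}, predicted K/i = {pred:.2f}")
```

Taking logarithms of cf_i = K/i gives log cf_i = log K - log i, so on a log-log plot of frequency against rank the law corresponds to a straight line with slope -1. On a real collection, K would more reasonably be fitted to the data than read off the top-ranked term.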