Index Compression
Collection Statistics

Lossless vs. lossy compression
▪ Lossless compression: All information is preserved.
▪ What we mostly do in IR.
▪ Lossy compression: Discard some information.
▪ Several of the preprocessing steps can be viewed as lossy compression: case folding, stop words, stemming, number elimination.
▪ Chap 7: Prune postings entries that are unlikely to turn up in the top k list for any query.
▪ Almost no loss of quality for top k list.
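
One way to see why the preprocessing steps count as lossy compression is to measure how much each one shrinks the vocabulary. The sketch below is illustrative only: the `vocabulary` helper, the toy `STOP_WORDS` list, and the sample sentence are assumptions made for this example and do not come from the slides. Stemming is left out because it would need a stemmer implementation.

```python
# Hypothetical toy stop list, chosen only for this illustration.
STOP_WORDS = {"the", "on", "in", "and"}


def vocabulary(tokens, case_fold=False, drop_numbers=False, drop_stop_words=False):
    """Return the set of distinct terms left after the selected lossy steps."""
    terms = set()
    for tok in tokens:
        if case_fold:
            tok = tok.lower()          # "Cat" and "cat" become one term
        if drop_numbers and tok.isdigit():
            continue                   # number elimination
        if drop_stop_words and tok in STOP_WORDS:
            continue                   # stop word removal
        terms.add(tok)
    return terms


tokens = "The cat sat on the mat in 2008 and the Cat ate 3 fish".split()

raw = vocabulary(tokens)
folded = vocabulary(tokens, case_fold=True)
pruned = vocabulary(tokens, case_fold=True, drop_numbers=True, drop_stop_words=True)

print(len(raw), len(folded), len(pruned))
```

Each step discards information that cannot be recovered from the index afterwards: once "Cat" has been folded into "cat", a query can no longer tell the two apart. That is what makes these steps lossy rather than lossless.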