Peking University: Information Retrieval course teaching resources (PPT lecture slides), Retrieval Models

Outline
• Vector Space Model (VSM)
• Latent Semantic Model (LSI)
• Language Model (LM)
CCF-ADL at Zhengzhou University, June 25-27, 2010

Simple flow of retrieval process
[Figure: flow diagram with boxes Information Need, Text Objects, Representation (twice), Query, Indexed Objects, Comparison, Retrieved Objects, Evaluation/Feedback]

[Screenshot: Google search results page for the query "latent semantic indexing", annotated with the labels "Relevance Feedback" and "Query Expansion"]

Vector Space Model

Documents as vectors
• Each document j can be viewed as a vector: each term is one dimension, and its value is the log-scaled tf.idf weight
• So we have a vector space
  – terms are axes
  – docs live in this space
  – the space is high-dimensional: even with stemming, it may have 20,000+ dimensions

Term                           D1     D2     D3     D4     D5     D6    …
China (中国)                   4.1    0.0    3.7    5.9    3.1    0.0
Culture (文化)                 4.5    4.5    0      0      11.6   0
Japan (日本)                   0      3.5    2.9    0      2.1    3.9
International students (留学生) 0      3.1    5.1    12.8   0      0
Education (教育)               2.9    0      0      2.2    0      0
Beijing (北京)                 7.1    0      0      0      4.4    3.8
…
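To make the "log-scaled tf.idf" weighting concrete, here is a minimal sketch of how such document vectors might be built. The slide does not give the exact formula, so the variant used here, (1 + log10 tf) × log10(N / df), is an assumption, and the function name `tfidf_vectors` and the toy documents are made up for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one weight vector per tokenized document.

    Assumed weighting: (1 + log10(tf)) * log10(N / df).
    The slides only say "log-scaled tf.idf"; this variant is a guess.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)  # one axis (dimension) per term
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # missing terms count as 0
        vec = [
            (1 + math.log10(tf[t])) * math.log10(n_docs / df[t]) if tf[t] else 0.0
            for t in vocab
        ]
        vectors.append(vec)
    return vocab, vectors

# Hypothetical toy collection (not taken from the slides).
docs = [
    ["beijing", "culture", "china", "beijing"],
    ["japan", "culture", "students"],
    ["china", "students", "japan", "students"],
]
vocab, vectors = tfidf_vectors(docs)
print(vocab)
for name, vec in zip(["D1", "D2", "D3"], vectors):
    print(name, [round(w, 2) for w in vec])
```

Each printed row is one document's coordinates in the term space, so a real collection with tens of thousands of distinct terms would give vectors of that many dimensions, most of them zero.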

Intuition
Postulate: documents that are "close together" in the vector space talk about the same things.
[Figure: document vectors d1 to d5 in the space spanned by term axes t1, t2, t3, with angles θ and φ between vectors]
Use cases: query-by-example; free-text query as a vector

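Cosine similarity
• The "closeness" of vectors d1 and d2 can be measured by the size of the angle between them
• Specifically, the cosine of the angle θ can be used as the vector similarity
• Vectors are normalized by length (normalization):

$$|\vec{d_j}| = \sqrt{\sum_{i=1}^{M} w_{i,j}^{2}} = 1$$

$$sim(d_j, d_k) = \frac{\vec{d_j} \cdot \vec{d_k}}{|\vec{d_j}|\,|\vec{d_k}|} = \frac{\sum_{i=1}^{M} w_{i,j}\, w_{i,k}}{\sqrt{\sum_{i=1}^{M} w_{i,j}^{2}}\ \sqrt{\sum_{i=1}^{M} w_{i,k}^{2}}}$$

[Figure: vectors d1 and d2 in the space of term axes t1, t2, t3, separated by the angle θ]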
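The cosine measure can be tried directly on the six document vectors from the "Documents as vectors" table. The sketch below is illustrative only: the query vector (weight 1 on 中国/China and 北京/Beijing, 0 elsewhere) and the convention of returning 0 for an all-zero vector are assumptions, not something the slides specify.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for all-zero vectors
    return dot / (norm_u * norm_v)

# Columns of the slide's table. Term order:
# 中国 (China), 文化 (culture), 日本 (Japan),
# 留学生 (international students), 教育 (education), 北京 (Beijing)
docs = {
    "D1": [4.1, 4.5, 0.0, 0.0, 2.9, 7.1],
    "D2": [0.0, 4.5, 3.5, 3.1, 0.0, 0.0],
    "D3": [3.7, 0.0, 2.9, 5.1, 0.0, 0.0],
    "D4": [5.9, 0.0, 0.0, 12.8, 2.2, 0.0],
    "D5": [3.1, 11.6, 2.1, 0.0, 0.0, 4.4],
    "D6": [0.0, 0.0, 3.9, 0.0, 0.0, 3.8],
}

# Hypothetical free-text query "中国 北京" treated as a vector.
query = [1, 0, 0, 0, 0, 1]

ranking = sorted(
    ((name, cosine(query, vec)) for name, vec in docs.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the cosine divides by both vector lengths, a long document with large raw weights does not automatically outrank a short one; only the direction of the vector matters.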

Latent Semantic Model

Vector Space Model: Pros
• Automatic selection of index terms
• Partial matching of queries and documents (dealing with the case where no document contains all search terms)
• Ranking according to similarity score (dealing with large result sets)
• Term weighting schemes (improves retrieval performance)
• Various extensions
  – Document clustering
  – Relevance feedback (modifying query vector)
• Geometric foundation
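As an illustration of the "relevance feedback (modifying query vector)" extension, the sketch below moves a query vector toward documents judged relevant and away from those judged non-relevant. The slides do not name a method; this example assumes Rocchio's formula, and the parameter values (alpha, beta, gamma), the clipping of negative weights to zero, and the toy vectors are all illustrative assumptions.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (Rocchio-style update, assumed here).

    new_q = alpha * q + beta * centroid(relevant) - gamma * centroid(nonrelevant)
    Negative weights are clipped to 0 (a common choice, assumed here).
    """
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel_c, non_c)
    ]

# Hypothetical three-term example.
query = [1.0, 0.0, 0.0]
relevant = [[2.0, 4.0, 0.0]]     # documents the user marked relevant
nonrelevant = [[0.0, 0.0, 4.0]]  # documents the user marked non-relevant

new_query = rocchio(query, relevant, nonrelevant)
print(new_query)
```

The updated query gains weight on the second term, which it did not contain originally, so documents that share vocabulary with the relevant examples can now match even if they lack the original query term.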

I guess this page is about a blackberry...?
[Figure: the word "blackberry" appears several times]
