
Web Search and Mining: Lecture 12, Language Models (PPT lecture slides)

Resource type: courseware; format: PPT; 43 pages; file size: 660.5 KB

Slide 1: Lecture 12, Language Models (Web Search and Mining)

Slide 2: Recap
▪ Probabilistic models:
  ▪ Binary independence model
  ▪ Bayesian networks for IR
  ▪ Language models

Slide 3: This Lecture
▪ The Language Model Approach to IR
▪ Basic query generation model
▪ Alternative models

Slide 4: Standard Probabilistic IR
[Diagram: an information need is expressed as a query; the system matches the query against documents d1, d2, …, dn in the collection, scoring each by P(R | Q, d).]

Slide 5: IR Based on Language Models (LM)
[Diagram: each document di in the collection induces a language model Mdi; the query is treated as generated from a document model, scored by P(Q | Md).]
▪ A common search heuristic is to use words that you expect to find in matching documents as your query.
▪ The LM approach directly exploits that idea!
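The query-likelihood idea on this slide can be sketched in a few lines. This is an illustrative sketch only: the toy collection, tokenization, and function names are invented, and it uses unsmoothed maximum-likelihood unigram models Md, ranking documents by P(Q | Md).

```python
from collections import Counter

def unigram_model(doc_tokens):
    # Maximum-likelihood unigram model: P(w | Md) = count(w) / |d|
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    return {w: c / total for w, c in counts.items()}

def query_likelihood(query_tokens, model):
    # P(Q | Md) under the unigram assumption; zero if any term is unseen
    p = 1.0
    for w in query_tokens:
        p *= model.get(w, 0.0)
    return p

# Hypothetical two-document collection for illustration
docs = {
    "d1": "the man likes the woman".split(),
    "d2": "the woman said the man likes class".split(),
}
query = "man likes".split()

# Rank documents by how likely each one's model is to generate the query
ranked = sorted(docs,
                key=lambda d: query_likelihood(query, unigram_model(docs[d])),
                reverse=True)
print(ranked)  # d1 is shorter, so its model concentrates more mass on the query terms
```

With no smoothing, a single unseen query term zeroes out a document's score; later lectures in this style of course typically address that with smoothing.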

Slide 6: Formal Language (Model)
▪ Traditional generative model: generates strings
▪ Finite state machines or regular grammars, etc.
▪ Example: a machine generating "I wish", "I wish I wish", "I wish I wish I wish", … but not *"wish I wish"
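The "I wish" generator above is a regular language, so membership can be checked with a regular expression. A minimal sketch, assuming the intended language is one or more repetitions of "I wish":

```python
import re

# The language (I wish)+ : one or more repetitions of "I wish"
pattern = re.compile(r"(I wish )*I wish$")

print(bool(pattern.match("I wish")))                 # True: in the language
print(bool(pattern.match("I wish I wish I wish")))   # True: in the language
print(bool(pattern.match("wish I wish")))            # False: cannot start with "wish"
```

This is the non-stochastic view: a string is either in the language or not, with no probabilities attached.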

Slide 7: Stochastic Language Models
▪ Models probability of generating strings in the language (commonly, all strings over alphabet ∑)
Model M: P(the) = 0.2, P(a) = 0.1, P(man) = 0.01, P(woman) = 0.01, P(said) = 0.03, P(likes) = 0.02, …
For s = "the man likes the woman", multiply the term probabilities:
P(s | M) = 0.2 × 0.01 × 0.02 × 0.2 × 0.01 = 0.00000008
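The slide's arithmetic, sketched directly (the probability table is copied from the slide; the dictionary name is arbitrary):

```python
# Unigram probabilities from the slide's model M
M = {"the": 0.2, "a": 0.1, "man": 0.01, "woman": 0.01, "said": 0.03, "likes": 0.02}

s = "the man likes the woman".split()
p = 1.0
for w in s:
    p *= M[w]  # unigram independence: multiply the term probabilities
print(p)  # ≈ 8e-08, i.e. 0.00000008
```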

Slide 8: Stochastic Language Models
▪ Model probability of generating any string
Model M1: P(the) = 0.2, P(class) = 0.01, P(sayst) = 0.0001, P(pleaseth) = 0.0001, P(yon) = 0.0001, P(maiden) = 0.0005, P(woman) = 0.01
Model M2: P(the) = 0.2, P(class) = 0.0001, P(sayst) = 0.03, P(pleaseth) = 0.02, P(yon) = 0.1, P(maiden) = 0.01, P(woman) = 0.0001
For s = "the class pleaseth yon maiden":
  under M1: 0.2 × 0.01 × 0.0001 × 0.0001 × 0.0005
  under M2: 0.2 × 0.0001 × 0.02 × 0.1 × 0.01
P(s | M2) > P(s | M1)
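The same comparison, sketched as code (both probability tables are copied from the slide):

```python
# The slide's two unigram models
M1 = {"the": 0.2, "class": 0.01, "sayst": 0.0001, "pleaseth": 0.0001,
      "yon": 0.0001, "maiden": 0.0005, "woman": 0.01}
M2 = {"the": 0.2, "class": 0.0001, "sayst": 0.03, "pleaseth": 0.02,
      "yon": 0.1, "maiden": 0.01, "woman": 0.0001}

def prob(tokens, model):
    # Unigram string probability: product of per-term probabilities
    p = 1.0
    for w in tokens:
        p *= model[w]
    return p

s = "the class pleaseth yon maiden".split()
print(prob(s, M1))  # ≈ 1e-14
print(prob(s, M2))  # ≈ 4e-10: M2 is far more likely to generate this string
```

This is the core ranking intuition: the "Shakespearean" model M2 assigns the archaic sentence orders of magnitude more probability than M1.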

Slide 9: Stochastic Language Models
▪ A statistical model for generating text
▪ Probability distribution over strings in a given language
P(w1 w2 w3 w4 | M) = P(w1 | M) · P(w2 | M, w1) · P(w3 | M, w1 w2) · P(w4 | M, w1 w2 w3)

Slide 10: Unigram and Higher-Order Models
▪ Chain rule: P(w1 w2 w3 w4) = P(w1) P(w2 | w1) P(w3 | w1 w2) P(w4 | w1 w2 w3)
▪ Unigram Language Models: P(w1 w2 w3 w4) = P(w1) P(w2) P(w3) P(w4). Easy. Effective!
▪ Bigram (generally, n-gram) Language Models: P(w1 w2 w3 w4) = P(w1) P(w2 | w1) P(w3 | w2) P(w4 | w3)
▪ Other Language Models
  ▪ Grammar-based models (PCFGs), etc.
  ▪ Probably not the first thing to try in IR
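The unigram and bigram factorizations above can be contrasted on a toy corpus. A minimal sketch using maximum-likelihood estimates; the corpus and function names are invented for illustration:

```python
from collections import Counter

# Tiny invented training corpus
corpus = "the man likes the woman the man said".split()

# Unigram: P(s) = product of P(w_i)
uni = Counter(corpus)
N = len(corpus)

def p_unigram(tokens):
    p = 1.0
    for w in tokens:
        p *= uni[w] / N
    return p

# Bigram: P(s) = P(w_1) * product of P(w_i | w_{i-1})
bi = Counter(zip(corpus, corpus[1:]))

def p_bigram(tokens):
    p = uni[tokens[0]] / N
    for prev, w in zip(tokens, tokens[1:]):
        p *= bi[(prev, w)] / uni[prev]
    return p

s = "the man likes".split()
print(p_unigram(s))  # (3/8)(2/8)(1/8): each word scored independently
print(p_bigram(s))   # (3/8)(2/3)(1/2) ≈ 0.125: conditioning on the previous word
```

On this corpus the bigram model scores the sentence higher because "the man" and "man likes" are frequent transitions; the trade-off is that bigram counts are much sparser, which is one reason the unigram model remains the usual starting point in IR.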
