
The Hong Kong University of Science and Technology: Latent Tree Models Part III: Learning Algorithms


Latent Tree Models Part III: Learning Algorithms

Nevin L. Zhang
Dept. of Computer Science & Engineering
The Hong Kong Univ. of Sci. & Tech.
http://www.cse.ust.hk/~lzhang

AAAI 2014 Tutorial


Learning Latent Tree Models

• To determine:
  1. Number of latent variables
  2. Cardinality of each latent variable
  3. Model structure
  4. Probability distributions
• Model selection: items 1, 2, 3
• Parameter estimation: item 4
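To make the four ingredients concrete, here is a minimal sketch of how a latent tree model could be represented in code; the class and field names are illustrative assumptions, not part of the tutorial.

```python
# A minimal, illustrative representation of a latent tree model (LTM).
# All names here are assumptions for exposition, not from the tutorial.
from dataclasses import dataclass, field

@dataclass
class LatentTreeModel:
    # Determined by model selection (items 1-3):
    latent_cards: dict   # latent variable -> cardinality, e.g. {"Y1": 2, "Y2": 3}
    observed: list       # manifest variables, e.g. ["X1", "X2", "X3"]
    edges: list          # (parent, child) pairs forming a rooted tree
    # Determined by parameter estimation (item 4):
    cpts: dict = field(default_factory=dict)  # variable -> conditional probability table

# Example: latent Y1 is the root with children Y2 and X1; Y2 has children X2, X3.
m = LatentTreeModel(
    latent_cards={"Y1": 2, "Y2": 2},
    observed=["X1", "X2", "X3"],
    edges=[("Y1", "Y2"), ("Y1", "X1"), ("Y2", "X2"), ("Y2", "X3")],
)
```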


Light Bulb Illustration

• Run the interactive program "LightBulbIllustration.jar".
• It illustrates the possibility of inferring latent variables and latent structures from observed co-occurrence patterns.


Part III: Learning Algorithms

• Introduction
• Search-based algorithms
• Algorithms based on variable clustering
• Distance-based algorithms
• Empirical comparisons
• Spectral methods for parameter estimation


Search Algorithms

• A search algorithm explores the space of regular models guided by a scoring function:
  - Start with an initial model.
  - Iterate until the model score ceases to increase:
    - Modify the current model in various ways to generate a list of candidate models.
    - Evaluate the candidate models using the scoring function.
    - Pick the best candidate model.
• What scoring function should be used, and how do we evaluate candidate models?
• This is the model selection problem.
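The loop above can be written down directly. Below is a minimal sketch under assumed interfaces: candidates(m) applies the search operators to m, and score(m, data) is a scoring function such as BIC; neither interface is specified in the slides.

```python
# A sketch of the generic greedy search loop; `candidates` and `score`
# are hypothetical callables standing in for the operator set and the
# scoring function (e.g., BIC).
def hill_climb(initial_model, data, candidates, score):
    current = initial_model
    current_score = score(current, data)
    while True:
        # Modify the current model in various ways to get candidates,
        # evaluate each with the scoring function, and keep the best.
        best, best_score = current, current_score
        for cand in candidates(current):
            s = score(cand, data)
            if s > best_score:
                best, best_score = cand, s
        if best is current:  # score ceased to increase
            return current
        current, current_score = best, best_score
```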


Model Selection Criteria

• Bayesian score: the posterior probability P(m|D)

  P(m|D) = P(m) P(D|m) / P(D)
         = P(m) ∫ P(D|m, θ) P(θ|m) dθ / P(D)

• BIC score: a large-sample approximation of the Bayesian score

  BIC(m|D) = log P(D|m, θ*) − (d/2) log N

  - d: number of free parameters; N: sample size.
  - θ*: the MLE of θ, estimated using the EM algorithm.
• The likelihood term of BIC measures how well the model fits the data; the second term penalizes model complexity.
• The use of the BIC score indicates that we are looking for a model that fits the data well and, at the same time, is not overly complex.
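As a worked example, the BIC scores of two hypothetical models can be compared directly; the log-likelihood and parameter-count values below are made up for illustration and would in practice come from EM-fitted parameters θ*.

```python
import math

def bic(log_lik, d, N):
    """BIC(m|D) = log P(D|m, theta*) - (d/2) * log N."""
    return log_lik - (d / 2.0) * math.log(N)

# Two hypothetical models on the same data set of N = 1000 samples:
# model B fits better (higher log-likelihood) but has more free parameters.
print(bic(log_lik=-4200.0, d=15, N=1000))  # model A: about -4251.8
print(bic(log_lik=-4180.0, d=40, N=1000))  # model B: about -4318.2
# Model A wins: its worse fit is outweighed by its smaller complexity penalty.
```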


Model Selection Criteria

• AIC (Akaike, 1974):

  AIC(m|D) = log P(D|m, θ*) − d

• Holdout likelihood:
  - Split the data into a training set and a validation set.
  - Model parameters are estimated on the training set.
  - Model quality is measured by the likelihood on the validation set.
• Cross-validation: too expensive.
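A minimal sketch of the holdout-likelihood criterion follows, assuming hypothetical fit_em and log_likelihood routines for parameter estimation and likelihood evaluation (neither is specified in the slides).

```python
# Holdout likelihood: fit parameters on a training split, then score the
# model by log-likelihood on a held-out validation split. `fit_em` and
# `log_likelihood` are assumed stand-ins for an EM learner and an LTM
# likelihood routine.
def holdout_score(structure, data, fit_em, log_likelihood, train_frac=0.8):
    split = int(len(data) * train_frac)
    train, valid = data[:split], data[split:]
    theta = fit_em(structure, train)                 # estimate parameters on training set
    return log_likelihood(structure, theta, valid)   # evaluate on validation set
```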


Search Algorithms

• Double hill climbing (DHC) (Zhang 2002, 2004): 7 manifest variables.
• Single hill climbing (SHC) (Zhang and Kocka 2004): 12 manifest variables.
• Heuristic SHC (HSHC) (Zhang and Kocka 2004): 50 manifest variables.
• EAST (Chen et al. 2011): 100+ manifest variables.


Double Hill Climbing (DHC)

• Two search procedures:
  - one for the model structure,
  - one for the cardinalities of the latent variables.
• Very inefficient: tested only on data sets with 7 or fewer variables (Zhang 2004).
• DHC was tested on synthetic and real-world data sets with BIC, AIC, and holdout likelihood, respectively.
  - The best models were found when BIC was used, so subsequent work is based on BIC.


Single Hill Climbing (SHC)

• Determines both the model structure and the cardinalities of the latent variables using a single search procedure.
• Uses five search operators (sketched in code below):
  - Node Introduction (NI)
  - Node Deletion (ND)
  - Node Relocation (NR)
  - State Introduction (SI)
  - State Deletion (SD)
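For concreteness, the five operators could be enumerated as below; the operator names come from the slide, while the code organization and the descriptive comments are assumptions paraphrasing their usual roles in latent tree model search.

```python
# The five SHC search operators; descriptions are paraphrases of their
# usual roles (the enumeration itself is an illustrative assumption).
from enum import Enum

class Operator(Enum):
    NI = "node introduction"   # add a new latent node between a node and some of its neighbors
    ND = "node deletion"       # remove a latent node, reattaching its neighbors
    NR = "node relocation"     # detach a node and reattach it to another latent node
    SI = "state introduction"  # add one state to a latent variable (cardinality + 1)
    SD = "state deletion"      # remove one state from a latent variable (cardinality - 1)

# NI, NR, and SI grow or restructure the model; ND and SD simplify it.
```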

