Natural Language Processing (8)
Zhao Hai 赵海
Department of Computer Science and Engineering
Shanghai Jiao Tong University
2010-2011
zhaohai@cs.sjtu.edu.cn
Overview
• Models
  – HMM: Hidden Markov Model
  – Maximum Entropy Markov Model
  – CRFs: Conditional Random Fields
• Tasks
  – Chinese word segmentation
  – part-of-speech tagging
  – named entity recognition
What is an HMM?
• Graphical model
• Circles indicate states
• Arrows indicate probabilistic dependencies between states
What is an HMM?
• Green circles are hidden states
• Each hidden state depends only on the previous state
• “The past is independent of the future given the present” (the Markov property)
What is an HMM?
• Purple nodes are observed states
• Each observation depends only on its corresponding hidden state
HMM Formalism
• {S, K, Π, A, B}
• S : {s1 … sN} are the values for the hidden states
• K : {k1 … kM} are the values for the observations
[diagram: trellis of hidden states S with their corresponding observations K]
HMM Formalism
• {S, K, Π, A, B}
• Π = {πi} are the initial state probabilities
• A = {aij} are the state transition probabilities
• B = {bik} are the observation (emission) probabilities
[diagram: trellis with transition arcs A between hidden states S and emission arcs B to observations K]
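The {S, K, Π, A, B} formalism above can be written down directly as data. A minimal sketch in plain Python, where the state names, symbols, and all probability values are illustrative and not from the lecture:

```python
# A toy HMM with N = 2 hidden states and M = 3 observation symbols.
# All names and numbers below are illustrative, not from the lecture.
states = ["s1", "s2"]            # S : values for the hidden states
symbols = ["k1", "k2", "k3"]     # K : values for the observations

pi = [0.6, 0.4]                  # Π = {πi}: initial state probabilities
A = [[0.7, 0.3],                 # A = {aij}: state transition probabilities
     [0.4, 0.6]]
B = [[0.1, 0.4, 0.5],            # B = {bik}: observation probabilities
     [0.6, 0.3, 0.1]]

# Sanity checks: Π and every row of A and B must be probability distributions.
assert abs(sum(pi) - 1.0) < 1e-9
assert all(abs(sum(row) - 1.0) < 1e-9 for row in A)
assert all(abs(sum(row) - 1.0) < 1e-9 for row in B)
```

Row i of A gives the distribution over next states when leaving state i; row i of B gives the distribution over emitted symbols while in state i.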
Inference in an HMM
• Probability Estimation: compute the probability of a given observation sequence
• Decoding: given an observation sequence, compute the most likely hidden state sequence
• Parameter Estimation: given an observation sequence, find the model that most closely fits the observation
Probability Estimation
• Given an observation sequence and a model, compute the probability of the observation sequence
• Compute P(O | μ), where O = (o1 … oT) and μ = (A, B, Π)
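One direct (if expensive) way to compute P(O | μ) is to sum P(O, X | μ) over every possible hidden state sequence X. A brute-force sketch using the toy parameters from the formalism slides (values are illustrative, not from the lecture):

```python
import itertools

# Toy model parameters (illustrative values, not from the lecture).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]

def prob_observations(O, pi, A, B):
    """P(O | mu): sum P(O, X | mu) over every hidden state sequence X."""
    N, T = len(pi), len(O)
    total = 0.0
    for X in itertools.product(range(N), repeat=T):
        p = pi[X[0]] * B[X[0]][O[0]]                # initial state, first emission
        for t in range(1, T):
            p *= A[X[t - 1]][X[t]] * B[X[t]][O[t]]  # transition, then emission
        total += p
    return total

print(prob_observations([0, 2, 1], pi, A, B))  # O given as symbol indices
```

This enumeration costs O(N^T · T) and is only feasible for tiny T; the forward algorithm computes the same quantity in O(N²T) by dynamic programming.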
Probability Estimation
• For a known hidden state sequence X = (x1 … xT):
  P(O | X, μ) = b_{x1 o1} · b_{x2 o2} · … · b_{xT oT}
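Given the hidden state sequence, the formula above is just a product of one emission probability per time step. A minimal sketch, reusing the illustrative toy emission matrix (not from the lecture):

```python
# Toy emission matrix B = {bik} (illustrative values, not from the lecture).
B = [[0.1, 0.4, 0.5],
     [0.6, 0.3, 0.1]]

def prob_obs_given_states(O, X, B):
    """P(O | X, mu) = b_{x1 o1} * b_{x2 o2} * ... * b_{xT oT}."""
    p = 1.0
    for x, o in zip(X, O):
        p *= B[x][o]   # one emission probability per time step
    return p

# States (0, 1, 1) emitting symbols (0, 2, 1):
# B[0][0] * B[1][2] * B[1][1] = 0.1 * 0.1 * 0.3 = 0.003
print(prob_obs_given_states([0, 2, 1], [0, 1, 1], B))
```

Note that this conditions on X being known; summing this product (weighted by P(X | μ)) over all state sequences recovers P(O | μ) from the previous slide.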