791/7.36/BE490 Lecture #6 Mar.11,2004 Predicting rna Secondary structure Chris burge
7.91 / 7.36 / BE.490 Lecture #6 Mar. 11, 2004 Predicting RNA Secondary Structure Chris Burge
Review of markov models DNA Evolution CpG Island HMM The viterbi algorithm Real World HMMs Markov models for dna evolution Ch. 4 of Mount
Review of Markov Models & DNA Evolution Ch. 4 of Mount • CpG Island HMM • The Viterbi Algorithm • Real World HMMs • Markov Models for DNA Evolution
DNA Sequence evolution Generation n-1(grandparent) 5/TGGCATGCACCCTGTAAGTCAATATAAATGGCTAdGCCTAGCCCATGCGA 3 3 ACCGTACGTGGGACATTCAGTTATATTTACCGATGCGGATCGGGTACGCT 5 Generation n(parent) 5 TGGCATGCACCCTGTAAGTCAATATAAATGGCTATGCCTAGCCOATGCGA 3 3/ ACCGTACGTGGGACATTCAGTTATATTTACCGATACGGATCGGGTACGCT 5/ Generation n+1(child) 5 TGGCATGCACCCTGTAAGTCAATATAAATGGCTATGCCTAGCCCGTGCGA 3 3 ACCGTACGTGGGACATTCAGTTATATTTACCGATACGGATCGGGCACGCT 5/
DNA Sequence Evolution Generation n-1 (grandparent) 5’ TGGCATGCACCCTGTAAGTCAATATAAATGGCTACGCCTAGCCCATGCGA 3’ |||||||||||||||||||||||||||||||||||||||||||||||||| 3’ ACCGTACGTGGGACATTCAGTTATATTTACCGATGCGGATCGGGTACGCT 5’ 5’ TGGCATGCACCCTGTAAGTCAATATAAATGGCTA TGCCTAGCCCATGCGA 3’ |||||||||||||||||||||||||||||||||||||||||||||||||| 3’ ACCGTACGTGGGACATTCAGTTATATTTACCGAT ACGGATCGGGTACGCT 5’ Generation n (parent) Generation n+1 (child) 5’ TGGCATGCACCCTGTAAGTCAATATAAATGGCTA TGCCTAGCCC GTGCGA 3’ |||||||||||||||||||||||||||||||||||||||||||||||||| 3’ ACCGTACGTGGGACATTCAGTTATATTTACCGAT ACGGATCGGG CACGCT 5’
What is a Markov model (aka Markov Chain)? Classical Definition a discrete stochastic process X1, X2, X3, which has the Markov property PMXn1JX=X, X2=X2,.X,x,)=PXn+ X=X) (for all xi, all j, all n In words A random process which has the property that the future (next state) is conditionally independent of the past given the present(current state) Markov-a russian mathematician ca. 1922
What is a Markov Model (aka Markov Chain)? Classical Definition A discrete stochastic process X1, X2, X3, … which has the Markov property: P(Xn+1 = j | X1=x1, X2=x2, … Xn=xn) = P(Xn+1 = j | Xn=x ) n (for all x , all j, all n) i In words: A random process which has the property that the future (next state) is conditionally independent of the past given the present (current state) Markov - a Russian mathematician, ca. 1922
DNA Sequence evolution is a markov process No selection case PAA PAC Pag P Sn base at generation n P CA CT PGA PGC PGG Pgt P=P(Sm+1=j1S2=) Pta PIc PIg p d=(9a,c, 9, aT)=vector of prob's of bases at gen. n ntk Handy relations gp q
DNA Sequence Evolution is a Markov Process No selection case ⎛ PAA PAC PAG PAT ⎞ PCC PCG PCT ⎟ Sn = base at generation n P = ⎜ ⎜ PCA ⎟ ⎜ PGA PGC PGG PGT ⎟ ⎟ Pij = P (S = j |Sn = i ) ⎝⎜ PTA PTC PTG PTT ⎠ n +1 G q n = ( q A , qC ,q , q T G ) = vector of prob’s of bases at gen. n Handy relations: G q n + 1 G q P n = G q n +k = G q n Pk
Limit Theorem for markov chains n=base at generationn Pi=P(Sn+1=jiN=i) Pij >0 for all i,j(and ∑P=1fora0 then there is a unique vector r such that r=rp and ling pn=r for any prob. vector q n→>00 r is called the"stationary"or"limiting" distribution of P See Ch 4, Taylor Karlin, An Introduction to Stochastic Modeling, 1984 for details
Limit Theorem for Markov Chains Sn = base at generation n Pij = P ( Sn +1 = j |Sn =i ) If Pij >0 for all i,j (and ∑ Pij =1 for all i) j G then there is a unique vector P n G r P G r r such that G q G r G = and lim = (for any prob. vector q ) n → ∞ G r is called the “stationary” or “limiting” distribution of P See Ch. 4, Taylor & Karlin, An Introduction to Stochastic Modeling, 1984 for details
Stationary Distribution Examples 2-letter alphabet: R=purine, Y=pyrimidine Stationary distributions for: pp 0<p<1 0<p<1,0<q q
Stationary Distribution Examples 2-letter alphabet: R = purine, Y = pyrimidine Stationary distributions for: ⎛ 1 0⎞ ⎛ 0 1 ⎞ I = ⎜ ⎟ Q = ⎜ ⎟ ⎝ 0 1⎠ ⎝ 1 0⎠ ⎛1 − p p ⎞ P = ⎝⎜ p 1 − p⎠⎟ 0 < p < 1 ⎛1 − p p ⎞ P′ = 0 < p < 1, 0 < q < 1 ⎝⎜ q 1 − q⎠⎟
How are mutation rates measured?
How are mutation rates measured?
How does entropy change when a Markov transition matrix is applied? If limiting distribution is uniform, then entropy increases (analogous to 2nd Law of Thermodynamics However, this is not true in general (why not
How does entropy change when a Markov transition matrix is applied? If limiting distribution is uniform, then entropy increases (analogous to 2nd Law of Thermodynamics) However, this is not true in general (why not?)
How rapidly is the stationary distribution approached?
How rapidly is the stationary distribution approached?