Downloaded from genome.cshlp.org on June 23, 2011 - Published by Cold Spring Harbor Laboratory Press
Genetic code optimality for additional information
[Figure 2, panels A-F: image not reproduced. Panel A works through the 5-mer UGACA in the 0, -1, and +1 reading frames and averages the three frame probabilities. Panels B-D mark the frames in which UGACA (real code), AAAAA (stop codon AAA), and CCGGU (stop codons CCG and CGG) can or cannot appear. Panel E plots 6-mer probabilities. Panel F has n-mer size on the x-axis.]
Figure 2. (A) Calculation of the probability that an n-mer sequence appears within a protein-coding region in the real genetic code. The 5-mer sequence S = UGACA can appear in one of the three reading frames. For each reading frame, the probabilities of all three-codon combinations that contain S are summed up. Codon combinations with an in-frame stop (such as UGA) do not contribute to the n-mer probability since they cannot appear in a coding region. Vertical lines separate consecutive codons, stop codons are in red, and P0, P-1, P+1 denote the probabilities of encountering S in the 0/-1/+1 frame. (B,C,D) Three examples of “difficult” n-mers in the real code and in alternative codes. (B) The 5-mer UGACA, which includes the stop codon UGA, can appear in a protein-coding sequence with the real genetic code in only two of the three possible reading frames (the +1 and -1 frames). (C) In the alternative code shown in Figure 3D, whose stop codon AAA overlaps with itself, the 5-mer AAAAA cannot appear in a protein-coding sequence in any of the three reading frames. (D) In an alternative code with the overlapping stop codons CCG and CGG, the 5-mer CCGGU can only appear in one reading frame.
The 5-mers are in bold text, stop codons are in red, N denotes any DNA letter, green v denotes a frame in which the n-mer can appear, red x denotes a frame in which the n-mer cannot appear. (E) Distribution of the probabilities of all 6-mers in the real code (bold black line) and in the alternative codes (light blue lines). The x-axis is the probability of obtaining 6-mers within protein-coding sequences; the y-axis is the number of 6-mers with this probability. In the real code there are significantly fewer “difficult” 6-mers (with low probabilities) than in the alternative codes. (F) The fraction of n-mers that have a higher probability in the real code than in alternative codes increases with n-mer size. The y-axis shows the fraction of n-mers for which the average probability of appearing in the real genetic code is significantly higher than in the alternative codes.
Genome Research 407 www.genome.org
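The calculation described in panel A, and the frame checks of panels B-D, can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes that consecutive codons are independent and, as a placeholder for a real codon-usage table, that all sense codons are equally likely, so the numbers it produces will differ from those in the figure. The function names and the frame-label mapping (read off the panel layout) are hypothetical.

```python
from itertools import product

BASES = "ACGU"
# Offset of the n-mer's first base inside its codon -> frame label.
# ASSUMPTION: labels follow the layout of panels B-D (NNN|UGA|CAN is frame 0,
# NNU|GAC|ANN is frame -1, NNN|NUG|ACA is frame +1).
FRAME_LABELS = {0: "0", 1: "+1", 2: "-1"}
REAL_STOPS = {"UAA", "UAG", "UGA"}


def uniform_codon_probs(stop_codons):
    """ASSUMPTION: every sense codon is equally likely (placeholder usage table)."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    sense = [c for c in codons if c not in stop_codons]
    return {c: 1.0 / len(sense) for c in sense}


def frame_probability(nmer, offset, stop_codons, codon_probs):
    """Sum the probabilities of all codon combinations that contain `nmer`
    starting `offset` bases into a codon, skipping any with an in-frame stop."""
    left = offset                          # unknown bases before the n-mer
    right = (-(offset + len(nmer))) % 3    # unknown bases needed to close the last codon
    total = 0.0
    for fill in product(BASES, repeat=left + right):
        window = "".join(fill[:left]) + nmer + "".join(fill[left:])
        codons = [window[i:i + 3] for i in range(0, len(window), 3)]
        if any(c in stop_codons for c in codons):
            continue                       # cannot occur inside a coding region
        p = 1.0
        for c in codons:                   # ASSUMPTION: codons are independent
            p *= codon_probs.get(c, 0.0)
        total += p
    return total


def nmer_probability(nmer, stop_codons=REAL_STOPS, codon_probs=None):
    """Return ({frame label: probability}, average over the three frames)."""
    if codon_probs is None:
        codon_probs = uniform_codon_probs(stop_codons)
    per_frame = {
        FRAME_LABELS[off]: frame_probability(nmer, off, stop_codons, codon_probs)
        for off in range(3)
    }
    return per_frame, sum(per_frame.values()) / 3


def allowed_frames(nmer, stop_codons=REAL_STOPS):
    """Frames in which the n-mer can appear at all (nonzero probability)."""
    per_frame, _ = nmer_probability(nmer, stop_codons)
    return sorted(label for label, p in per_frame.items() if p > 0)


if __name__ == "__main__":
    examples = [("UGACA", REAL_STOPS), ("AAAAA", {"AAA"}), ("CCGGU", {"CCG", "CGG"})]
    for nmer, stops in examples:
        print(nmer, allowed_frames(nmer, stops), nmer_probability(nmer, stops)[1])
```

Under these assumptions the three examples reproduce the pattern in panels B-D: UGACA is allowed in the +1 and -1 frames only, AAAAA in none, and CCGGU in the +1 frame only. Passing a measured codon-usage table as `codon_probs` would replace the uniform placeholder.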