Bioinformatics (2nd edition; Longjiang Fan, chief editor, 2021), companion slides, PPT 5
5. Phylogenetic Tree
5.1 Genetic polymorphism and phylogenetic tree
5.2 Construction of phylogenetic tree
History and people
• Darwin
• Fisher, Wright, and Haldane (the "great trinity" of population genetics)
• Malécot, Cockerham, and others
• Kimura, M. Nei, and others
5.1 Genetic polymorphism and phylogenetic tree
• Polymorphism in the genomes
• Introduction about tree
Polymorphism in the genomes
• Types of polymorphism
• Single nucleotide polymorphism (SNP)
• Insertion/deletion (indel)
• Copy-number variation (CNV)
• Frame-shift
• Presence and absence variation (PAV)
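As a rough illustration of how the smaller variant types differ, the sketch below classifies a variant from its reference and alternative alleles. The function name `classify_variant` and its `coding` flag are hypothetical, and the rule is a simplification: it assumes the variant lies in a coding region when deciding whether an indel shifts the reading frame, and it does not attempt to detect CNV or PAV, which involve larger segments.

```python
def classify_variant(ref: str, alt: str, coding: bool = True) -> str:
    """Classify a small variant from its reference and alternative alleles.

    Gap characters ('-') are removed first, so a deletion simply has a
    shorter (possibly empty) alternative allele.
    """
    ref, alt = ref.replace("-", ""), alt.replace("-", "")
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        return "multi-nucleotide substitution"
    # Lengths differ, so this is an insertion or a deletion.
    if not coding:
        return "indel"
    # In a coding region, a length change that is not a multiple of 3
    # shifts the reading frame (assumption: the variant is within one exon).
    if abs(len(ref) - len(alt)) % 3 != 0:
        return "frame-shift indel"
    return "in-frame indel"


if __name__ == "__main__":
    print(classify_variant("C", "T"))      # single-base substitution
    print(classify_variant("GAT", "---"))  # 3-bp deletion
    print(classify_variant("A", "AT"))     # 1-bp insertion
```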
• Measure of polymorphism
• π (Tajima 1983): the average pairwise nucleotide diversity
• θ (Watterson 1975): Watterson's estimator, based on the number of segregating sites per nucleotide site

indica_1     ATG CGG GAT CCA TTC CTT AAT GAG TTT CCT AAA ACG GTG CAC GGT TTT
indica_2     ATG TGG GAT CCA TTC CTT AAT GAG TTT CCC GAA ACG GTG CAC GGT TTT
indica_3     ATG TGG GAT CCA TTC CTT AAT GAG TTC CCT GAA ACG GTG CAC GGT TTT
japonica_1   ATG TGG --- CCA TTC CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT
japonica_2   ATG CGG --- CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT
japonica_3   ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT
rufipogon_1  ATG TGG GAT CCA TTC CTT AAT GAG TTC CCT GAA ACG GTG CAC GGT TTT
rufipogon_2  ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT
rufipogon_3  ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT
O. nivara    ATG TGG GAT CCA TTC CTT AAC GAG TTC CCT GAA ACG GTG CAC GGT TTT
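The two measures can be computed directly from the alignment on this slide. The sketch below is a minimal version under stated assumptions: columns containing a gap are dropped (complete deletion), π is the mean number of pairwise differences divided by the number of retained sites, and θ is the number of segregating sites S divided by a_n = 1 + 1/2 + ... + 1/(n-1) and by the number of retained sites. The function names are illustrative, not taken from any package.

```python
from itertools import combinations

# Alignment from the slide; spaces separate codons, '-' marks a gap.
ALIGNMENT = {
    "indica_1":    "ATG CGG GAT CCA TTC CTT AAT GAG TTT CCT AAA ACG GTG CAC GGT TTT",
    "indica_2":    "ATG TGG GAT CCA TTC CTT AAT GAG TTT CCC GAA ACG GTG CAC GGT TTT",
    "indica_3":    "ATG TGG GAT CCA TTC CTT AAT GAG TTC CCT GAA ACG GTG CAC GGT TTT",
    "japonica_1":  "ATG TGG --- CCA TTC CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT",
    "japonica_2":  "ATG CGG --- CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT",
    "japonica_3":  "ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT",
    "rufipogon_1": "ATG TGG GAT CCA TTC CTT AAT GAG TTC CCT GAA ACG GTG CAC GGT TTT",
    "rufipogon_2": "ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT",
    "rufipogon_3": "ATG TGG GAT CCA TTG CTT AAT GAG TTC CCT GAA ACC GTG CAC GGT TTT",
    "O_nivara":    "ATG TGG GAT CCA TTC CTT AAC GAG TTC CCT GAA ACG GTG CAC GGT TTT",
}


def gap_free_columns(alignment):
    """Return the alignment columns that contain no gap (complete deletion)."""
    seqs = [s.replace(" ", "") for s in alignment.values()]
    return [col for col in zip(*seqs) if "-" not in col]


def segregating_sites(columns):
    """Count columns in which more than one nucleotide is observed."""
    return sum(1 for col in columns if len(set(col)) > 1)


def pi_per_site(columns):
    """Average pairwise nucleotide diversity per site."""
    n = len(columns[0])
    n_pairs = n * (n - 1) // 2
    total_diffs = sum(
        sum(a != b for a, b in combinations(col, 2)) for col in columns
    )
    return total_diffs / n_pairs / len(columns)


def watterson_theta_per_site(columns):
    """Watterson's estimator per site: S / a_n / L."""
    n = len(columns[0])
    a_n = sum(1 / i for i in range(1, n))
    return segregating_sites(columns) / a_n / len(columns)


if __name__ == "__main__":
    cols = gap_free_columns(ALIGNMENT)
    print("sites analysed:", len(cols))
    print("segregating sites:", segregating_sites(cols))
    print("pi per site:", round(pi_per_site(cols), 4))
    print("theta per site:", round(watterson_theta_per_site(cols), 4))
```

With the gap columns removed, this alignment has 45 sites, 7 of which are segregating among the 10 sequences.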
Why do a phylogenetic analysis?
• Important for deciphering relationships in gene function and protein structure and function in different organisms
• Helps to utilize genetic information of a model organism to analyze a second organism
• Helps to sort out gene family relationships
• Valuable tool for tracing the evolutionary history of genes
Evaluating sequence relationships
sequence A  ERKSIQDLFQSFTLFERRLLIEF
sequence B  ERLSISELIGSLRLYERRLIIEY
sequence C  DRKSISDLIGSLRLALLIEF
sequence D  DRKDLISSLRKALLIEW
1. Account for all column variations
   A,B and C,D form similar groups based on col. 1
   A,C,D form a group based on col. 3
2. Count differences between sequences
   A,B: 17/23 similar, 6/23 different
   C,D: 21/23 similar, 2/23 different
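Step 1 can be sketched as grouping the sequences by the residue they carry at a given column. The helper `group_by_column` is a hypothetical name. Sequences C and D appear in this text shorter than A and B, presumably without their alignment gaps, so the sketch only inspects the leading columns, where all four sequences are still in register.

```python
from collections import defaultdict

# Sequences as they appear on the slide (C and D are shorter here).
SEQS = {
    "A": "ERKSIQDLFQSFTLFERRLLIEF",
    "B": "ERLSISELIGSLRLYERRLIIEY",
    "C": "DRKSISDLIGSLRLALLIEF",
    "D": "DRKDLISSLRKALLIEW",
}


def group_by_column(seqs, col):
    """Group sequence names by the residue found at a 1-based column."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq[col - 1]].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_column(SEQS, 1))  # A,B share one residue; C,D another
    print(group_by_column(SEQS, 3))  # A,C,D share one residue; B differs
```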
What is a tree?
• A graphical representation of the sequence similarities among a group of nucleic acid or protein sequences
• For example: the number of differences between 3 sequences may be represented by a distance matrix and a tree

     A    B    C
A    -    9    7
B         -    12

[Figure: tree joining A, B, and C, with branch lengths 2 (A), 7 (B), and 5 (C)]
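For three sequences, branch lengths that reproduce the pairwise differences can be found by solving three equations: a + b = d(A,B), a + c = d(A,C), b + c = d(B,C). The sketch below does this for the distances 9, 7, and 12 from the example; the function name is illustrative, and it assumes the distances are additive on a tree with one internal node.

```python
def three_taxon_branch_lengths(d_ab, d_ac, d_bc):
    """Branch lengths (a, b, c) from the internal node to A, B, and C.

    Solves a + b = d_ab, a + c = d_ac, b + c = d_bc.
    """
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


if __name__ == "__main__":
    a, b, c = three_taxon_branch_lengths(d_ab=9, d_ac=7, d_bc=12)
    print(a, b, c)  # lengths of the branches leading to A, B, and C
    # Check that the tree reproduces the original distances.
    assert (a + b, a + c, b + c) == (9, 7, 12)
```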
Tree of life
[Figure: tree of life showing the three domains Bacteria, Archaea, and Eukarya diverging from the cenancestor. Labeled groups include Proteobacteria (purple bacteria), Gram positives, Chlamydia, Spirochetes, Cyanobacteria, Thermotogales, green non-sulfur bacteria, Korarchaeota, Euryarchaeota, and Crenarchaeota.]
Phylogenetic tree (dendrogram)
• Nodes: branching points
• Branches: lines
• Topology: branching pattern
Fig 2.1 A simple tree and associated terms (labels: terminal node (leaf), internal node (hypothetical ancestor), branch (edge), root)
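These terms map naturally onto a small data structure. The sketch below is one possible representation, with hypothetical taxa A, B, and C: each `Node` holds its children, leaves are nodes without children, branches are the parent-child links, and the topology is written out in Newick notation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str = ""  # leaf label; left empty for unnamed internal nodes
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaves(node):
    """Names of the terminal nodes below (and including) this node."""
    if node.is_leaf():
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def count_internal_nodes(node):
    """Number of internal nodes (hypothetical ancestors), root included."""
    if node.is_leaf():
        return 0
    return 1 + sum(count_internal_nodes(child) for child in node.children)


def count_branches(node):
    """Number of branches (edges): one per parent-child link."""
    return sum(1 + count_branches(child) for child in node.children)


def to_newick(node):
    """Write the branching pattern (topology) as a Newick string."""
    def fmt(n):
        if n.is_leaf():
            return n.name
        return "(" + ",".join(fmt(c) for c in n.children) + ")" + n.name
    return fmt(node) + ";"


# A rooted tree in which A and B share an ancestor that C does not.
root = Node(children=[Node(children=[Node("A"), Node("B")]), Node("C")])

if __name__ == "__main__":
    print(leaves(root))
    print(count_internal_nodes(root), count_branches(root))
    print(to_newick(root))
```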