7.03 Lecture 28-30 11/17/03, 11/19/03, 11/21/03
Lecture 28: Polymorphisms in Human DNA Sequences
• SNPs
• SSRs
The methods of genetic analysis that you have been learning are applicable to mammals, even to humans. However, we need to combine these genetic principles with an understanding of the physical realities of the human genome. To genetics we will add genomics.
Eukaryotic Genes and Genomes
genome = DNA content of a complete haploid set of chromosomes = DNA content of a gamete (sperm or egg)
Species          Chromosomes  cM    DNA content/   year sequence                genes/
                 (haploid)          haploid (Mb)   completed                    haploid
E. coli          1            N/A   5              1997                         4,200
S. cerevisiae    16           4000  12             1997                         5,800
C. elegans       6            300   100            1998                         19,000
D. melanogaster  4            280   180            2000                         14,000
M. musculus      20           1700  3000           2002 draft; 2005 finished?   30,000?
H. sapiens       23           3300  3000           2001 draft; 2003 finished    30,000?
Note: cM = centiMorgan = 1% recombination
Mb = megabase = 1 million base-pairs of DNA
Kb = kilobase = 1 thousand base-pairs of DNA
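One way to connect the genetic (cM) and physical (Mb) columns of the table is to divide one by the other. The sketch below does only that arithmetic on the table's own numbers. The function names are illustrative, and the result is a genome-wide average, on the assumption that recombination is spread evenly along the DNA; the slide itself makes no such claim.

```python
# (species, genetic map length in cM, haploid DNA content in Mb),
# copied from the table. E. coli is left out because its cM entry is N/A.
GENOMES = [
    ("S. cerevisiae", 4000, 12),
    ("C. elegans", 300, 100),
    ("D. melanogaster", 280, 180),
    ("M. musculus", 1700, 3000),
    ("H. sapiens", 3300, 3000),
]


def cm_per_mb(cm, mb):
    """Average genetic distance (cM) per megabase of DNA."""
    return cm / mb


def kb_per_cm(cm, mb):
    """Average physical distance (kb) spanned by 1 cM (1 Mb = 1000 kb)."""
    return mb * 1000 / cm


for name, cm, mb in GENOMES:
    print(f"{name:16s} {cm_per_mb(cm, mb):8.2f} cM/Mb {kb_per_cm(cm, mb):8.0f} kb/cM")
```

On these figures, 1 cM in the human genome corresponds on average to roughly a megabase of DNA, whereas in yeast it corresponds to only a few kilobases.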
Let's add some columns to a table we constructed several lectures back:
Species          cM    DNA content/   generation  design    true breeding
                       haploid (Mb)   time        crosses?  strains?
E. coli          N/A   5              30 min      yes       yes
S. cerevisiae    4000  12             90 min      yes       yes
C. elegans       300   100            4 d         yes       yes
D. melanogaster  280   180            2 wk        yes       yes
M. musculus      1700  3000           3 mo        yes       yes
H. sapiens       3300  3000           20 yr       no        no
You might add a column indicating the number of offspring per adult.
What are the implications of this table for human genetic studies? Obviously they're difficult.
More specifically:
• Human genetics is retrospective (vs. prospective). Human geneticists cannot test hypotheses prospectively. The mouse provides a prospective surrogate.
• Can't do selections
• Meager amounts of data. Human geneticists typically rely upon statistical arguments as opposed to overwhelming amounts of data in drawing connections between genotype and phenotype.
• Highly dependent on DNA-based maps and DNA-based analysis
The unique advantages of human genetics:
• A large population which is self-screening to a considerable degree
• Phenotypic subtlety is not lost on the observer
• The self interest of our species
Let's consider the types and frequency of polymorphisms at the DNA level in the human genome.
DNA polymorphisms are of many types, including substitutions, duplications, deletions, etc. Two types of DNA polymorphisms are of particular importance in human genetics today:
1) SNPs = single nucleotide polymorphisms = single nucleotide substitutions
A locus is said to be polymorphic if two or more alleles are each present at a frequency of at least 1% in a population of animals.
In human populations: Hnuc = average heterozygosity per nucleotide site = 0.001
This means that, on average, at a randomly selected locus, two randomly selected human alleles (chromosomes) differ at about 1 nucleotide per 1000. This implies that your maternal genome (the haploid genome that you inherited from your mother) differs from your paternal genome at about 1 nucleotide per 1000.
Similarities and differences: This also implies that the genomes of any two individuals are 99.9% identical. Conversely, the genomes of two randomly selected individuals will differ at several million nucleotides. (Identical twins are a notable exception.)
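The "99.9% identical" and "several million nucleotides" figures both follow from Hnuc = 0.001 and the 3000 Mb haploid genome size given in the genome table earlier in the lecture. The sketch below spells out that arithmetic, along with the 1% polymorphism criterion. The function names and the example allele frequencies are illustrative and are not taken from the lecture.

```python
H_NUC = 0.001                      # average heterozygosity per nucleotide site (from the slide)
HAPLOID_GENOME_BP = 3000 * 10**6   # 3000 Mb haploid genome (from the genome table)


def expected_differences(h_nuc, genome_bp):
    """Expected number of sites at which two haploid genomes differ."""
    return round(h_nuc * genome_bp)


def percent_identity(h_nuc):
    """Percentage of sites at which two haploid genomes are identical."""
    return 100 * (1 - h_nuc)


def is_polymorphic(allele_freqs, threshold=0.01):
    """True if two or more alleles are each present at >= threshold (1%)."""
    return sum(1 for f in allele_freqs if f >= threshold) >= 2


print(expected_differences(H_NUC, HAPLOID_GENOME_BP))  # differing sites per haploid genome pair
print(percent_identity(H_NUC))                          # percent identical sites
print(is_polymorphic([0.90, 0.10]))     # hypothetical locus, minor allele at 10%
print(is_polymorphic([0.995, 0.005]))   # hypothetical locus, minor allele at 0.5%
```

With these inputs the expected number of differing sites comes to 3 million per pair of haploid genomes, which matches the slide's "several million".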
The great majority (probably 99%) of SNPs are selectively "neutral" changes of little or no functional consequence:
• outside coding or gene regulatory regions (>97% of human genome)
• silent substitutions in coding sequences
• some amino acid substitutions do not affect protein stability or function
A small minority of SNPs are of functional consequence and are selectively advantageous or disadvantageous.
• disadvantageous SNPs selected against --> further underrepresentation
Affymetrix chip to identify SNPs
[Image removed due to copyright considerations.]
6000 datapoints, tabular and visual views of the data. Note that only 1500 are showing in the image on the left, and a few hundred at most on the right.
Following slides show… how we visualize data
Genetic Traits
[Figure: two diagrams. Simplex or monogenic: disease gene, phenotype. Complex or multifactorial: several susceptibility genes, a modifying gene, environment, life style, etc., phenotype.]
PEDIGREE: DOMINANT TRAIT WITH SUPPRESSOR SEGREGATING
It looks like we've been lucky. Allele A at SSR37 appears to segregate with HD. But can you be confident that the HD gene is in close proximity to the SSR37 locus, or even that it is on chromosome 4?
AKR HAS A GENE THAT SUPPRESSES TUMORS
C57black (TUMORS) AAbb  X  AKR (NON-TUMORS) aaBB
AaBb: all normal
13/16 normal (A-B-, aaB-, aabb) : 3/16 tumors
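The 13/16 : 3/16 ratio can be checked by enumerating an AaBb x AaBb cross. The sketch below assumes that the two loci assort independently, that A is a dominant tumor-causing allele, and that B is a dominant suppressor, so that only A-bb animals develop tumors. That phenotype rule is an interpretation consistent with the slide's genotype classes; the slide does not state it explicitly, and the function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def f2_phenotype_fractions():
    """Enumerate AaBb x AaBb offspring; return (normal, tumors) as fractions."""
    # Each parent makes four equally likely gametes: AB, Ab, aB, ab.
    gametes = [a + b for a, b in product("Aa", "Bb")]
    total = 0
    tumors = 0
    for g1, g2 in product(gametes, repeat=2):
        has_A = "A" in (g1[0], g2[0])   # at least one tumor-causing allele
        has_B = "B" in (g1[1], g2[1])   # at least one suppressor allele
        total += 1
        # ASSUMPTION: tumors arise only with A present and no suppressor (A-bb).
        if has_A and not has_B:
            tumors += 1
    return Fraction(total - tumors, total), Fraction(tumors, total)


normal, tumor = f2_phenotype_fractions()
print(f"normal: {normal}, tumors: {tumor}")
```

Under the same rule, the AAbb parent has tumors, the aaBB parent does not, and the AaBb offspring are all normal, in line with the cross shown on the slide.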