separately to each analysis panel. (Phased haplotypes are available for download at the Project website.) We estimate that ‘switch’ errors, where a segment of the maternal haplotype is incorrectly joined to the paternal, occur extraordinarily rarely in the trio samples (every 8 Mb in CEU; 3.6 Mb in YRI). The switch rate is higher in the CHB+JPT samples (one per 0.34 Mb) due to the lack of information from parent–offspring trios, but even for the unrelated individuals, statistical reconstruction of haplotypes is remarkably accurate.

Estimating properties of SNP discovery and dbSNP. Extensive sequencing and genotyping in the ENCODE regions characterized the false-positive and false-negative rates for dbSNP, as well as polymerase chain reaction (PCR)-based resequencing (see Methods). These data reveal two important conclusions: first, that PCR-based sequencing of diploid samples may be biased against very rare variants (that is, those seen only as a single heterozygote), and second, that the vast majority of common variants are either represented in dbSNP, or show tight correlation to other SNPs that are in dbSNP (Fig. 3).

Allele frequency distributions within population samples. The underlying allele frequency distributions for these samples are best

Figure 2 | Distribution of inter-SNP distances. The distributions are shown for each analysis panel for the HapMappable genome (defined in the Methods), for all common SNPs (with MAF ≥ 0.05).

© 2005 Nature Publishing Group
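The notion of a ‘switch’ error used above can be illustrated with a minimal sketch. It assumes that one true haplotype and one inferred haplotype of the same individual are given as 0/1 alleles at heterozygous sites only, and that the average spacing is simply the phased span divided by the number of switches. The function names and the toy data are hypothetical, and this is not necessarily how the Project computed its estimates.

```python
from typing import Sequence


def count_switch_errors(true_hap: Sequence[int], inferred_hap: Sequence[int]) -> int:
    """Count switch errors between a true and an inferred haplotype.

    Both inputs list the allele (0 or 1) carried by one haplotype at each
    heterozygous site, in genomic order. Because the sites are heterozygous,
    a mismatch means the inferred haplotype is tracking the other parental
    chromosome at that site.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    # True where the inferred haplotype follows the true one at that site.
    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    # A switch error is a change of parental origin between adjacent sites.
    return sum(1 for prev, cur in zip(agrees, agrees[1:]) if prev != cur)


def mean_mb_between_switches(span_bp: int, n_switches: int) -> float:
    """Average distance between switch errors, in megabases (assumed definition)."""
    if n_switches == 0:
        return float("inf")
    return span_bp / n_switches / 1e6


# Toy example (made-up data): the inferred haplotype follows the true one
# for two sites, follows the other chromosome for three, then switches back.
true_hap = [0, 1, 1, 0, 1, 0]
inferred_hap = [0, 1, 0, 1, 0, 0]
print(count_switch_errors(true_hap, inferred_hap))
```

An inferred haplotype that is the exact complement of the true one has no switch errors under this definition, since the labelling of the two chromosomes is arbitrary.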
Table 3 | HapMap Phase I genotyping success measures

                                  Analysis panel
SNP categories                    YRI                CEU                CHB+JPT
Assays submitted                  1,273,716          1,302,849          1,273,703
Passed QC filters                 1,123,296 (88%)    1,157,650 (89%)    1,134,726 (89%)
Did not pass QC filters*          150,420 (12%)      145,199 (11%)      138,977 (11%)
> 20% missing data                98,116 (65%)       107,626 (74%)      93,710 (67%)
> 1 duplicate inconsistent        7,575 (5%)         6,254 (4%)         10,725 (8%)
> 1 mendelian error               22,815 (15%)       13,600 (9%)        0 (0%)
< 0.001 Hardy–Weinberg P-value    12,052 (8%)        9,721 (7%)         16,176 (12%)
Other failures†                   23,478 (16%)       17,692 (12%)       23,722 (17%)
Non-redundant (unique) SNPs       1,076,392          1,104,980          1,087,305
Monomorphic                       156,290 (15%)      234,482 (21%)      268,325 (25%)
Polymorphic                       920,102 (85%)      870,498 (79%)      818,980 (75%)

All analysis panels
Unique QC-passed SNPs                                  1,156,772
Passed in one analysis panel                           52,204 (5%)
Passed in two analysis panels                          97,231 (8%)
Passed in three analysis panels                        1,007,337 (87%)
Monomorphic across three analysis panels               75,997
Polymorphic in all three analysis panels               682,397
MAF ≥ 0.05 in at least one of three analysis panels    877,351

*Out of 95 samples in CEU, YRI; 94 samples in CHB+JPT.
†‘Other failures’ includes SNPs with discrepancies during the data transmission process. Some SNPs failed in more than one way, so these percentages add up to more than 100%.

ARTICLES NATURE|Vol 437|27 October 2005 1302
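The percentages in Table 3 can be recomputed from the counts. The sketch below does this for the QC rows of each analysis panel, and shows why the failure categories sum to more than 100%. The counts are copied from the table; the dictionary layout and function names are illustrative assumptions.

```python
# Counts copied from Table 3; the dictionary layout is an illustrative choice.
TABLE3 = {
    "YRI": {
        "submitted": 1_273_716, "passed": 1_123_296, "failed": 150_420,
        "failure_categories": {
            "> 20% missing data": 98_116,
            "> 1 duplicate inconsistent": 7_575,
            "> 1 mendelian error": 22_815,
            "< 0.001 Hardy-Weinberg P-value": 12_052,
            "Other failures": 23_478,
        },
    },
    "CEU": {
        "submitted": 1_302_849, "passed": 1_157_650, "failed": 145_199,
        "failure_categories": {
            "> 20% missing data": 107_626,
            "> 1 duplicate inconsistent": 6_254,
            "> 1 mendelian error": 13_600,
            "< 0.001 Hardy-Weinberg P-value": 9_721,
            "Other failures": 17_692,
        },
    },
    "CHB+JPT": {
        "submitted": 1_273_703, "passed": 1_134_726, "failed": 138_977,
        "failure_categories": {
            "> 20% missing data": 93_710,
            "> 1 duplicate inconsistent": 10_725,
            "> 1 mendelian error": 0,
            "< 0.001 Hardy-Weinberg P-value": 16_176,
            "Other failures": 23_722,
        },
    },
}


def pass_rate_percent(panel: str) -> float:
    """Share of submitted assays that passed the QC filters, in percent."""
    row = TABLE3[panel]
    return 100 * row["passed"] / row["submitted"]


def failure_category_percent_sum(panel: str) -> float:
    """Sum of the failure-category counts as a percentage of failed assays.

    A value above 100 reflects SNPs that failed in more than one way.
    """
    row = TABLE3[panel]
    return 100 * sum(row["failure_categories"].values()) / row["failed"]


for panel in TABLE3:
    print(panel,
          f"pass rate {pass_rate_percent(panel):.1f}%",
          f"failure categories sum to {failure_category_percent_sum(panel):.1f}%")
```

In each panel the passed and failed counts add up exactly to the number of assays submitted, while the failure categories overlap, which is what the table footnote describes.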