Affymetrix exon arrays. We isolated RNA using TRIzol reagent following the manufacturer's instructions (Invitrogen) and assessed the RNA quality using RNA 6000 Nano Chips with the Agilent 2100 Bioanalyzer (Agilent). Biotin-labeled targets for the microarray experiment were prepared using 1 µg of total RNA.
Ribosomal RNA was removed with the RiboMinus Human/Mouse Transcriptome Isolation Kit (Invitrogen), and cDNA was synthesized using the GeneChip WT (Whole Transcript) Sense Target Labeling and Control Reagents kit as described by the manufacturer (Affymetrix). The sense cDNA was then fragmented by uracil DNA glycosylase and apurinic/apyrimidinic endonuclease 1 and biotin-labeled with terminal deoxynucleotidyl transferase using the GeneChip WT Terminal Labeling kit (Affymetrix). Hybridization was performed using 5 micrograms of biotinylated target, which was incubated with the GeneChip Human Exon 1.0 ST array (Affymetrix) at 45 °C for 16–20 h. After hybridization, nonspecifically bound material was removed by washing, and specifically bound target was detected using the GeneChip Hybridization, Wash and Stain kit and the GeneChip Fluidics Station 450 (Affymetrix). The arrays were scanned using the GeneChip Scanner 3000 7G (Affymetrix), and raw data were extracted from the scanned images and analyzed with the Affymetrix Power Tools software package (Affymetrix).

Preprocessing and analysis of array hybridization data. The Affymetrix Power Tools software package was used to quantile-normalize the probe fluorescence intensities and to summarize the probe set (representing exon expression) and meta–probe set (representing gene expression) intensities using a probe logarithmic-intensity error model (see URLs below). High false-positive rates are common in microarray studies, and previous studies have suggested that a major factor is probes overlapping SNPs, which changes hybridization intensity (ref. 28) and can influence the apparent association between the SNP genotype and probe intensities. To reduce potential influences of SNPs on false positives, all probes containing known SNPs (dbSNP release 126) were masked out before summarizing probe set and meta–probe set scores.
The presence of unannotated SNPs affecting probe hybridization will remain (see below), but these cannot be detected by any statistical methods except for the impractical solution of resequencing all probes across the panel used in the study. We also filtered probe intensity levels by magnitude of response, removing probes that seemed to be in the background. Probe intensities were extracted for a series of 16,934 antigenomic probes targeted to nonhuman sequences and averaged by their relative G+C content. The threshold for background expression was defined as the average intensity for a given G+C content plus 2 s.d. For any given genomic probe on the array, if the intensity across all samples was below the threshold for the same G+C percentage, then it was considered background and masked from the analysis. In total, 670,809 probes corresponding to core annotated probe sets were masked from the analysis, reducing the number of core probe sets in the analysis to 244,027 probe sets.

Association analysis and multiple test correction. We examined probe set expression levels for association with flanking SNPs. For each of the 244,027 core probe sets and 17,653 meta–probe sets, we tested for association of the expression levels to HapMap phase II (release 21) SNPs with a minor allele frequency of at least 5% within a 50-kb region flanking either side of the gene containing the probe set, using a linear regression model in the R software package. Raw P-values were obtained from the regression using the standard asymptotic t-statistic. To correct for testing of associations between multiple probe sets and SNPs, we carried out permutation tests followed by FDR correction. Within each expression-versus-genotype matrix, we randomly permuted the expression values for all probe sets belonging to the same meta–probe set (to preserve the haplotype block structure).
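As an aside on the preprocessing step, the G+C-matched background filter described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names (`gc_thresholds`, `is_background`) and the toy intensities are hypothetical, the sample standard deviation is assumed (the text says only "2 s.d."), and "below the threshold across all samples" is read as below the threshold in every sample.

```python
from collections import defaultdict
from statistics import mean, stdev


def gc_thresholds(background_probes, n_sd=2.0):
    """Map each G+C count to mean + n_sd * s.d. of background intensities.

    background_probes: iterable of (gc_count, intensity) pairs, one per
    antigenomic probe measurement.
    """
    by_gc = defaultdict(list)
    for gc_count, intensity in background_probes:
        by_gc[gc_count].append(intensity)
    # stdev needs at least two values, so sparser bins get no threshold.
    return {
        gc: mean(values) + n_sd * stdev(values)
        for gc, values in by_gc.items()
        if len(values) >= 2
    }


def is_background(gc_count, intensities, thresholds):
    """True if the probe is below its G+C-matched threshold in every sample."""
    threshold = thresholds.get(gc_count)
    if threshold is None:
        return False  # no background estimate for this bin: keep the probe
    return all(value < threshold for value in intensities)


# Toy data, invented purely for illustration.
background = [(10, 20.0), (10, 24.0), (10, 22.0),
              (15, 40.0), (15, 44.0), (15, 42.0)]
thresholds = gc_thresholds(background)

# probe name -> (G+C count, intensity in each sample)
probes = {
    "probe_a": (10, [21.0, 23.5, 25.0]),     # below threshold everywhere
    "probe_b": (10, [21.0, 90.0, 25.0]),     # above threshold in one sample
    "probe_c": (15, [300.0, 280.0, 310.0]),  # clearly expressed
}
masked = sorted(name for name, (gc, values) in probes.items()
                if is_background(gc, values, thresholds))
print(masked)
```

Binning by an integer G+C count per probe is one convenient choice here; binning by percentage would work the same way as long as background and genomic probes use the same key.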
For each expression measurement, we computed and retained only the highest asymptotic P-value and produced the distribution of maximum P-values within the permuted dataset. The maximum asymptotic P-values from the experimental data were then converted into empirical P-values by mapping onto the permuted distribution. The above procedure corrects for testing multiple SNPs against each expression value. Subsequently, we performed an FDR correction (ref. 29) on the empirical P-values, to control the FDR across multiple expression values. The procedure was applied separately to measurements at the probe set and meta–probe set levels. We used a 0.05 FDR criterion as a significance cutoff in our analysis. For the sake of clarity, all of the values and cutoffs quoted in the results correspond to the raw, uncorrected P-values.

[Figure 4 panels. a: long and short isoforms with probe sets 3023261, 3023262, 3023263 and 3023264 and SNP rs10954213 (A/G). b: microarray data, probe set intensity versus genotype (AA, AG, GG); PS 3023263, P = 0.33; PS 3023264, P = 8.27 × 10⁻²². c: quantitative RT-PCR validation, Ct count versus genotype (AA, AG, GG); PS 3023263, P = 0.11; PS 3023264, P = 1.04 × 10⁻⁸.]

Figure 4 Validation of 3′ UTR change in IRF5 by quantitative real-time RT-PCR. (a) Schematic of the 3′ ends of the long and short isoforms of IRF5. Exons are shown in blue, introns are dashed lines, and solid horizontal lines below the exons indicate probe sets. (b) Regression analyses of probe sets 3023263 and 3023264 against SNP rs10954213. (c) Regression analysis of Ct counts from quantitative real-time RT-PCR against the genotype of SNP rs10954213, to confirm the original microarray data. We used two sets of primers on the panel of individuals, designed to amplify probe sets 3023263 and 3023264, respectively.
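The permutation and FDR steps from the association analysis paragraph can be illustrated with a small sketch. The analysis itself was run in R; Python is used here only for illustration, and several points are assumptions rather than details given in the text. The "highest" P-value is read as the most significant one, tracked here through the largest |t| (for a fixed sample size, a larger |t| means a smaller asymptotic P-value). One shuffle of individuals is shared by all probe sets of a gene. The empirical P-value uses the common (k + 1)/(N + 1) estimator. The Benjamini-Hochberg step-up procedure stands in for the FDR method of ref. 29, which the text does not name. All function names and data are hypothetical.

```python
import math
import random


def abs_t_stat(x, y):
    """|t| for the slope of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return 0.0  # monomorphic SNP carries no information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    rss = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    if rss == 0:
        return math.inf  # perfect fit
    return abs(slope) / math.sqrt(rss / (n - 2) / sxx)


def best_stat(genotypes, expression):
    """Most significant statistic over all flanking SNPs."""
    return max(abs_t_stat(snp, expression) for snp in genotypes)


def empirical_p(genotypes, expression_by_probe_set, n_perm=199, seed=0):
    """Empirical P-value per probe set from a max-statistic permutation null."""
    rng = random.Random(seed)
    n = len(genotypes[0])
    observed = {ps: best_stat(genotypes, y)
                for ps, y in expression_by_probe_set.items()}
    exceed = dict.fromkeys(observed, 0)
    for _ in range(n_perm):
        order = rng.sample(range(n), n)  # one shuffle shared by all probe sets
        for ps, y in expression_by_probe_set.items():
            permuted = [y[i] for i in order]
            if best_stat(genotypes, permuted) >= observed[ps]:
                exceed[ps] += 1
    return {ps: (exceed[ps] + 1) / (n_perm + 1) for ps in observed}


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data for 12 individuals, invented purely for illustration.
genotypes = [
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],  # hypothetical SNP A
    [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],  # hypothetical SNP B
]
noise = [3, -2, 1, -1, 2, -3, 0, 1, -1, 2, -2, 0]
expression = {
    "ps_assoc": [100 + 50 * g + e for g, e in zip(genotypes[0], noise)],
    "ps_null": [120, 95, 110, 130, 90, 105, 125, 100, 115, 98, 122, 108],
}
p_emp = empirical_p(genotypes, expression)
names = list(p_emp)
q_values = dict(zip(names, benjamini_hochberg([p_emp[ps] for ps in names])))
significant = [ps for ps in names if q_values[ps] <= 0.05]
print(p_emp, q_values, significant)
```

Taking the best statistic over SNPs inside each permutation is what corrects for testing many SNPs against one expression value; the FDR step then handles the multiplicity across expression values.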
NATURE GENETICS | VOLUME 40 | NUMBER 2 | FEBRUARY 2008 | 229 | LETTERS | © 2008 Nature Publishing Group, http://www.nature.com/naturegenetics