RESEARCH Article

Methods

CRISPR–Cas9 screening.

Plasmids. All plasmids have previously been described¹⁰ and are available through Addgene (Cas9 vector, 68343; gRNA vector, 67974). Plasmids were packaged using the ViraPower Lentiviral Expression System (Invitrogen, K4975-00) as per the manufacturer's instructions.

Cell culture. Cell lines used in this study (Supplementary Table 1) were selected from the 1,000-cell-line panel⁷ of the Genomics of Drug Sensitivity in Cancer study, had been annotated in the Cell Model Passports database (https://cellmodelpassports.sanger.ac.uk/) and were maintained as previously described⁷. To control for cross-contamination and sample swaps, a panel of 92 single-nucleotide polymorphisms was profiled for each cell line before and following completion of the CRISPR–Cas9 screening pipeline. This study includes commonly misidentified cell lines: Ca9-22, short tandem repeat (STR) analysis confirmed that the identity matched the Japanese Collection of Research Bioresources Cell Bank (JCRB) reference (JCRB0625) and RIKEN (RCB1976); MKN28, noted as a derivative of MKN74 in Cell Model Passports and clinical information matches MKN74; KP-1N, known misidentification issue, Cell Model Passports data for both KP-1N and Panc-1 are identical; OVMIU, known misidentification issue, Cell Model Passports data for both OVMIU and OVSAYO are identical; SK-MG-1, STR profile matches JCRB profile, which internally matches Marcus, Cell Model Passport data for both SK-MG-1 and Marcus are identical. Commonly misidentified lines have been noted in Supplementary Table 1 and on the Cell Model Passport.
All commonly misidentified cell lines were retained, because the misidentification does not impact tissue or cancer type of origin, and all datasets used were generated in-house from the same matched cell line. A separate set of HCT116 cell lines was used for WRN validation experiments: HCT116 parental cells and HCT116 cells carrying Chr.3 or Chr.5, or both were a gift from M. Koi. HCT116 cells carrying Chr.2 were a gift from A. Goel. HCT116 cells carrying Chr.2 or Chr.3 were maintained in 400 μg ml⁻¹ G418 (Thermo Fisher Scientific, 10131027); HCT116 cells carrying Chr.5 were maintained in 6 μg ml⁻¹ blasticidin (Thermo Fisher Scientific, A1113903); and HCT116 cells carrying Chr.3 + Chr.5 were maintained in the presence of 400 μg ml⁻¹ G418 and 6 μg ml⁻¹ blasticidin. All cells were cultured in McCoy's 5A medium (Sigma-Aldrich, M4892) with 10% FBS.

Generation of Cas9-expressing cancer cell lines. Cells were transduced with a lentivirus containing Cas9 in T25 or T75 flasks at approximately 80% confluence in the presence of polybrene (8 μg ml⁻¹). Cells were incubated overnight followed by replacement of the lentivirus-containing medium with fresh complete medium. Blasticidin selection commenced 72 h after transduction at an appropriate concentration determined for each cell line using a blasticidin dose–response assay (blasticidin range, 10–75 μg ml⁻¹) and cell viability was assessed using the CellTiter-Glo 2.0 Assay (Promega, G9241). Cas9 activity was assessed as described previously¹⁰. Cell lines with Cas9 activity over 75% were used for sgRNA library transduction.

Genome-wide sgRNA library and screen. Two genome-wide sgRNA libraries were used in this study: the Human CRISPR Library v.1.0 and v.1.1. The Human CRISPR Library v.1.0 was described previously and targets 18,009 genes with 90,709 sgRNAs (Addgene, 67989)¹⁰.
The Human CRISPR Library v.1.1 contains all sgRNAs from v.1.0 plus 1,004 non-targeting sgRNAs and 5 additional sgRNAs against 1,876 selected genes that encode kinases, epigenetic-related proteins and pre-defined fitness genes. An oligo pool of Library v.1.1 was synthesized using high-throughput silicon platform technology (Twist Bioscience) and cloned as described previously¹⁰. For consistency, all computational analyses were performed considering only the overlapping sgRNAs between the two libraries (90,709 sgRNAs). Data for the additional sgRNAs in Library v.1.1 can be found in the raw read count files for cell lines screened with this library version (available at https://cog.sanger.ac.uk/cmp/download/raw_sgrnas_counts.zip), but were removed before quality control analysis. The HT-29 cell line was screened with both libraries and the resulting datasets were kept separate for comparative analyses (results are summarized in Extended Data Fig. 2j).

A total of 3.3 × 10⁷ cells were transduced with an appropriate volume of the lentiviral-packaged whole-genome sgRNA library to achieve 30% transduction efficiency (100× library coverage). The volume was determined for each cell line using a titration of the packaged library and assessing the percentage of blue fluorescent protein (BFP)-positive cells by flow cytometry. Transductions were performed in technical triplicate (or duplicate for cell lines with a large cell size such as glioblastoma). Owing to the large number of screens performed, multiple batches of packaged library virus were prepared. Each batch was tested in HT-29 cells to ensure consistency between batch preparations. In addition, the HT-29 cell line was screened every 3 months to ensure that the quality of data generated by the pipeline was consistent. Transduction efficiency was assessed 72 h after transduction. Samples with a transduction efficiency between 15 and 60% were used for puromycin selection.
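
The coverage figure quoted above can be checked with simple arithmetic. The following is a minimal sketch and not part of the screening pipeline: the function names are hypothetical, the library size is assumed to be the 90,709 sgRNAs shared by both libraries, and the 15–60% acceptance window is assumed to be inclusive at both ends.

```python
def library_coverage(n_cells, transduction_efficiency, n_sgrnas):
    """Expected number of transduced cells per sgRNA (library coverage).

    n_cells: number of cells exposed to the packaged library.
    transduction_efficiency: fraction of cells transduced (0 < value <= 1).
    n_sgrnas: number of sgRNAs in the library.
    """
    if not 0 < transduction_efficiency <= 1:
        raise ValueError("transduction_efficiency must be in (0, 1]")
    if n_sgrnas <= 0:
        raise ValueError("n_sgrnas must be positive")
    return n_cells * transduction_efficiency / n_sgrnas


def passes_transduction_qc(transduction_efficiency, low=0.15, high=0.60):
    """True if the efficiency lies in the accepted window.

    Inclusive bounds are an assumption; the text only says 'between 15 and 60%'.
    """
    return low <= transduction_efficiency <= high


# Assumed library size: the 90,709 sgRNAs common to v.1.0 and v.1.1.
coverage = library_coverage(3.3e7, 0.30, 90_709)
print(f"coverage at transduction: {coverage:.0f}x")
```

Under these assumptions, 3.3 × 10⁷ cells at 30% efficiency give roughly 109 transduced cells per sgRNA, in line with the nominal 100× coverage.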
The appropriate concentration of puromycin for each individual cell line was determined from a dose–response curve (puromycin range, 1–5 μg ml⁻¹) and cell viability was assessed using a CellTiter-Glo 2.0 Assay (Promega, G9241). The percentage of BFP-positive cells was reassessed after a minimum of 96 h of puromycin selection. For samples with less than 80% BFP-positive cells, puromycin selection was extended for an additional 3 days and the percentage of BFP-positive cells was assessed again. Cells were maintained until day 14 after transduction with a minimum of 5.0 × 10⁷ cells reseeded at each passage (500× library coverage). Approximately 2.5 × 10⁷ cells were collected, pelleted and stored at −80 °C for DNA extraction.

DNA extraction, sgRNA PCR amplification, Illumina sequencing and sgRNA counting. Genomic DNA was extracted from cell pellets using either the QIAsymphony automated extraction platform (Qiagen, QIAsymphony DSP DNA Midi Kit, 937255) or by manual extraction (Qiagen, Blood & Cell Culture DNA Maxi Kit, 13362) as per the manufacturer's instructions. PCR amplification, Illumina sequencing (19-bp single-end sequencing with custom primers on the HiSeq2000 v.4 platform) and sgRNA counting were performed as described previously¹⁰.

CRISPR screen data analyses.

Low-level quality control assessment and filtering. To perform initial low-level quality control, the Pearson's correlation of treatment counts between replicates was assessed for each cell line (Extended Data Fig. 1c). The resulting correlation scores were generally high (median = 0.8), but not sufficiently distinguishable from expectation (median correlation between replicates of any pair of randomly selected cell lines). Thus, to define a reproducibility threshold, we developed an approach based on a previously published study²⁹.
Specifically, we selected a set of the 838 most-informative sgRNAs, defined as those with an average pairwise Pearson's correlation greater than 0.6 between corresponding patterns of the count fold changes 14 days after transfection versus plasmid library across all screened cell lines. We next computed average gene-level profiles for the 308 genes targeted by these informative sgRNAs for each individual technical replicate, and then computed all possible pairwise Pearson's correlation scores between the resulting profiles. This enabled the estimation of a null distribution of replicate correlations (plotted in grey in Extended Data Fig. 1d). We then defined a reproducibility threshold R value of 0.68, for which the estimated probability mass function of the correlation scores computed between replicates of the same cell line (considering the identified 308 genes only) was at least twice that of the null probability mass function (Extended Data Fig. 1d). Of the 332 screened cell lines with at least two technical replicates, 305 had an average replicate correlation higher than this threshold, and therefore passed the reproducibility assessment; for 7 cell lines there were no replicates. Excluding the least reproducible replicate for the 14 cell lines that did not pass the first reproducibility assessment allowed their average replicate correlation to exceed the threshold defined above, thus resulting in a set of 326 cell lines that passed the low-level quality control assessment (Supplementary Table 1).

Screening performance assessment. We considered the genome-wide profiles of gene-level sgRNA fold change values (averaged across targeting sgRNAs and replicates) of each cell line to be a classifier of predefined sets of essential and non-essential genes³⁰ by means of receiver operating characteristic (ROC) indicators (Extended Data Fig. 1g and Supplementary Table 1).
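
As an illustration of the ROC-based assessment just described, the following minimal sketch computes the area under the ROC curve for a single gene-level fold-change profile. It is not the code used in the study: the function name and the toy values are hypothetical, and genes are assumed to be ranked by depletion (more negative fold change means more likely essential).

```python
import numpy as np


def auroc_from_fold_changes(log_fc, is_essential):
    """Area under the ROC curve for separating essential from non-essential genes.

    log_fc: gene-level log fold changes (negative values indicate depletion).
    is_essential: booleans, True for reference essential genes and
                  False for reference non-essential genes.
    """
    scores = -np.asarray(log_fc, dtype=float)  # stronger depletion -> higher score
    labels = np.asarray(is_essential, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both gene classes must be represented")
    # AUROC equals the fraction of (essential, non-essential) pairs in which
    # the essential gene scores higher; tied pairs count as one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Toy profile (hypothetical values): three essential and three non-essential genes.
toy_fc = [-3.0, -2.0, -0.1, 0.2, -2.5, 0.0]
toy_labels = [True, True, False, False, True, False]
print(auroc_from_fold_changes(toy_fc, toy_labels))
```

A value near 1 indicates that the reference essential genes are consistently more depleted than the non-essential ones, whereas a value near 0.5 indicates no separation.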
In addition, we measured the magnitude of the depletion signal observed in each screened cell line by evaluating the median log(change in sgRNA count), and the discriminative distance between their distributions (as measured by the Glass's Δ) for predefined essential and non-essential genes³⁰ and ribosomal protein genes³¹. In total, 2 out of the 326 cell lines were manually removed, because they had area under the ROC curve, area under the precision/recall curve and both Glass's Δ values that were 3 s.d. lower than the average. On the basis of our low-level quality control and screening performance, the final analysis set was composed of 324 cell lines (Supplementary Table 1). Further details on these analyses are included in the Supplementary Information.

sgRNA count preprocessing and CRISPR-bias correction. The analysis set of 324 cell lines was further processed using CRISPRcleanR³² (https://github.com/francescojm/CRISPRcleanR). sgRNAs with fewer than 30 reads in the plasmid counts and sgRNAs belonging only to Library v.1.1 were first removed. The remaining sgRNAs were assembled into one file per cell line, including the read counts from the matching library plasmid and all replicates, and then normalized using a median–ratio method to adjust for the effect of library sizes and read count distributions³³. Depletion/enrichment fold changes for individual sgRNAs were quantified between post library-transduction read counts and library plasmid read counts at the individual replicate level. This was performed using the ccr.NormfoldChanges function of CRISPRcleanR. Next, we performed a correction of gene-independent responses to CRISPR–Cas9 targeting³⁴ using the ccr.GWclean function of CRISPRcleanR with default parameters.

Calling CRISPR–Cas9 gene knockout fitness effects.
The CRISPRcleanR-corrected sgRNA-level values (corrected fold change values) were used as input into an in-house-generated R implementation of the BAGEL method³⁰ to call significantly depleted genes (code publicly available at https://github.com/francescojm/BAGELR). Our BAGEL implementation computes gene-level Bayesian factors from the sgRNAs on a targeted-gene basis, by averaging the sgRNA-level values instead of summing them
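
The difference between averaging and summing sgRNA-level Bayesian factors can be sketched as follows. This is an illustrative example only and does not reproduce the BAGELR code: the function name, the sgRNA identifiers and the numerical values are hypothetical, and the sgRNA-level Bayesian factors are assumed to have been computed already.

```python
from collections import defaultdict


def gene_level_bayes_factors(sgrna_bf, sgrna_to_gene, aggregate="mean"):
    """Collapse sgRNA-level Bayesian factors into one value per targeted gene.

    sgrna_bf: dict mapping sgRNA id -> sgRNA-level Bayesian factor.
    sgrna_to_gene: dict mapping sgRNA id -> targeted gene.
    aggregate: "mean" (averaging, as described in the text) or "sum".
    """
    per_gene = defaultdict(list)
    for sgrna, bf in sgrna_bf.items():
        per_gene[sgrna_to_gene[sgrna]].append(bf)
    if aggregate == "mean":
        return {gene: sum(v) / len(v) for gene, v in per_gene.items()}
    if aggregate == "sum":
        return {gene: sum(v) for gene, v in per_gene.items()}
    raise ValueError("aggregate must be 'mean' or 'sum'")


# Hypothetical input: GENE1 is targeted by two sgRNAs, GENE2 by three.
toy_bf = {"g1_a": 2.0, "g1_b": 4.0, "g2_a": -1.0, "g2_b": -3.0, "g2_c": -2.0}
toy_map = {"g1_a": "GENE1", "g1_b": "GENE1",
           "g2_a": "GENE2", "g2_b": "GENE2", "g2_c": "GENE2"}
print(gene_level_bayes_factors(toy_bf, toy_map, aggregate="mean"))
print(gene_level_bayes_factors(toy_bf, toy_map, aggregate="sum"))
```

In this toy example the summed value for GENE2 grows with its third sgRNA, whereas the averaged value does not depend on how many sgRNAs target the gene. This may be one reason to prefer averaging, although the text does not state the motivation.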