
Bioinformatics (Advanced Genetics)
周宇 (Zhou Yu)
2017/10/10

Outline
Bioinformatics and omics (What)
- Biological questions and needs
- High-throughput sequencing: technologies and applications
- Bioinformatics techniques: needs and applications
Tools and practice (How)
- View (Genome Browser)
- Retrieve (Table Browser)
- Analyze (Galaxy)

What is Bioinformatics?
Bioinformatics is the application of computer science and information technology to the field of biology. The primary goal of bioinformatics is to increase the understanding of biological processes. What sets it apart from other approaches, however, is its focus on developing and applying computationally intensive techniques (e.g. pattern recognition, data mining, machine learning algorithms, and visualization) to achieve this goal.
Source: Wikipedia (www.wikipedia.org)

Fundamental questions
- Hidden message in the genome? Functional elements, regulatory code, ...
- Development and aging? Cell differentiation and reprogramming, ...
- Cause of various diseases, including cancers: classification, driver mutations and mechanisms, ...
High-throughput (HT) sequencing and bioinformatics have revolutionized this research!

Current biological research enters the big data era
[Figure 1: Cost per genome (NIH). Source: http://www.genome.gov/sequencingcosts/]
[Figure 2: "Data explosion". The amount of genetic sequencing data stored at the European Bioinformatics Institute takes less than a year to double in size. Plot of stored sequence data in terabases (0 to 200), 2004 to 2012, annotated "Sequencers begin giving flurries of data". Source: Nature, Vol 498, 255]

Sequencing technology: Illumina
2nd-generation sequencing, sequencing by synthesis.
[Figure: steps of the Illumina sequencing workflow, including bridge amplification]

Sequencing technology application
[Figure 4: Overview of selected HTS applications, plotted by publication date (2006 to 2014). Methods are colored by category, and the size of each data point is proportional to publication rate (citations/month). The inset indicates the color key as well as the proportion of methods in each group. For clarity, "seq" has been omitted from the labels.]

Sequencing technology application
- Genome sequencing (cancers)
- Functional elements
- Epigenome
- Transcriptome
- 3D/4D genome

New requirements in the sequencing era
- Genome assembly
- Mapping billions of short reads onto the genome
- Gene prediction
- Gene expression quantification
- Alternative splicing identification and quantification
- TF/RBP binding site identification
- Motif discovery for TFs and RBPs
- Differential expression, differential binding
- Modification identification: DNA/histones/RNA
- AI / machine learning
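To make the read-mapping requirement concrete, here is a minimal seed-and-verify sketch in plain Python. It is a toy illustration only: the function names, the k-mer size, and the mismatch threshold are assumptions made for this example, and it does not reproduce how production aligners such as Bowtie or BWA work (those rely on compressed indexes based on the Burrows-Wheeler transform to handle billions of reads).

```python
from collections import defaultdict


def build_kmer_index(genome: str, k: int) -> dict:
    """Map every k-mer in the genome to the 0-based positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def map_read(read: str, genome: str, index: dict, k: int,
             max_mismatches: int = 1) -> list:
    """Return 0-based genome positions where the read aligns without gaps.

    Seed:   look up the first k bases of the read in the index.
    Verify: compare the whole read with the genome at each seed hit and
            keep hits with at most max_mismatches differences.
    """
    hits = []
    for pos in index.get(read[:k], []):
        window = genome[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run past the end of the genome
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits


# Hypothetical toy data, not taken from the slides.
GENOME = "ACGTACGTTAGCACGTAGGA"
K = 4
INDEX = build_kmer_index(GENOME, K)
EXACT_HITS = map_read("ACGTAG", GENOME, INDEX, K, max_mismatches=0)
NEAR_HITS = map_read("ACGTAG", GENOME, INDEX, K, max_mismatches=1)
```

Because the seed is taken only from the start of the read, a mismatch within the first k bases makes this sketch miss the hit; real tools use multiple seeds and also allow insertions and deletions.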

Sequencing-driven bioinformatics
Tools for specific tasks
- Bowtie/2, BWA (algorithms)
- MACS/2, PeakSeq
- TopHat/2, Cufflinks
- SAMtools
- BEDTools
- edgeR, DESeq (statistical model)
Visualization: UCSC Genome Browser, Ensembl, IGV, IGB, ...
Reusable packages: Biopython, BioPerl, Bioconductor/R, ...
Databases for specific information: miRBase, GEO, ENCODE, ...
Integrative data analysis: ChromHMM, CellNet, ...
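Many of the tools listed above operate on genomic intervals, for example finding which peaks overlap which genes. The sketch below shows that kind of interval intersection in plain Python. It is not BEDTools code or its API: the Interval type, the intersect function, and the sample coordinates are all hypothetical, and the quadratic loop is chosen for clarity, whereas real tools use sorted sweeps or interval trees. Coordinates follow the BED convention of 0-based, half-open intervals.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair (x in a, y in b)."""
    out = []
    for x in a:
        for y in b:
            if x.chrom != y.chrom:
                continue  # intervals on different chromosomes never overlap
            start = max(x.start, y.start)
            end = min(x.end, y.end)
            if start < end:  # half-open: touching intervals do not overlap
                out.append(Interval(x.chrom, start, end))
    return out


# Hypothetical peaks and gene, not taken from the slides.
PEAKS = [Interval("chr1", 100, 200),
         Interval("chr1", 500, 600),
         Interval("chr2", 100, 200)]
GENES = [Interval("chr1", 150, 550)]
OVERLAPS = intersect(PEAKS, GENES)
```

With these inputs, the two chr1 peaks each overlap the gene partially, and the chr2 peak is dropped because it lies on a different chromosome.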