Alignment, Mining, and Functional Analysis of Gene Sequences
Quan Zou (邹权) (Ph.D. & Professor)
School of Computer Science and Technology, Tianjin University
2017.10
Outline
• Sequence alignment
  – Algorithm
  – Parallel
• Identification and mining
  – microRNA
  – machine learning related works
• Function prediction
  – miRNA-disease relationship
  – crop yield-related genes
Multiple Sequence Alignment (MSA) vs. BLAST
[Figure: diagram labels: Query, Database, Output; Input, Output]
Multiple Sequence Alignment (MSA): What & Where
• Multiple Sequence Alignment
  – Multiple DNA Sequence Alignment
  – Multiple Similar DNA Sequence Alignment (our focus)
• Application
  – Phylogenetic tree
  – Virus sequences
  – Population SNV calling
  – …
Techniques for similar DNA MSA
1. k-band Dynamic Programming
[Figure: dynamic programming matrix with columns j = 0..5 labeled c, a, t, g, t and rows i = 0..6 labeled a, c, g, c, t, g; only the cells inside the K-band are filled]
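The k-band idea on this slide can be sketched as follows: only cells within k of the main diagonal (widened by the length difference of the two sequences) are computed. This is a minimal sketch, not the speaker's implementation. The function name `kband_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions made for illustration, and the full matrix is allocated only to keep the code short.

```python
def kband_score(a, b, k, match=1, mismatch=-1, gap=-1):
    """Global alignment score restricted to a band of half-width k.

    Scoring values are illustrative assumptions, not taken from the slides.
    """
    n, m = len(a), len(b)
    # The band covers diagonals lo..hi (diagonal index = j - i); it always
    # contains the start cell (0, 0) and the end cell (n, m).
    lo = min(0, m - n) - k
    hi = max(0, m - n) + k
    neg_inf = float("-inf")
    dp = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        # Visit only the columns of row i that fall inside the band.
        for j in range(max(0, i + lo), min(m, i + hi) + 1):
            if i == 0 and j == 0:
                dp[i][j] = 0
                continue
            best = neg_inf
            if i > 0 and j > 0:
                s = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, dp[i - 1][j - 1] + s)
            if i > 0:
                best = max(best, dp[i - 1][j] + gap)   # gap in b
            if j > 0:
                best = max(best, dp[i][j - 1] + gap)   # gap in a
            dp[i][j] = best
    return dp[n][m]


if __name__ == "__main__":
    print(kband_score("acgctg", "catgt", k=1))
```

A wider band can only improve the score, because every path allowed by a narrow band is also allowed by a wider one. For very similar sequences a small k is usually enough, which is what makes the band worthwhile.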
How to set k for k-band?
[Figure: two alignment matrices with the K-band marked]
AGTAGGTACCGATAGC → AGTAGG | TACCG | ATAGC
AGTATGTACCGAGATC → AGTATG | TACCG | AGATC
[Figure labels: match (the common segment TACCG); dynamic programming (the remaining segments)]
Greedy search with suffix tree
S = GTCCGAAGCTCCGG
T = GTCCTGAAGCTCCGT (positions numbered from 1)
Matches: (1,1,4), (5,6,9)
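The triples on this slide read naturally as (start in S, start in T, length). The sketch below reproduces them with a greedy longest-match scan. To stay short it replaces the suffix tree with plain `str.find`, which is slower but returns the same matches. The function name `greedy_matches` and the `min_len` threshold are assumptions for illustration.

```python
def greedy_matches(s, t, min_len=2):
    """Greedily collect collinear common substrings of s and t.

    Returns 1-based triples (start_in_s, start_in_t, length).
    A suffix tree over t would answer each longest-prefix query in time
    proportional to the match length; str.find stands in for it here.
    """
    matches = []
    i = 0        # current position in s (0-based)
    t_from = 0   # matches in t must start at or after this position
    while i < len(s):
        best_len, best_pos = 0, -1
        length = 1
        # Extend the prefix of s[i:] while it still occurs in t[t_from:].
        while i + length <= len(s):
            pos = t.find(s[i:i + length], t_from)
            if pos < 0:
                break
            best_len, best_pos = length, pos
            length += 1
        if best_len >= min_len:
            matches.append((i + 1, best_pos + 1, best_len))
            i += best_len
            t_from = best_pos + best_len
        else:
            i += 1   # no usable match here; skip one character of s
    return matches


if __name__ == "__main__":
    print(greedy_matches("GTCCGAAGCTCCGG", "GTCCTGAAGCTCCGT"))
```

On the slide's example this yields (1,1,4) and (5,6,9). The unmatched stretches between and after these segments are the parts left for dynamic programming.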
Techniques for similar DNA MSA
2. Center star strategy
[Figure: sequences S1 to S5 arranged as a tree alignment, compared with the same sequences arranged by the center star strategy]
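In the textbook form of the center star strategy, the first step is to pick the sequence that is most similar to all the others, and every other sequence is then aligned to that center. The sketch below covers only this selection step. It uses a plain Needleman-Wunsch score with assumed scoring values (match +1, mismatch -1, gap -1), and the names `nw_score` and `pick_center` are hypothetical. The slides do not say how the speaker's method chooses the center.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b (scoring values are assumptions)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def pick_center(seqs):
    """Return the index of the sequence with the highest total pairwise score."""
    totals = []
    for i, s in enumerate(seqs):
        totals.append(sum(nw_score(s, t) for j, t in enumerate(seqs) if j != i))
    # max() keeps the first index when several sequences tie.
    return max(range(len(seqs)), key=totals.__getitem__)


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "ACGT", "TTTT"]
    print(pick_center(seqs))
```

Selecting the center this way costs one pairwise comparison per pair of sequences. After that only n - 1 alignments against the center are needed, rather than alignments along a guide tree.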
Extreme MSA for Very Similar DNA Sequences
[Figure: sequences S1 to S5 processed in steps labeled: sum up, update, final result]