No. 6    CHANG Zheng, et al.: lncRNA identification and function prediction based on multi-feature fusion    ·933·

After constructing the regulatory network and performing module analysis, GO terms were used to examine the functional annotations of the mRNAs in each module, and the biological regulatory processes in which the lncRNAs associated with those mRNAs may participate were predicted. Some of the results are shown in Table 6, which lists the biological regulatory functions predicted for each lncRNA on the basis of its associated RNAs. For example, NONATHT002539 is predicted to participate in nitrogen compound metabolic, catabolic, and biosynthetic processes; NONATHT000372 is predicted to promote protein phosphorylation; and NONATHT002765, NONATHT002470, and NONATHT002469 are all predicted to affect cellular processes.

Table 6  lncRNA function prediction

| microRNA | Gene | Biological process | lncRNA |
| ath_miR3434_3p | AT5G63640; AT1G54220; AT4G20360 | lysosomal transport; protein targeting; cellular process | NONATHT002765 |
| ath_miR844_5p | AT5G24940 | protein phosphorylation | NONATHT000372 |
| ath_miR399c_5p | AT3G14460; AT3G08960 | anion transport; protein localization; cellular process | NONATHT002470; NONATHT002469 |
| ath_miR8173 | AT1G79920; AT4G15100 | protein complex assembly; proteolysis | NONATHT003067 |
| ath_miR397a | AT1G79920; AT1G21160 | protein complex assembly; biosynthetic process | NONATHT003189 |
| ath_miR828 | AT1G21160; AT1G66380; AT5G54670; AT3G43210 | biosynthetic process; catabolic process; nitrogen compound metabolic process | NONATHT002539 |

4 Conclusion

Starting from plant RNA sequences, this paper extracts three classes of features (open reading frame, secondary structure, and k-mers) and fuses them into a 90-dimensional feature vector. This vector is the input used to train three machine learning models: naive Bayes, support vector machine, and gradient boosting decision tree. The classification results of the three models are combined by weighted voting. Compared with the existing identification tools CNCI and PLEK, the proposed method achieves better performance and can effectively identify plant lncRNAs. Following the competing endogenous RNA rule, lncRNA-microRNA and microRNA-mRNA interaction data are screened, and the two kinds of data are integrated to construct a regulatory network. On this interaction network, GO terms are used to annotate the mRNAs of each module, and the functions of the lncRNAs are then predicted through those mRNAs. Future work will incorporate deep learning techniques to further improve prediction accuracy.
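The weighted voting step named in the conclusion can be illustrated with a minimal sketch. The model names, the weight values, and the `weighted_vote` helper below are assumptions made for illustration; this page does not state how the weights were chosen, so the numbers are placeholders rather than the authors' settings.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine class labels from several classifiers by weighted voting.

    predictions: dict mapping model name -> predicted label
    weights:     dict mapping model name -> non-negative weight
    Returns the label whose supporting models have the largest total weight.
    """
    totals = defaultdict(float)
    for model, label in predictions.items():
        totals[label] += weights[model]
    # Iterating over sorted labels makes tie-breaking deterministic
    # (the alphabetically first label wins a tie).
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical weights (assumption): not taken from the paper.
weights = {"naive_bayes": 0.2, "svm": 0.35, "gbdt": 0.45}

# Hypothetical per-model predictions for a single transcript.
predictions = {"naive_bayes": "lncRNA", "svm": "mRNA", "gbdt": "lncRNA"}

print(weighted_vote(predictions, weights))  # lncRNA (0.65 vs 0.35)
```

In this sketch each model casts a hard vote for one label. A variant could instead average predicted class probabilities with the same weights; which form the paper uses is not stated on this page.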
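The function-prediction idea in the conclusion (link lncRNAs to mRNAs through shared microRNAs, then pass the GO annotations of the mRNAs to the lncRNAs) can be sketched as a simple join. This is a simplified illustration that skips the module analysis step. The interaction pairs come from two rows of Table 6, but the `predict_lncrna_functions` helper and the gene-by-gene split of GO terms for ath_miR8173 are assumptions, since the table reports terms per row and not per gene.

```python
from collections import defaultdict

# lncRNA-microRNA pairs, taken from two rows of Table 6.
lncrna_mirna = [
    ("NONATHT000372", "ath_miR844_5p"),
    ("NONATHT003067", "ath_miR8173"),
]

# microRNA-mRNA pairs, taken from the same two rows of Table 6.
mirna_mrna = [
    ("ath_miR844_5p", "AT5G24940"),
    ("ath_miR8173", "AT1G79920"),
    ("ath_miR8173", "AT4G15100"),
]

# GO biological-process terms per gene. The assignment of terms to the two
# ath_miR8173 target genes is an ASSUMPTION made for illustration only.
mrna_go = {
    "AT5G24940": {"protein phosphorylation"},
    "AT1G79920": {"protein complex assembly"},
    "AT4G15100": {"proteolysis"},
}


def predict_lncrna_functions(lncrna_mirna, mirna_mrna, mrna_go):
    """Transfer GO terms from mRNAs to lncRNAs that share a microRNA."""
    # Index the mRNA targets of each microRNA.
    targets = defaultdict(set)
    for mirna, mrna in mirna_mrna:
        targets[mirna].add(mrna)

    # For each lncRNA, collect the GO terms of every mRNA reached
    # through one of its microRNAs.
    functions = defaultdict(set)
    for lncrna, mirna in lncrna_mirna:
        for mrna in targets.get(mirna, ()):
            functions[lncrna].update(mrna_go.get(mrna, ()))
    return {lncrna: sorted(terms) for lncrna, terms in functions.items()}


result = predict_lncrna_functions(lncrna_mirna, mirna_mrna, mrna_go)
for lncrna, terms in result.items():
    print(lncrna, "->", "; ".join(terms))
```

For these inputs the predicted terms for the two lncRNAs agree with the corresponding rows of Table 6. A fuller version would first group the network into modules and annotate each module, as the text describes.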
References:

[1] COSTA F F. Non-coding RNAs: meet thy masters[J]. BioEssays, 2010, 32(7): 599–608.
[2] PALAZZO A F, LEE E S. Non-coding RNA: what is functional and what is junk?[J]. Frontiers in genetics, 2015, 6: Article No.2.
[3] SCHMITZ S U, GROTE P, HERRMANN B G. Mechanisms of long noncoding RNA function in development and disease[J]. Cellular and molecular life sciences, 2016, 73(13): 2491–2509.
[4] O'LEARY V B, OVSEPIAN S V, CARRASCOSA L G, et al. PARTICLE, a triplex-forming long ncRNA, regulates locus-specific methylation in response to low-dose irradiation[J]. Cell reports, 2015, 11(3): 474–485.
[5] CUI Jun, LUAN Yushi, JIANG Ning, et al. Comparative transcriptome analysis between resistant and susceptible tomato allows the identification of lncRNA16397 conferring resistance to Phytophthora infestans by co-expressing glutaredoxin[J]. The plant journal, 2017, 89(3): 577–589.
[6] HAN Siyu, LIANG Yanchun, LI Ying, et al. Long noncoding RNA identification: comparing machine learning based tools for long noncoding transcripts discrimination[J]. BioMed research international, 2016, 2016: Article No. 8496165.
[7] KONG Lei, ZHANG Yong, YE Zhiqiang, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine[J]. Nucleic acids research, 2007, 36(S2): W345–W349.
[8] WANG Liguo, PARK H J, DASARI S, et al. CPAT: coding-potential assessment tool using an alignment-free logistic regression model[J]. Nucleic acids research, 2013, 41(6): Article No.e74.
[9] SUN Liang, LUO Haitao, BU Dechao, et al. Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts[J]. Nucleic acids research, 2013, 41(17): Article No.e166.
[10] LI Aimin, ZHANG Junying, ZHOU Zhongyin. PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme[J]. BMC bioinformatics, 2014, 15: Article No.311.
[11] 郭杏莉, 高琳, 刘永轩, 等. 长非编码 RNA 生物特征研