Vol. 11 No. 5    CAAI Transactions on Intelligent Systems    Oct. 2016

DOI: 10.11992/tis.201603034
Online publication address: http://www.cnki.net/kcms/detail/23.1538.TP.20160926.0920.002.html

The functional module detection of PPI network by incorporating protein complex data

LIU Guangming, YANG Liu, GAO Panpan, WANG Bangjun, ZHOU Xuezhong, YU Jian
(School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)

Abstract: Detecting functional modules in the human protein-protein interaction (PPI) network is currently a much-studied problem in network medicine. Good functional modules help us understand the molecular mechanisms of protein interactions. Most recent studies apply topological module detection algorithms from complex network research to partition the PPI network and then analyze the biological function of the resulting modules. However, PPI data are incomplete: related studies indicate that only 10%~20% of the interactions between human proteins have been identified so far, and the data already obtained also contain noise. As a result, the precision of topology-based community detection algorithms is reduced. To overcome this problem, this study incorporates protein complex data into module detection. K-Means clustering and non-negative matrix factorization (NMF) are each used to partition the PPI network into modules, and gene ontology (GO) and pathway analyses are then conducted on each module to quantify its biological significance. The experimental results show that modules detected from the PPI network combined with protein complex data tend to achieve higher homogeneity values in the GO and pathway analyses; that is, incorporating protein complexes makes it easier to obtain biologically meaningful functional modules.

Keywords: protein-protein interaction network (PPI); protein complex; functional module; module detection; gene ontology; pathway

CLC number: TP391    Document code: A    Article ID: 1673-4785(2016)05-0703-08

Citation: LIU Guangming, YANG Liu, GAO Panpan, et al. The functional module detection of PPI network by incorporating protein complex data[J]. CAAI Transactions on Intelligent Systems, 2016, 11(5): 703-710.
Received: 2016-03-18. Published online: 2016-09-26.
Funding: National Natural Science Foundation of China (61105055, 81230086).
Corresponding author: LIU Guangming. E-mail: guangmingliu@bjtu.edu.cn.

Protein molecules carry out their functions by interacting with other protein molecules. With the rapid development of high-throughput technologies in recent years, massive amounts of protein interaction data have been mined, giving rise to protein-protein interaction (PPI) networks. Network medicine has developed rapidly within computational medicine in recent years, and protein modules in the PPI network often have specific biological functions. Barabasi et al. argue that disease arises because some local
