Institute of Computing Technology, Chinese Academy IR
COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming | |
Zhang, Huiling1; Huang, Qingsheng1; Bei, Zhendong2; Wei, Yanjie1; Floudas, Christodoulos A.3,4 | |
2016-03-01 | |
Journal | PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS
ISSN | 0887-3585 |
Volume | 84, Issue: 3, Pages: 332-348 |
Abstract | In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on a contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and an MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM proteins increase. |
Keywords | protein structure prediction ; hybrid framework ; machine learning ; ab initio prediction ; MILP |
DOI | 10.1002/prot.24979 |
Indexed by | SCI |
Language | English |
Funding | National Science Foundation of China[11204342] ; Shenzhen Peacock Plan[KQCX20130628112914299] ; Science Technology and Innovation Committee of Shenzhen Municipality[JCYJ20120615140912201] ; National High Technology Research and Development Program of China[2015AA020109] |
WOS Research Area | Biochemistry & Molecular Biology ; Biophysics |
WOS Category | Biochemistry & Molecular Biology ; Biophysics |
WOS Accession Number | WOS:000373352100004 |
Publisher | WILEY-BLACKWELL |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/8444 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Wei, Yanjie; Floudas, Christodoulos A. |
Author Affiliations | 1.Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Shenzhen 518055, Peoples R China ; 2.Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr Cloud Comp, Shenzhen 518055, Peoples R China ; 3.Texas A&M Univ, Dept Chem Engn, College Stn, TX 77843 USA ; 4.Texas A&M Univ, Texas A&M Energy Inst, College Stn, TX 77843 USA |
Recommended Citation (GB/T 7714) | Zhang, Huiling,Huang, Qingsheng,Bei, Zhendong,et al. COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming[J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS,2016,84(3):332-348. |
APA | Zhang, Huiling,Huang, Qingsheng,Bei, Zhendong,Wei, Yanjie,&Floudas, Christodoulos A..(2016).COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming.PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS,84(3),332-348. |
MLA | Zhang, Huiling,et al."COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming".PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS 84.3(2016):332-348. |
Files in This Item | There are no files associated with this item. |
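
The abstract reports accuracy, coverage, specificity and MCC for residue pairs at least six amino acids apart, and describes a fallback from the SVM module to the MILP module when no contact clears the trained threshold. The sketch below illustrates both ideas in Python. It is not the authors' code. It assumes the definitions conventional in the contact prediction literature (accuracy = TP/(TP+FP), coverage = TP/(TP+FN), specificity = TN/(TN+FP)), which the record itself does not spell out, and it reads "at least six amino acids apart" as `j - i >= 6`. The function names `contact_metrics` and `hybrid_predict`, and the `milp_fallback` callable, are hypothetical.

```python
import math
from itertools import combinations


def contact_metrics(predicted, native, length, min_separation=6):
    """Score predicted contacts against native contacts.

    predicted, native: sets of (i, j) residue-index pairs with i < j.
    Only pairs with j - i >= min_separation are evaluated (assumed
    reading of "at least six amino acids apart").
    """
    # All residue pairs that are eligible for evaluation.
    pairs = {(i, j) for i, j in combinations(range(length), 2)
             if j - i >= min_separation}
    pred = predicted & pairs
    true = native & pairs

    tp = len(pred & true)
    fp = len(pred - true)
    fn = len(true - pred)
    tn = len(pairs) - tp - fp - fn

    # Assumed conventional definitions; zero denominators map to 0.0.
    accuracy = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "coverage": coverage,
            "specificity": specificity, "mcc": mcc}


def hybrid_predict(svm_scores, threshold, milp_fallback):
    """Return SVM contacts above the threshold, else the fallback result.

    svm_scores: dict mapping (i, j) -> SVM confidence score.
    milp_fallback: hypothetical zero-argument callable standing in for
    the MILP module; it is called only when no SVM contact qualifies.
    """
    confident = {pair for pair, score in svm_scores.items()
                 if score > threshold}
    return confident if confident else milp_fallback()


if __name__ == "__main__":
    native = {(0, 6), (1, 8), (2, 9)}
    scores = {(0, 6): 0.9, (1, 8): 0.8, (0, 9): 0.7, (0, 2): 0.95}
    contacts = hybrid_predict(scores, threshold=0.5,
                              milp_fallback=lambda: set())
    print(contact_metrics(contacts, native, length=10))
```

Because TM proteins have far more non-contacting than contacting residue pairs, specificity stays close to 1 even when coverage is low, which is consistent with the pattern of figures quoted in the abstract.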