Prediction and analysis of multiple protein lysine modified sites based on conditional wasserstein generative adversarial networks
Yang, Yingxi1; Wang, Hui2; Li, Wen1; Wang, Xiaobo1; Wei, Shizhao3; Liu, Yulong3; Xu, Yan1
2021-03-31
Journal: BMC BIOINFORMATICS
ISSN: 1471-2105
Volume: 22; Issue: 1; Pages: 17
Abstract
Background: Protein post-translational modification (PTM) is key to understanding the mechanisms of protein function. With the rapid development of proteomics technology, a large amount of protein sequence data has been generated, which highlights the importance of in-depth study and analysis of PTMs in proteins.
Method: We propose MultiLyGAN, a new multi-class machine learning pipeline, to identify seven types of lysine modified sites. Using eight sequence-based and five structural feature construction methods, 1497 valid features remained after filtering by Pearson correlation coefficient. To address the data imbalance problem, two influential deep generative methods, Conditional Generative Adversarial Network (CGAN) and Conditional Wasserstein Generative Adversarial Network (CWGAN), were leveraged and compared to generate new samples for the types with fewer samples. Finally, a random forest algorithm was used to predict the seven categories.
Results: In tenfold cross-validation, accuracy (Acc) and Matthews correlation coefficient (MCC) were 0.8589 and 0.8376, respectively. In the independent test, Acc and MCC were 0.8549 and 0.8330, respectively. The results indicated that CWGAN better resolved the existing data imbalance and stabilized the training error. In addition, an accumulated feature importance analysis showed that CKSAAP, PWM and structural features were the three most important feature-encoding schemes. MultiLyGAN can be found at https://github.com/Lab-Xu/MultiLyGAN.
Conclusions: CWGAN greatly improved the predictive performance in all experiments. Features derived from the CKSAAP, PWM and structural schemes are the most informative and contributed most to the prediction of PTM.
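The abstract reports accuracy (Acc) and Matthews correlation coefficient (MCC) for a seven-class problem but does not say which multiclass form of MCC was used. As a minimal sketch, assuming the common multiclass generalization (the R_K statistic, which reduces to the usual binary MCC for two classes), the two metrics can be computed from true and predicted labels as follows. The function names `accuracy` and `multiclass_mcc` are illustrative and are not taken from the MultiLyGAN code.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass MCC (R_K statistic); an assumed formulation, see lead-in."""
    s = len(y_true)                                    # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))    # correctly predicted samples
    t_counts = Counter(y_true)                         # how often each class truly occurs
    p_counts = Counter(y_pred)                         # how often each class is predicted
    labels = set(t_counts) | set(p_counts)

    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())

    denom = sqrt(cov_tt * cov_pp)
    # Return 0.0 when the score is undefined, e.g. only one class is predicted.
    return cov_tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    y_true = [0, 1, 2, 0, 1, 2]
    y_pred = [0, 1, 2, 0, 2, 2]
    print(accuracy(y_true, y_pred), multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, MCC accounts for how predictions are distributed across all classes, which is presumably why the abstract reports both metrics for an imbalanced dataset.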
Keywords: Post-translational modification; Deep learning; Generative adversarial networks; Random forest
DOI: 10.1186/s12859-021-04101-y
Indexed by: SCI
Language: English
Funding: Natural Science Foundation of China [12071024]; Ministry of Science and Technology of China [2019AAA0105103]
WOS research areas: Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
WOS categories: Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
WOS accession number: WOS:000636449300003
Publisher: BMC
Citation statistics
Times cited: 13 [WOS]
Document type: Journal article
Item identifier: http://119.78.100.204/handle/2XEOYT63/16735
Collection: Institute of Computing Technology, Chinese Academy of Sciences, journal articles (English)
Corresponding author: Xu, Yan
Author affiliations:
1. Univ Sci & Technol Beijing, Dept Informat & Comp Sci, Beijing 100083, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100080, Peoples R China
3. China Elect Technol Grp Corp, Res Inst 15, Beijing 100083, Peoples R China
Recommended citation
GB/T 7714: Yang, Yingxi, Wang, Hui, Li, Wen, et al. Prediction and analysis of multiple protein lysine modified sites based on conditional wasserstein generative adversarial networks[J]. BMC BIOINFORMATICS, 2021, 22(1): 17.
APA: Yang, Yingxi., Wang, Hui., Li, Wen., Wang, Xiaobo., Wei, Shizhao., ... & Xu, Yan. (2021). Prediction and analysis of multiple protein lysine modified sites based on conditional wasserstein generative adversarial networks. BMC BIOINFORMATICS, 22(1), 17.
MLA: Yang, Yingxi, et al. "Prediction and analysis of multiple protein lysine modified sites based on conditional wasserstein generative adversarial networks". BMC BIOINFORMATICS 22.1 (2021): 17.