Speeding up tandem mass spectrometry-based database searching by longest common prefix

doi:10.1186/1471-2105-11-577

	Zhou,Chen 1,2,3; Chi,Hao 1,2,3; Wang,Le-Heng 1,2; Li,You 1,2; Wu,Yan-Jie 1,2,3; Fu,Yan 1,2; Sun,Rui-Xiang 1,2; He,Si-Min 1,2
	2010-11-25
Journal	BMC Bioinformatics
ISSN	1471-2105
Volume	11 Issue: 1
Abstract	Background: Tandem mass spectrometry-based database searching has become an important technology for peptide and protein identification. One of the key challenges in database searching is the sharp increase in computational demand brought about by the expansion of protein databases, semi- or non-specific enzymatic digestion, post-translational modifications and other factors. Some software tools use peptide indexing to accelerate processing. However, a peptide index takes a large amount of time and space to construct, especially for non-specific digestion, and it is inflexible to use. Results: We developed an algorithm based on the longest common prefix (ABLCP) to efficiently organize a protein sequence database. The longest common prefix is a data structure that is always coupled to the suffix array. It eliminates redundant candidate peptides in databases and reduces the corresponding number of peptide-spectrum matches, thereby decreasing the identification time. The algorithm relies on the property of the longest common prefix. Enzymatic digestion poses a challenge to this property, but the algorithm can be adjusted to ensure that no candidate peptides are omitted. Compared with peptide indexing, ABLCP requires much less time and space for construction and is subject to fewer restrictions. Conclusions: The ABLCP algorithm can help to improve data analysis efficiency. A software tool implementing this algorithm is available at http://pfind.ict.ac.cn/pfind2dot5/index.htm
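The idea described in the abstract, using the longest common prefix between neighbouring suffixes to skip candidate peptides that have already been seen, can be illustrated with a minimal Python sketch. This is not the authors' ABLCP implementation. It assumes non-specific digestion only (every substring within a length window is a candidate), builds the suffix array naively, and uses hypothetical names (`build_suffix_array`, `build_lcp`, `unique_peptides`, the `$` separator). The adjustments for enzymatic digestion mentioned in the abstract are not covered.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for a small illustration, far too slow for a real protein database.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp(text, sa):
    # lcp[r] = length of the longest common prefix of the suffixes
    # at ranks r-1 and r in the suffix array (lcp[0] is 0).
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = sa[r - 1], sa[r]
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[r] = n
    return lcp


def unique_peptides(proteins, min_len, max_len, sep="$"):
    # Assumption: `sep` does not occur in any protein sequence.
    text = sep.join(proteins) + sep
    sa = build_suffix_array(text)
    lcp = build_lcp(text, sa)
    peptides = []
    for rank, start in enumerate(sa):
        # A candidate peptide may not run across a protein boundary.
        protein_end = text.index(sep, start)
        longest = min(max_len, protein_end - start)
        # Prefixes no longer than lcp[rank] are shared with the previous
        # suffix, so they are redundant and skipped here.
        shortest = max(min_len, lcp[rank] + 1)
        for length in range(shortest, longest + 1):
            peptides.append(text[start:start + length])
    return peptides


if __name__ == "__main__":
    for peptide in unique_peptides(["PEPTIDEK", "TIDEKPEP"], 3, 4):
        print(peptide)
```

Each distinct candidate is produced once, so a peptide shared by several proteins (or repeated within one) would be scored against a spectrum only once in a search built on top of this enumeration.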
DOI	10.1186/1471-2105-11-577
Language	English
WOS Accession Number	BMC:10.1186/1471-2105-11-577
Publisher	BioMed Central
Document Type	Journal article
Item Identifier	http://119.78.100.204/handle/2XEOYT63/3967
Collection	Journal Articles of the Institute of Computing Technology, Chinese Academy of Sciences (English)
Corresponding Author	He,Si-Min
Affiliations	1.Chinese Academy of Sciences; Key Lab of Intelligent Information Processing 2.Chinese Academy of Sciences; Institute of Computing Technology 3.Graduate University of Chinese Academy of Sciences
Recommended Citation GB/T 7714	Zhou,Chen,Chi,Hao,Wang,Le-Heng,et al. Speeding up tandem mass spectrometry-based database searching by longest common prefix[J]. BMC Bioinformatics,2010,11(1).
APA	Zhou,Chen.,Chi,Hao.,Wang,Le-Heng.,Li,You.,Wu,Yan-Jie.,...&He,Si-Min.(2010).Speeding up tandem mass spectrometry-based database searching by longest common prefix.BMC Bioinformatics,11(1).
MLA	Zhou,Chen,et al."Speeding up tandem mass spectrometry-based database searching by longest common prefix".BMC Bioinformatics 11.1(2010).