Bayesian protein inference algorithm

MSBayesPro is a Bayesian protein inference algorithm for LC-MS/MS proteomics experiments. It is the proof-of-principle implementation of the Bayesian protein inference algorithm published in RECOMB 2008. It is available for the following platforms: Win32, Linux32, and Linux64. Also provided are sample data 1 (detectability file, peptide identification), sample data 2 (detectability file, peptide identification), the readme, and the LICENSE for the program. Please read the readme and the reference paper for details of the algorithm and what the program can do.


To test your own dataset, you will need peptide detectability predictions for all peptides of the candidate proteins, whether IDENTIFIED or NOT. With accurate peptide detectabilities, the algorithm identifies proteins with good performance and can, to some extent, distinguish tie-proteins (proteins sharing the same set of identified peptides). To achieve the desired performance, peptide detectability should be predicted using a model trained on the same experimental platform and protocol. To obtain such an experiment-specific model, you may use the DQmodel program. Alternatively, you can try the peptide detectability predictor from here, which is based on a neural network model trained on a small sample of 12 proteins; however, be aware that this older predictor will NOT lead to optimal results for protein inference.
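
The following toy calculation illustrates why detectabilities are needed for unidentified peptides as well as identified ones. It is only a sketch and not the MSBayesPro model: it assumes that each peptide of a present protein is identified independently with probability equal to its predicted detectability. The protein names, peptide names, and detectability values are invented for illustration. Two hypothetical tie-proteins share the same identified peptides, but one of them also contains highly detectable peptides that were not observed, which makes the observed pattern much less likely under that protein.

```python
from math import prod


def pattern_likelihood(detectability, identified):
    """Likelihood of the observed identification pattern if the protein is present.

    Toy assumption (not the MSBayesPro model): each peptide is identified
    independently with probability equal to its predicted detectability.
    """
    return prod(
        d if peptide in identified else 1.0 - d
        for peptide, d in detectability.items()
    )


# Hypothetical detectabilities for ALL peptides of each candidate protein,
# identified or not.
protein_a = {"PEP1": 0.9, "PEP2": 0.8}
protein_b = {"PEP1": 0.9, "PEP2": 0.8, "PEP3": 0.95, "PEP4": 0.85}

# Both proteins explain the same set of identified peptides (a tie).
identified = {"PEP1", "PEP2"}

like_a = pattern_likelihood(protein_a, identified)
like_b = pattern_likelihood(protein_b, identified)

# Protein B should have produced PEP3 and PEP4 with high probability,
# so the observed pattern favors protein A.
print(f"protein A: {like_a:.4f}, protein B: {like_b:.4f}")
```

Under these assumed values the unobserved but highly detectable peptides of protein B penalize it, which is the kind of evidence that can separate tie-proteins when detectabilities are accurate.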



Yong Fuga Li, Randy J. Arnold, Haixu Tang, and Predrag Radivojac (2010). The importance of peptide detectability for protein identification, quantification, and experiment design in MS/MS proteomics. Journal of Proteome Research 9(12): 6288–6297. PDF

The paper demonstrated the extension of peptide detectability to complex biological samples and proposed novel algorithms for improved detectability prediction. Among the various applications, the paper showed that improving peptide detectability prediction led to improved Bayesian protein inference.

Yong Fuga Li, Randy Arnold, Yixue Li, Predrag Radivojac, Quanhu Sheng, and Haixu Tang (2009). A Bayesian approach to protein inference problem in shotgun proteomics. J Comput Biol 16(8): 1-11. PDF

This paper is an extension of the RECOMB paper. It further demonstrated that posterior peptide probabilities (which leverage the power of protein inference) lead to better peptide identifications.

Yong Fuga Li, Randy Arnold, Yixue Li, Predrag Radivojac, Quanhu Sheng, and Haixu Tang (2008). A Bayesian approach to protein inference problem in shotgun proteomics. RECOMB 2008; LNBI 4955, pp. 167-180. PDF

The paper demonstrated variants of protein inference models based on peptide identification and detectability, and concluded that using the likelihood ratios (NOT probabilities) of peptide identifications together with predicted detectabilities leads to the best protein inference outcomes.

H. Tang, R. J. Arnold, P. Alves, Z. Xun, D. E. Clemmer, M. V. Novotny, J. P. Reilly and P. Radivojac, A computational approach toward label-free protein quantification using predicted peptide detectability. ISMB (Supplement of Bioinformatics) 2006: 481-488


               Last modified: 12/04/2010, Questions?