Description: Peptide mass fingerprinting (PMF) was the first automated method for protein identification in proteomics, and it remains in common usage today because of its simplicity and the low equipment costs for generating fingerprints. However, one of the problems with PMF is its limited specificity and sensitivity in protein identification. Here I present a method that shows potential to significantly enhance the accuracy of peptide mass fingerprinting, using a machine learning approach based on a hidden Markov model (HMM). This method is applied to improve differentiation of real protein matches from those that occur by chance. The system was trained using 300 examples of combined real and false-positive protein identification results, and 10-fold cross-validation applied to assess model discrimination. The model can achieve 93% accuracy in distinguishing correct and real protein identification results versus false-positive matches. The receiver operating characteristic (ROC) curve area for the best model was 0.833.
Date: December 2004
Creator: Yang, Dongmei