Towards a species search engine: KISSE offers a rigorous statistical framework for bone collagen tandem mass spectrometry data
Gharibi, H.; Saei, A. A.; Chernobrovkin, A. L.; Lundstrom, S. L.; Lyu, H.; Meng, Z.; Vegvari, A.; Gaetani, M.; Zubarev, R. A.
Abstract
DNA and bone collagen are two key sources of resilient molecular markers used to identify species from their remains. Collagen is more stable than DNA and is therefore preferred for ancient and degraded samples. Current mass spectrometry-based collagen sequencing approaches are empirical and lack a rigorous statistical framework. Building on the well-developed approaches to protein identification in shotgun proteomics, we introduce a first approximation of a species search engine (SSE). Our SSE, named KISSE, is based on a species-specific library of collagenous peptides and uses both peptide sequences and their relative abundances. The statistical model we developed can identify the species, estimate the probability that the identification is correct, and determine the likelihood that the analyzed species is not in the library. We discuss the advantages and limitations of the proposed approach and the possibility of extending it to other tissues.