Machine Learning-Driven Identification of Virulence Determinants in Borrelia burgdorferi Associated with Human Dissemination

This paper is a preprint and has not been certified by peer review.

Authors

Nguyen, H. T.; Brissette, C. A.

Abstract

Lyme disease, the most common tick-borne infectious disease in the United States, presents with highly variable clinical outcomes, ranging from localized erythema migrans to severe disseminated complications affecting the heart, joints, and nervous system. The bacterial determinants underlying this phenotypic variation remain largely unknown, limiting our ability to predict disease progression and optimize treatment strategies. Here, we applied machine learning (ML) approaches to identify specific amino acid residues within surface-exposed virulence factors that predict human dissemination phenotypes. Using whole-genome sequences from 299 clinical Borrelia burgdorferi (Bb) isolates, we extracted and characterized variants of seven known virulence factors (BB_0406, BBK32, DbpA, OspA, OspC, P66, and RevA). Protein variants were classified based on their association with disseminated versus localized infections using clinical metadata. Cramér's V analysis revealed strong associations between dissemination phenotypes and five adhesins: BBK32, DbpA, OspC, P66, and RevA. We developed ML models using five algorithms with multiple feature selection strategies, achieving robust predictive performance for DbpA, OspC, and RevA variants (all performance metrics >0.7). Feature importance analysis identified key predictive amino acid residues for DbpA, OspC, and RevA. Notably, B-cell epitope prediction revealed significant enrichment of ML-identified residues within predicted epitope regions for OspC and RevA, suggesting that these residues may influence immune recognition and bacterial persistence. This study establishes the first computational framework linking Borrelia burgdorferi protein sequence variants to clinical dissemination phenotypes, providing molecular insights into Lyme disease pathogenesis that may inform the development of improved diagnostics and therapeutic targets.
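
For readers unfamiliar with the association statistic named in the abstract, the following is a minimal sketch of how Cramér's V can be computed between a categorical protein-variant label and a dissemination phenotype. It is not the authors' code: the function name `cramers_v`, the toy variant labels, and the phenotype labels are all hypothetical, and the sketch uses the plain (uncorrected) form of the statistic, which may differ from what the study used.

```python
from collections import Counter
from math import sqrt


def cramers_v(variants, phenotypes):
    """Uncorrected Cramér's V between two equal-length categorical sequences.

    V = sqrt(chi2 / (n * (k - 1))), where chi2 is the Pearson chi-squared
    statistic of the contingency table, n is the number of observations,
    and k is the smaller of the number of row and column categories.
    """
    if len(variants) != len(phenotypes):
        raise ValueError("inputs must have the same length")

    n = len(variants)
    row_totals = Counter(variants)      # isolates per variant
    col_totals = Counter(phenotypes)    # isolates per phenotype
    observed = Counter(zip(variants, phenotypes))

    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # association is undefined with fewer than two categories

    chi2 = 0.0
    for variant, variant_n in row_totals.items():
        for phenotype, phenotype_n in col_totals.items():
            expected = variant_n * phenotype_n / n
            chi2 += (observed[(variant, phenotype)] - expected) ** 2 / expected

    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy data: one variant label and one phenotype per isolate.
toy_variants = ["v1", "v1", "v1", "v2", "v2", "v3"]
toy_phenotypes = ["disseminated", "disseminated", "localized",
                  "localized", "localized", "disseminated"]
print(cramers_v(toy_variants, toy_phenotypes))
```

The statistic ranges from 0 (no association) to 1 (the variant label fully determines the phenotype). With small samples the uncorrected form tends to overestimate the association, so a bias-corrected variant may be preferable in practice.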
