A universal taxonomic and functional human gut microbiome model for disease classification and phenotype discovery
Karwowska, Z.; Mozejko, M.; Nowak, W.; Romanchenko, A.; Szczurek, E.; Kosciolek, T.
Abstract
The human gut microbiome is a powerful indicator of host health, yet its compositional nature, high sparsity, and inter-individual variability complicate downstream analysis. Here, we introduce two complementary approaches to characterize gut microbiome structure at population scale. First, we define eight functional signatures of the human gut microbiome using Non-negative Matrix Factorization, revealing coordinated metabolic patterns that partially decouple from taxonomic composition. Second, we present GUT-FORMer, a transformer-based autoencoder that jointly models taxonomic and functional metagenomic profiles from close to 21,000 publicly available samples. The learned latent representations capture biologically meaningful structure, reflect geographic and disease-associated variation, and enable accurate classification of 25 diseases in both binary and multiclass settings, as well as regression of host age and BMI. GUT-FORMer outperforms existing microbiome indices and deep learning methods across all tasks, establishing a generalizable framework for microbiome-based precision medicine.
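To illustrate the first approach, the sketch below factorizes a non-negative samples-by-functions matrix into eight components using the classic multiplicative-update form of non-negative matrix factorization. It is not the authors' pipeline: the abstract does not specify the input features, normalization, or solver, so the synthetic matrix, the function name `nmf_multiplicative`, and every parameter other than the number of components (eight) are assumptions made for illustration.

```python
import numpy as np


def nmf_multiplicative(X, n_components, n_iter=200, seed=0, eps=1e-10):
    """Hypothetical helper: factorize non-negative X (samples x features)
    as W @ H using Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_samples, n_components)) + eps
    H = rng.random((n_components, n_features)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so non-negativity is preserved.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a functional profile table (assumption, not real data):
# 60 samples x 40 functional features, generated from 8 latent signatures.
rng = np.random.default_rng(42)
X = rng.random((60, 8)) @ rng.random((8, 40))

# W holds per-sample signature weights; H holds per-signature feature loadings.
W, H = nmf_multiplicative(X, n_components=8)

# Assign each sample to the signature with the largest weight.
dominant_signature = W.argmax(axis=1)

# Relative reconstruction error of the factorization.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

In this sketch, each row of `H` plays the role of one signature (a weighted set of functional features), and each row of `W` says how strongly a sample expresses each signature. In practice the number of components is typically chosen by comparing reconstruction error or stability across candidate values rather than fixed in advance.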