EffectorGeneP: accurate gene annotation in pathogen genomes from infection transcriptomes

This paper is a preprint and has not been certified by peer review.

Authors

Sperschneider, J.; Langlands-Perry, C.; Chen, J.; Lubega, J.; Arndell, T.; Lewis, D.; Henningsen, E.; Blundell, C.; Vanhercke, T.; Kanyuka, K.; Figueroa, M.; Dodds, P.

Abstract

Accurate gene annotation is crucial for inference of biological knowledge from genomes. However, non-canonical genes such as orphan or single-exon genes as well as those residing in rapidly evolving regions are routinely dismissed in annotation pipelines. In filamentous pathogen genomes, this disproportionately affects the annotation of genes encoding disease-promoting effector proteins. We introduce EffectorGeneP, a machine learning tool that self-trains on transcript data, predicts the most likely coding sequence from transcripts and effectively separates bona fide genes from transcriptional noise. EffectorGeneP annotates over 95% of known effectors correctly, while other state-of-the-art methods annotate 15%-78%. We show that EffectorGeneP expands the predicted secretome of pathogens by over 50% and that high-throughput screening of an effector library in plant protoplasts uncovers the previously poorly annotated AvrSr26 gene family in the wheat stem rust fungus. EffectorGeneP decodes genomes at unprecedented resolution and will enable the study of biological processes in important pathogen species.
