Functional Profiling of Thousands of Sequence-Diverse Protease Homologs with GROQ-seq

This paper is a preprint and has not been certified by peer review.


Authors

McLellan, J. R.; Ikonomova, S. P.; Sreenivasan, S.; Amin, A. N.; Baranowski, C.; Apel, A. R.; Kelly, P.; Ross, D.; Spinner, A.

Abstract

High-quality datasets that span broad sequence diversity are essential for understanding protein sequence-function relationships beyond local mutational landscapes. Here, we applied Growth-based Quantitative Sequencing (GROQ-seq) to measure function across an 11,722-member protease library comprising natural homologs and AI-shrunken variants. The library spans vast sequence diversity, with Levenshtein distances of up to 245 and a mean pairwise sequence identity of 41% to TEV protease S219V. We identified sequence-divergent TEV protease homologs that preserve function against the native TEV protease substrate. These findings reveal the robustness of protease activity across highly diverse sequences. They also demonstrate the suitability of the GROQ-seq assay for screening large, diverse protein libraries for function, enabling efficient data generation at scale for training machine learning models across broad sequence landscapes.
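
The Levenshtein distance cited in the abstract counts the minimum number of single-residue insertions, deletions, or substitutions needed to turn one sequence into another. The following is a minimal sketch of that metric only, not the authors' analysis pipeline; the function name `levenshtein` and the two toy peptide strings are hypothetical and are not taken from the library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn sequence a into sequence b."""
    # Keep the shorter sequence in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the first i-1 residues of a
    # and the first j residues of b.
    previous = list(range(len(b) + 1))
    for i, res_a in enumerate(a, start=1):
        current = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            current.append(min(
                previous[j] + 1,         # deletion from a
                current[j - 1] + 1,      # insertion into a
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


# Toy sequences for illustration only (hypothetical, not real homologs).
reference = "MKTAYIAKQR"
variant = "MKTAYLAKR"
print(levenshtein(reference, variant))
```

Under this definition, a distance of 245 means that at least 245 single-residue edits separate two sequences, far more than the handful of substitutions typical of a local mutational scan.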
