High-Throughput Machine Learning-Aided Antibody Discovery for Cell Surface Antigens

This paper is a preprint and has not been certified by peer review.

Authors

Kothiwal, D.; Kollasch, A. W.; Hollmer, N.; Ghosh, A.; Zhang, R.; Anuganti, M.; Paul, S. B.; Zagar, Y.; Abdollahi, M.; Anderson, Z.; Belay, F.; Salotto, M.; Ulmer, S.; AbdelAlim, Y. A.; Kumar, S.; Vangala, M.; Yang, C.; Chedotal, A.; Jardine, J. G.; Teixeira, A. A. R.; Moshinsky, D. J.; Zhu, H.; Zhu, S.; Springer, T. A.; Marks, D. S.; Meijers, R.

Abstract

Machine learning (ML) has the potential to revolutionize antibody design and selection, but its success depends on access to extensive, well-curated datasets of antibody-antigen interactions. To address this need, we developed a synthetic Fab yeast display library optimized for seamless ML integration, focusing on sequence diversity within the CDRH3 loop. The library incorporates key sequence features derived from human B cell repertoires that are essential for efficient antibody generation, captured in a compact antigen recognition module (ARM) format. Built using the VH1-69 heavy chain and four light chains, the library was evaluated against ten human and murine cell surface antigens, including PD-L1, TIGIT, and ROBO1. This approach yielded hundreds of antibodies with robust biophysical properties, validated for functional performance in flow cytometry and immunohistochemistry. Furthermore, ML analysis identified additional antibodies for ROBO2 and PD-L2 from the aggregate sequencing data, demonstrating utility for hybrid in silico and experimental workflows. We provide a publicly accessible dataset comprising more than 68,000 Fab sequences and 486 characterized antibodies. This study establishes an ML-compatible framework designed to accelerate and streamline antibody discovery and development.
