ParSeek: Accurate cryo-EM particle picking with a deep learning model trained on synthetic data

This paper is a preprint and has not been certified by peer review.


Authors

Qian, J.; Gong, Y.; Liu, F.; Huang, Y.; Guo, G.; Zhu, Y.; Huang, Q.

Abstract

Accurate particle picking from noisy cryo-EM micrographs is essential for high-resolution reconstruction. Current deep learning methods rely on manually annotated data; annotation is labor-intensive and subjective, and it limits particle recall at low signal-to-noise ratio (SNR). Here we introduce ParSeek, an automated picker trained entirely on synthetic data without human annotation. Synthetic micrographs are generated by projecting known 3D structures into realistic background patches that reproduce experimental noise. Across seven public cryo-EM datasets, ParSeek outperformed Topaz and CryoSegNet on four, achieving the highest F1-score (up to 0.82) and reaching 0.63 on a challenging membrane protein dataset. Density maps from ParSeek-picked particles showed cross-correlation coefficients up to 0.995 with the reference and a minimal resolution difference of 0.1 Å. ParSeek also overcame severe orientation bias on an influenza dataset, yielding a reasonable reconstruction. Applied to three experimental datasets (an antibody-antigen complex and two GPCRs), ParSeek enabled reconstructions at 5.0 Å, 4.0 Å, and 2.8 Å, respectively. The 2.8 Å map resolved side-chain densities and ligand flexibility. This study establishes a fully synthetic-data-driven strategy that eliminates manual annotation for training cryo-EM deep-learning models, paving the way for automated, unbiased particle picking.
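
For readers unfamiliar with the F1-score quoted above, the following Python sketch shows one common way such a score can be computed for particle picking. It is an illustration only, not the evaluation procedure used in the paper: the function name `f1_score`, the greedy one-to-one matching, the distance threshold `radius`, and the example coordinates are all assumptions made for this sketch.

```python
import math


def f1_score(picked, reference, radius):
    """Return (F1, precision, recall) for picked vs. reference particle centers.

    Assumption for this sketch: a pick counts as a true positive if it lies
    within `radius` pixels of a reference particle that has not already been
    matched (greedy one-to-one matching, nearest reference first).
    """
    unmatched = list(reference)
    true_positives = 0
    for px, py in picked:
        best_index, best_distance = None, radius
        # Look for the nearest still-unmatched reference particle in range.
        for i, (rx, ry) in enumerate(unmatched):
            distance = math.hypot(px - rx, py - ry)
            if distance <= best_distance:
                best_index, best_distance = i, distance
        if best_index is not None:
            unmatched.pop(best_index)
            true_positives += 1

    precision = true_positives / len(picked) if picked else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    f1 = 2 * precision * recall / (precision + recall)
    return f1, precision, recall


# Hypothetical coordinates (pixels), made up for illustration.
picked = [(10, 10), (52, 49), (200, 200)]
reference = [(12, 11), (50, 50), (120, 80), (300, 300)]
f1, precision, recall = f1_score(picked, reference, radius=5)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

In this toy case two of the three picks match a reference particle, so precision is 2/3, recall is 2/4, and F1 is their harmonic mean, 4/7. The harmonic mean penalizes a picker that trades recall for precision or the reverse, which is why it is a natural single-number summary when recall at low SNR is the concern.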
