Secondary structure distances reveal a new dimension of protein evolution
Bastida, A.; Muñoz Morales, A. M.; Egea-Cortines, M.
Abstract
Molecular phylogenetics based on primary sequence comparisons has been central to reconstructing protein evolution. However, structural evolution does not necessarily parallel sequence divergence, particularly in proteins that combine ordered domains with intrinsically disordered regions (IDRs). Here, we introduce a quantitative secondary structure distance (S2D) metric that enables systematic comparison of protein secondary structure, including both ordered elements and IDRs. Using the MADS-box transcription factor family as a model, we show that structural divergence is domain-specific and only partially coupled to sequence-based phylogeny. Domain-resolved analyses reveal that the DNA-binding M domain remains structurally constrained, whereas the I and C domains exhibit extensive sequence divergence while retaining conserved intrinsic disorder. In contrast, the K domain contributes disproportionately to global structural variability. Integrating S2D with phylogenetic distance uncovers both convergent structural architectures among distantly related proteins and pronounced structural remodelling within closely related paralogs, patterns that are not evident from primary sequence comparisons alone. Residue-level analyses further demonstrate that the structural impact of a mutation depends strongly on amino acid identity and does not scale directly with substitution frequency or conservation metrics. Together, these findings indicate that secondary structural evolution provides an additional dimension of protein diversification beyond sequence divergence. By integrating phylogenetic and structural distances, this framework offers a complementary approach to interpreting protein evolution, particularly in families containing mixtures of ordered domains and IDRs.