Francesca Cuturello
Title: Evolution guides Protein Language Models for predicting stability variations upon single mutations
Protein language models effectively address biological challenges in structure prediction using sequence information alone. Recent work investigates their application to predicting the thermodynamic stability changes induced by single amino acid mutations, a notoriously difficult task because experimental constraints limit the size and variability of the available datasets. In this study, we introduce a strategy for predicting stability changes based on fine-tuning several pre-trained protein language models on a recently released mega-scale dataset. Our findings reveal that the MSA Transformer, which explicitly leverages the co-evolution signal encoded in homologous sequences, surpasses existing methods and shows improved generalization. To prevent overfitting, we define a stringent filtering pipeline to curate the datasets used for training and testing the models. Our results highlight the robustness of our approach and its potential role in future developments in protein stability prediction that do not rely on structural information.