

Poster in Workshop: Integrating Generative and Experimental Platforms for Biomolecular Design

Exploring zero-shot structure-based protein fitness prediction

Arnav Sharma · Anthony Gitter


Abstract:

The ability to make zero-shot predictions about the fitness consequences of protein sequence changes with pre-trained machine learning models enables many practical applications. Such models can be applied to downstream tasks like genetic variant interpretation and protein engineering without additional labeled data. The advent of capable protein structure prediction tools has led to the availability of orders of magnitude more precomputed predicted structures, giving rise to powerful structure-based fitness prediction models. Through our experiments, we assess several modeling choices for structure-based models and their effects on downstream fitness prediction. We find that training on predicted structures can negatively affect downstream predictions when using experimental structures, that zero-shot fitness prediction models can struggle to learn the fitness landscapes of proteins with disordered regions (lacking a fixed 3D structure), and that predicted structures for disordered regions can be misleading in this setting and affect predictive performance. Lastly, we evaluate an additional structure-based model on the ProteinGym substitution benchmark and show that simple multi-modal ensembles are strong baselines.
