

Poster in Workshop: Integrating Generative and Experimental Platforms for Biomolecular Design

Towards Protein Sequence & Structure Co-Design with Multi-Modal Language Models

Stephen Lu · Jiarui Lu · Hongyu Guo · Jian Tang


Abstract:

Proteins perform diverse biological functions, governed by the intricate relationship between their sequence and three-dimensional structure. While protein language models (PLMs) have demonstrated remarkable success in functional annotation and structure prediction, their potential for sequence-structure co-design remains underexplored. This limitation arises from pre-training objectives that favor masked token prediction over generative modeling. In this work, we systematically explore sampling strategies to enhance the generative capabilities of PLMs for co-design. We introduce a ranked iterative decoding scheme with re-masking that enables PLMs to generate sequences and structures more effectively. Benchmarking ESM3 at multiple scales, we demonstrate that using PLMs effectively at sampling time can outperform specialized co-design architectures that lack comparable scaling properties. Our work advances computational protein design by equipping PLMs with robust generative capabilities tailored to sequence-structure interdependence.
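The abstract does not spell out the decoding procedure, so the following is only a minimal sketch of one plausible reading of ranked iterative decoding with re-masking, not the authors' method. It assumes a confidence-ranked schedule in which every masked position is filled at each step, all positions are ranked by confidence, and the lowest-ranked ones (including previously filled ones) are masked again. The names `toy_model`, `ranked_decode_with_remasking`, and the linear keep schedule are hypothetical, and `toy_model` is a random stand-in for a masked language model, not ESM3.

```python
import random

MASK = "_"
VOCAB = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_model(tokens, rng):
    """Hypothetical stand-in for a masked language model (NOT ESM3).

    Returns one (token, confidence) proposal per position, drawn at random.
    """
    proposals = []
    for _ in tokens:
        weights = [rng.random() for _ in VOCAB]
        best = max(range(len(VOCAB)), key=weights.__getitem__)
        proposals.append((VOCAB[best], weights[best] / sum(weights)))
    return proposals


def ranked_decode_with_remasking(length, steps, seed=0):
    """Assumed scheme: fill, rank by confidence, re-mask the low-ranked rest."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    conf = [0.0] * length
    for step in range(1, steps + 1):
        proposals = toy_model(tokens, rng)
        # Fill every currently masked position with the model's proposal.
        for i, (tok, c) in enumerate(proposals):
            if tokens[i] == MASK:
                tokens[i], conf[i] = tok, c
        if step == steps:
            break  # final step: keep every token
        # Assumed linear schedule: number of tokens kept grows with the step.
        keep = (length * step) // steps
        ranked = sorted(range(length), key=lambda i: conf[i], reverse=True)
        # Re-mask everything below the cut, including earlier commitments.
        for i in ranked[keep:]:
            tokens[i], conf[i] = MASK, 0.0
    return "".join(tokens)


if __name__ == "__main__":
    print(ranked_decode_with_remasking(length=12, steps=4))
```

In a real setting the random proposals would be replaced by the PLM's per-position predictions over sequence and structure tokens, and the ranking criterion and schedule would follow whatever the paper specifies.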
