

Poster in Workshop: Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025

Mutagenic: An Embedding-Based Approach to Protein Masking for Functional Redesign

Robin Pan · Richard Zhu · Vihan Lakshman · Fiona Qu


Abstract: Recent advances in language models have been applied to protein sequences, motivated by the critical functions of proteins in biological processes and by the availability of large datasets. Protein engineering has already proven impactful in areas such as therapeutics, agriculture, the environment, and bio-manufacturing. Motivated by the challenge of protein design, this paper investigates the following question: How can we efficiently identify which residues to edit when engineering proteins with specific target functions? We propose a novel embedding-based masking approach for editing a given protein to achieve a new target function. More formally, let $F = \{f_1, f_2, \dots, f_n\}$ denote the set of possible protein functions. Given a protein sequence $s = s_1 s_2 \dots s_N$ composed of amino acids $\{s_i\}_{i=1}^N$ with function $f \in F$, and a target function $f^\prime \in F$, our goal is to return a new protein sequence $s^\prime$ with function $f^\prime$.
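
The abstract does not say how embeddings are used to choose which residues to mask, so the following is only an illustrative sketch of one possible reading, not the authors' method. It assumes that each residue $s_i$ already has an embedding vector (for example from a protein language model) and that a reference vector for the target function $f^\prime$ is available; the names `select_mask_positions`, `apply_mask`, `target_centroid`, and the `<mask>` token are all hypothetical. Residues whose embeddings are farthest (by cosine distance) from the target reference are masked, leaving a template that a masked language model could then fill in to propose $s^\prime$.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def select_mask_positions(residue_embeddings, target_centroid, k):
    """Pick the k residue indices whose embeddings are farthest from the
    target-function reference vector (an assumed selection rule)."""
    scored = [
        (cosine_distance(emb, target_centroid), i)
        for i, emb in enumerate(residue_embeddings)
    ]
    # Largest distance first; ties broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(i for _, i in scored[:k])


def apply_mask(sequence, positions, mask_token="<mask>"):
    """Replace the residues at the given positions with a mask token."""
    masked = set(positions)
    return [mask_token if i in masked else aa for i, aa in enumerate(sequence)]


# Toy example with made-up 2-D embeddings for a 4-residue sequence.
sequence = "MKTA"
residue_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
target_centroid = [1.0, 0.0]

positions = select_mask_positions(residue_embeddings, target_centroid, k=2)
masked_sequence = apply_mask(sequence, positions)
print(positions)
print(masked_sequence)
```

In this toy setup the number of masked residues `k` is a free parameter; how many positions to edit, and which distance measure to use, are design choices the abstract leaves open.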
