

Poster in Workshop: Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025

Flexible Models of Functional Annotations to Variant Effects using Accelerated Linear Algebra

Alan Amin · Andres Potapczynski · Andrew Gordon Wilson


Abstract:

To predict and understand the causes of disease, geneticists build models that predict how a genetic variant impacts phenotype from genomic features. There is a vast amount of data available from the large projects that have sequenced hundreds of thousands of genomes; yet state-of-the-art models, like LD score regression, cannot leverage this data because they rely on simplifying assumptions that limit their flexibility. These assumptions exist to avoid solving the large linear algebra problems introduced by the genomic correlation matrices. In this paper, we leverage modern fast linear algebra techniques to develop WASP (genome Wide Association Studies with Preconditioned iteration), a method to train large and flexible neural network models. On semi-synthetic and real data we show that WASP better predicts phenotype and better recovers its functional causes compared to LD score regression. Finally, we show that training larger WASP models on larger data leads to better explanations of phenotypes.
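
For readers unfamiliar with the term, the following is a minimal sketch of what "preconditioned iteration" can look like for a large symmetric positive-definite system, here a generic preconditioned conjugate gradient solver. It is an illustration under stated assumptions and not the WASP implementation: the abstract does not specify the solver or the preconditioner, and the function names (`pcg`, `matvec`, `jacobi`), the Jacobi (diagonal) preconditioner, and the 3×3 toy matrix are all hypothetical choices made for this example.

```python
# Illustrative sketch only: generic preconditioned conjugate gradient (PCG).
# This is NOT the WASP method; names and the toy matrix are hypothetical.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(matvec, b, precond, tol=1e-10, max_iter=1000):
    """Approximately solve A x = b for symmetric positive-definite A.

    matvec(v)  -> A v        (A is only accessed through products)
    precond(r) -> M^{-1} r   (a cheap approximation of A^{-1} r)
    """
    x = [0.0] * len(b)
    r = list(b)                 # residual b - A x for the start x = 0
    z = precond(r)
    p = list(z)                 # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        # Stop once the relative residual is small enough.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                        # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz                             # direction update
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Toy stand-in for a correlation-like matrix (hypothetical values).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]

def matvec(v):
    return [dot(row, v) for row in A]

def jacobi(r):
    # Diagonal (Jacobi) preconditioner: divide by the diagonal of A.
    return [ri / A[i][i] for i, ri in enumerate(r)]

b = [1.0, 2.0, 3.0]
x = pcg(matvec, b, jacobi)
residual = [bi - ai for bi, ai in zip(b, matvec(x))]
```

The solver touches the matrix only through `matvec`, which is what makes iterative methods of this kind attractive when the matrix is too large to factorize directly; a good preconditioner reduces the number of iterations needed.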
