Poster at the Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025

Generalized Representation Learning for Multimodal Histology Imaging Data Through Vision-Language Modeling

Jacob Leiby · Alexandro Trevino · Aaron Mayer · Zhenqin Wu · Dokyoon Kim · Zhi Huang


Abstract:

We introduce a trimodal vision-language framework that unifies multiplexed spatial proteomics (SP), H&E histology, and textual metadata in a single embedding space. A specialized transformer-based SP encoder, alongside pretrained H&E and language models, captures diverse morphological, molecular, and semantic signals. Preliminary results demonstrate improved retrieval, zero-shot classification, and patient-level phenotype predictions, indicating the promise of this multimodal approach for deeper insights and translational applications in digital pathology.
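The abstract does not specify the alignment objective, but a common way to unify multiple modalities in a single embedding space is a CLIP-style symmetric contrastive loss applied over each modality pair. The sketch below is a hypothetical illustration of that idea, assuming paired batches from the three modalities (spatial proteomics, H&E, and text), simple linear projection heads, and made-up encoder output dimensions; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1, eps=1e-8):
    # Unit-normalize rows so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def project(x, W):
    # Linear projection head into the shared embedding space.
    return l2_normalize(x @ W)

def clip_style_loss(a, b, temperature=0.07):
    # Symmetric InfoNCE over one modality pair: matched samples
    # (the diagonal) are positives, all other pairs are negatives.
    logits = (a @ b.T) / temperature
    idx = np.arange(len(a))
    def ce(l):
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()
    return 0.5 * (ce(logits) + ce(logits.T))

# Toy batch of 8 paired samples; encoder output widths are assumptions.
n, d_shared = 8, 32
sp  = rng.normal(size=(n, 64))   # spatial proteomics encoder output
he  = rng.normal(size=(n, 48))   # H&E encoder output
txt = rng.normal(size=(n, 96))   # text encoder output

W_sp, W_he, W_txt = (rng.normal(size=(d, d_shared)) * 0.1
                     for d in (64, 48, 96))
z_sp  = project(sp, W_sp)
z_he  = project(he, W_he)
z_txt = project(txt, W_txt)

# Align all three modalities by summing the pairwise contrastive losses.
loss = (clip_style_loss(z_sp, z_he)
        + clip_style_loss(z_sp, z_txt)
        + clip_style_loss(z_he, z_txt))
```

Once trained, the shared space supports the retrieval and zero-shot classification uses mentioned above: a query embedding from one modality is compared by cosine similarity against candidate embeddings from another.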
