Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
CellMemory: Hierarchical Interpretation of Out-of-Distribution Cells Using Bottlenecked Transformer
Qifei Wang
Applying machine learning to cellular data presents several challenges. One is making the methods interpretable with respect to both cellular information and its context. Another, less explored, is accurately representing cells that fall outside existing references, referred to as out-of-distribution (OOD) cells. OOD cells arise from physiological conditions (e.g., diseased vs. healthy) or technical variation (e.g., single-cell references vs. spatial queries). Inspired by the Global Workspace Theory in cognitive neuroscience, we introduce CellMemory, a bottlenecked Transformer with improved generalization, designed for the hierarchical interpretation of OOD cells. Even without pre-training, CellMemory outperforms large-scale foundation models pre-trained on tens of millions of cells. Moreover, it robustly characterizes malignant cells and their founder cells across different patients, revealing cellular changes caused by disease. We further propose leveraging CellMemory’s capacity to integrate multiple modalities and phenotypic information, advancing toward the construction of VIRTUAL ORGAN.