

Oral

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Victor Weixin Liang · Lili Yu · Liang Luo · Srini Iyer · Ning Dong · Chunting Zhou · Gargi Ghosh · Mike Lewis · Luke Zettlemoyer · Victoria Lin

Keywords: [ metric learning ] [ dimensionality reduction ] [ distance learning ] [ bound guarantees ]

2025 Oral

Abstract:
