Oral
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Victor Weixin Liang · Lili Yu · Liang Luo · Srini Iyer · Ning Dong · Chunting Zhou · Gargi Ghosh · Mike Lewis · Luke Zettlemoyer · Victoria Lin
Keywords: [ metric learning ] [ dimensionality reduction ] [ distance learning ] [ bound guarantees ]
Abstract: