

Poster

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Taishi Nakamura · Takuya Akiba · Kazuki Fujii · Yusuke Oda · Rio Yokota · Jun Suzuki
2025 Poster
