Poster

SERE: Similarity-based Expert Re-routing for Efficient Batch Decoding in MoE Models

Juntong Wu · Jialiang Cheng · Fuyu Lv · Dan Ou · Li Yuan

Abstract
