Poster

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Bingrui Li · Wei Huang · Andi Han · Zhanpeng Zhou · Taiji Suzuki · Jun Zhu · Jianfei Chen
2025 Poster


