

Poster

Weight Decay may matter more than µP for Learning Rate Transfer in Practice

Atli Kosson · Jeremy Welborn · Yang Liu · Martin Jaggi · Xi Chen

Abstract
