Poster in Workshop: Blog Track Poster Session

Decay No More

Fabian Schaipp


Abstract:

Weight decay is among the most important hyperparameters for reaching high accuracy with large-scale machine learning models. In this blog post, we revisit AdamW, the decoupled weight decay version of Adam, summarizing empirical findings as well as theoretical motivations from an optimization perspective.
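To make the distinction concrete, here is a minimal NumPy sketch of a single AdamW step. The key point is that the weight-decay term is applied directly to the weights, decoupled from the adaptive gradient update (unlike plain L2 regularization added to the gradient). Function name and hyperparameter defaults are illustrative, not taken from the post.

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update on weights w given gradient g at step t >= 1.

    The decay term is decoupled: it shrinks w directly instead of being
    folded into the gradient (as L2 regularization would be in Adam).
    """
    m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * g ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Adaptive gradient step plus decoupled weight decay.
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) - lr * weight_decay * w
    return w, m, v
```

With a zero gradient the adaptive term vanishes, so one step simply shrinks the weights by a factor of (1 - lr * weight_decay), which is exactly the decoupled-decay behavior.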
