Poster

Small-scale proxies for large-scale Transformer training instabilities

Mitchell Wortsman · Peter Liu · Lechao Xiao · Katie Everett · Alexander Alemi · Ben Adlam · John Co-Reyes · Izzeddin Gur · Abhishek Kumar · Roman Novak · Jeffrey Pennington · Jascha Sohl-Dickstein · Kelvin Xu · Jaehoon Lee · Justin Gilmer · Simon Kornblith
2024 Poster
