Abstract:

Algorithms for training denoising diffusion and flow matching models on data are now very scalable. However, reward-driven settings, in which we have no target data examples and must train from scalar reward signals alone, remain much more difficult to scale than their data-driven counterparts.
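To make the contrast concrete, here is a minimal sketch of the two settings in PyTorch. All names here (`velocity_net`, `reward_fn`) are hypothetical, and the REINFORCE-style update is only a naive, high-variance stand-in, not the method presented in the talk: in the data-driven case we regress against target samples, while in the reward-driven case we can only simulate the model and query a scalar reward on its outputs.

```python
# A minimal, hypothetical sketch of the two settings. All names
# (velocity_net, reward_fn, ...) are illustrative, not from the talk.
import torch
import torch.nn as nn

velocity_net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(velocity_net.parameters(), lr=1e-3)


def data_driven_step(x1):
    """Flow matching: regress the velocity field toward (x1 - x0) along a
    linear interpolation path. Requires samples x1 from the target data."""
    x0 = torch.randn_like(x1)                        # source noise
    t = torch.rand(x1.shape[0], 1)                   # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1                       # interpolant
    pred = velocity_net(torch.cat([xt, t], dim=-1))
    loss = ((pred - (x1 - x0)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()


def reward_driven_step(reward_fn, batch=64, steps=8):
    """Reward-only setting: no target data. Simulate the model's SDE,
    score samples with a scalar reward, and update with a REINFORCE-style
    estimator -- a naive stand-in for the stochastic-optimal-control
    methods discussed in the talk."""
    dt = 1.0 / steps
    x = torch.randn(batch, 2)
    logp = torch.zeros(batch)
    for i in range(steps):
        t = torch.full((batch, 1), i * dt)
        v = velocity_net(torch.cat([x, t], dim=-1))
        x_next = (x + v * dt + dt ** 0.5 * torch.randn_like(x)).detach()
        # log N(x_next; x + v dt, dt I), up to an additive constant
        logp = logp - ((x_next - x - v * dt) ** 2).sum(-1) / (2 * dt)
        x = x_next
    r = reward_fn(x)                                 # scalar reward per sample
    loss = -((r - r.mean()).detach() * logp).mean()  # score-function gradient
    opt.zero_grad(); loss.backward(); opt.step()
    return r.mean().item()
```

Here `reward_fn` can be any black-box scorer returning a tensor of shape `(batch,)`, which is precisely why this setting is hard: the reward gives no regression target, and naive score-function estimators like the one above scale poorly, motivating the control-based approaches the talk covers.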

In this talk, I will provide an in-depth, but hopefully also friendly, introduction to the theory of Stochastic Optimal Control, which has so far not been widely adopted at scale due to prohibitive computational costs. I will then explore its application to large-scale generative modeling through two of our recent works, Adjoint Matching and Adjoint Sampling, which introduce new perspectives on solving stochastic control problems with an emphasis on reliability and scalability.
