Blog Track Session 5
David Dobre · Leo Schwinn · Claire Vernade · Charlie Gauthier · Fabian Pedregosa · Gauthier Gidel
Halle B
Schedule
Thu 1:45 a.m. - 3:45 a.m.
Behavioral Differences in Mode-Switching Exploration for Reinforcement Learning (Poster #2)
Poster Location: Halle B #2
The exploration versus exploitation dilemma remains a fundamental challenge of reinforcement learning (RL): an agent must exploit its knowledge of the environment to accrue large returns while also exploring the environment to discover those returns in the first place. The vast majority of deep RL (DRL) algorithms manage this dilemma with a monolithic behavior policy that interleaves exploration actions randomly throughout the more frequent exploitation actions. In 2022, researchers from Google DeepMind presented an initial study on mode-switching exploration, in which an agent separates its exploitation and exploration actions more coarsely over an episode by intermittently and significantly changing its behavior policy. The study was partly motivated by the exploration strategies of humans and animals, which exhibit similar behavior, and it showed that mode-switching policies outperform monolithic policies when trained on hard-exploration Atari games. We supplement that work in this blog post by showcasing observed behavioral differences between mode-switching and monolithic exploration on the Atari suite and presenting illustrative examples of the benefits of mode-switching. This work aids practitioners and researchers by providing practical guidance and suggesting future research directions in mode-switching exploration.
Loren Anderson · Nathan Bittner
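The abstract above contrasts monolithic and mode-switching behavior policies. As a rough illustration (not code from the blog post or the DeepMind study), the sketch below compares a monolithic ε-greedy policy, which flips a coin on every step, with a coarse mode-switching policy that commits to an explore or exploit mode for a sampled number of steps; `q_values`, the switching probability, and the exponential mode durations are all illustrative assumptions.

```python
import random

def epsilon_greedy_action(q_values, state, actions, eps=0.1):
    """Monolithic policy: exploration actions are interleaved uniformly
    at random among the more frequent exploitation actions."""
    if random.random() < eps:
        return random.choice(actions)                       # explore this step
    return max(actions, key=lambda a: q_values(state, a))   # exploit this step

class ModeSwitchingPolicy:
    """Coarse mode-switching: commit to an 'explore' or 'exploit' mode for a
    sampled duration instead of reconsidering on every single step."""

    def __init__(self, explore_prob=0.1, mean_duration=20):
        self.explore_prob = explore_prob    # chance of entering explore mode
        self.mean_duration = mean_duration  # expected steps spent per mode
        self.mode = "exploit"
        self.steps_left = 0

    def act(self, q_values, state, actions):
        if self.steps_left == 0:  # current mode expired: resample a mode
            self.mode = "explore" if random.random() < self.explore_prob else "exploit"
            # exponentially distributed duration with mean ~= mean_duration
            self.steps_left = max(1, int(random.expovariate(1 / self.mean_duration)))
        self.steps_left -= 1
        if self.mode == "explore":
            return random.choice(actions)
        return max(actions, key=lambda a: q_values(state, a))
```

Both policies explore on roughly the same fraction of steps in expectation; the difference lies purely in how those exploration steps are distributed across an episode, which is the behavioral distinction the blog post examines.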
Thu 1:45 a.m. - 3:45 a.m.
Towards Robust Foundation Models: Adversarial Contrastive Learning (Poster #1)
Poster Location: Halle B #1
Foundation models pre-trained on large-scale unlabelled datasets via self-supervision generalize to a wide range of downstream tasks. However, existing work has shown that adversarial attacks can effectively fool any downstream model obtained by fine-tuning a foundation model. The existence of such attacks necessitates robust foundation models that yield both standard generalization and adversarial robustness in safety-critical downstream tasks. Currently, adversarial contrastive learning (ACL) is one of the most effective methods for building robust foundation models: it combines contrastive learning with adversarial data to learn robust representations without requiring costly annotations. In this blog post, based on two NeurIPS 2023 publications, we introduce two techniques for enhancing ACL's effectiveness and efficiency, respectively. (1) Adversarial Invariant Regularization (AIR), the state-of-the-art ACL algorithm: a causal theoretical framework is built to interpret ACL, and AIR is derived from that framework to regularize and improve ACL. (2) Robustness-aware Coreset Selection (RCS), a method to speed up ACL: RCS requires no label information and searches for an informative training subset that helps maintain the adversarial robustness of the representation. RCS enables ACL to be applied, for the first time, to the large-scale ImageNet-1K dataset.
Jingfeng Zhang · Xilie Xu
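To make the ACL recipe above concrete, here is a minimal, hypothetical sketch of its core step under assumed settings: adversarial views are crafted by running PGD to maximize an InfoNCE-style contrastive loss, and the encoder is then updated to minimize that loss on the adversarial pair. `encoder`, the loss form, and all hyperparameters are placeholders, not the exact formulations of the AIR or RCS papers.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss over a batch of two views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                    # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)   # positives on diagonal
    return F.cross_entropy(logits, labels)

def acl_step(encoder, x1, x2, eps=8 / 255, alpha=2 / 255, pgd_steps=5):
    """One ACL training step: inner maximization crafts adversarial views,
    outer minimization is a standard contrastive update on those views.
    (Pixel-range clipping of x + delta is omitted for brevity.)"""
    d1 = torch.zeros_like(x1).uniform_(-eps, eps).requires_grad_(True)
    d2 = torch.zeros_like(x2).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(pgd_steps):
        loss = nt_xent(encoder(x1 + d1), encoder(x2 + d2))
        g1, g2 = torch.autograd.grad(loss, [d1, d2])
        with torch.no_grad():  # PGD: signed ascent step, then project to eps-ball
            d1 += alpha * g1.sign(); d1.clamp_(-eps, eps)
            d2 += alpha * g2.sign(); d2.clamp_(-eps, eps)
    return nt_xent(encoder(x1 + d1.detach()), encoder(x2 + d2.detach()))
```

A training loop would backpropagate the returned loss through `encoder` and step an optimizer; in this framing, AIR would add its invariance regularizer to the objective, and RCS would restrict the batches to a selected coreset.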