Timezone: America/Los_Angeles

SUN 30 APR

10 p.m.

(ends 10:30 AM)

11:15 p.m.

11:30 p.m.

MON 1 MAY

12:30 a.m.

Coffee Break

1 a.m.

Orals 1:00-2:20

[1:00]
Quantifying Memorization Across Neural Language Models

[1:10]
Human-Guided Fair Classification for Natural Language Processing

[1:20]
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?

[1:30]
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification

[1:40]
UNICORN: A Unified Backdoor Trigger Inversion Framework

[1:50]
Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries

[2:00]
Learning to Estimate Shapley Values with Vision Transformers

[2:10]
Provable Defense Against Geometric Transformations

(ends 2:30 AM)

Orals 1:00-2:10

[1:00]
Phase2vec: dynamical systems embedding with a physics-informed convolutional network

[1:10]
Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems

[1:20]
Compressing multidimensional weather and climate data into neural networks

[1:30]
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory

[1:40]
Conditional Antibody Design as 3D Equivariant Graph Translation

[1:50]
Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs

[2:00]
CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations

(ends 2:30 AM)

Orals 1:00-2:10

[1:00]
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

[1:10]
The In-Sample Softmax for Offline Reinforcement Learning

[1:20]
Emergence of Maps in the Memories of Blind Navigation Agents

[1:30]
Does Zero-Shot Reinforcement Learning Exist?

[1:40]
Learning Soft Constraints From Constrained Expert Demonstrations

[1:50]
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes

[2:00]
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training

(ends 2:30 AM)

Orals 1:10-2:00

[1:10]
Token Merging: Your ViT But Faster

[1:20]
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

[1:30]
Learning Group Importance using the Differentiable Hypergeometric Distribution

[1:40]
Neural Networks and the Chomsky Hierarchy

[1:50]
Learning on Large-scale Text-attributed Graphs via Variational Inference

(ends 2:30 AM)

Orals 1:00-1:50

[1:00]
A probabilistic framework for task-aligned intra- and inter-area neural manifold estimation

[1:10]
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery

[1:20]
Disentanglement with Biological Constraints: A Theory of Functional Cell Types

[1:30]
Hebbian Deep Learning Without Feedback

[1:40]
Domain Generalization via Heckman-type Selection Models

(ends 2:30 AM)

Orals 1:00-2:10

[1:00]
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!

[1:10]
Fisher-Legendre (FishLeg) optimization of deep neural networks

[1:20]
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization

[1:30]
Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets

[1:40]
NeRN: Learning Neural Representations for Neural Networks

[1:50]
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors

[2:00]
Continual Unsupervised Disentangling of Self-Organizing Representations

(ends 2:30 AM)

2:30 a.m.

(ends 4:30 AM)

Lunch

3:30 a.m.

4:30 a.m.

Invited Talk:

Girmaw Abebe Tadesse

(ends 5:30 AM)

5:30 a.m.

Coffee Break

6 a.m.

Orals 6:00-7:00

[6:00]
Neural Optimal Transport

[6:10]
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions

[6:20]
Effects of Graph Convolutions in Multi-layer Networks

[6:40]
Modeling content creator incentives on algorithm-curated platforms

[6:50]
Optimal Transport for Offline Imitation Learning

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
The Symmetric Generalized Eigenvalue Problem as a Nash Equilibrium

[6:10]
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

[6:20]
Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives

[6:40]
LAVA: Data Valuation without Pre-Specified Learning Algorithms

[7:00]
Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering

[7:10]
Learning where and when to reason in neuro-symbolic inference

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Multi-skill Mobile Manipulation for Object Rearrangement

[6:10]
The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry

[6:20]
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation

[6:30]
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier

[6:40]
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions

[6:50]
Near-optimal Policy Identification in Active Reinforcement Learning

[7:00]
BC-IRL: Learning Generalizable Reward Functions from Demonstrations

[7:10]
Learning About Progress From Experts

(ends 7:30 AM)

Orals 6:00-7:00

[6:00]
Diffusion Posterior Sampling for General Noisy Inverse Problems

[6:10]
Prompt-to-Prompt Image Editing with Cross-Attention Control

[6:20]
Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting

[6:30]
Diffusion Models Already Have A Semantic Latent Space

[6:40]
DreamFusion: Text-to-3D using 2D Diffusion

[6:50]
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

[6:20]
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

[6:30]
Human Motion Diffusion Model

[6:40]
NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning

[6:50]
UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks

[7:00]
Mass-Editing Memory in a Transformer

[7:10]
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification

[6:10]
Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception

[6:20]
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction

[6:30]
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning

[6:40]
Ask Me Anything: A simple strategy for prompting language models

[6:50]
Code Translation with Compiler Representations

[7:00]
Hidden Markov Transformer for Simultaneous Machine Translation

[7:10]
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning

(ends 7:30 AM)

7:30 a.m.

8 a.m.

9:30 a.m.

9:50 a.m.

11 p.m.

(ends 9:00 AM)

11:30 p.m.

Invited Talk:

Masashi Sugiyama

(ends 12:30 AM)

TUE 2 MAY

12:30 a.m.

Coffee Break

1 a.m.

Orals 1:00-2:20

[1:00]
On the duality between contrastive and non-contrastive self-supervised learning

[1:10]
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning

[1:20]
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning

[1:30]
Self-supervised learning with rotation-invariant kernels

[1:40]
DINO as a von Mises-Fisher mixture model

[1:50]
Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent

[2:00]
Efficient Discrete Multi Marginal Optimal Transport Regularization

[2:10]
Sparsity-Constrained Optimal Transport

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Efficient Conditionally Invariant Representation Learning

[1:10]
Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction

[1:20]
Omnigrok: Grokking Beyond Algorithmic Data

[1:30]
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

[1:40]
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

[1:50]
Multi-lingual Evaluation of Code Generation Models

[2:00]
Rethinking the Expressive Power of GNNs via Graph Biconnectivity

[2:10]
Hyperbolic Deep Reinforcement Learning

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
The Role of ImageNet Classes in Fréchet Inception Distance

[1:10]
Learning Diffusion Bridges on Constrained Domains

[1:20]
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

[1:30]
Learning multi-scale local conditional probability models of images

[1:40]
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

[1:50]
Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images

[2:00]
Deterministic training of generative autoencoders using invertible layers

[2:10]
3D generation on ImageNet

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Adversarial Diversity in Hanabi

[1:10]
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics

[1:20]
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation

[1:30]
On the Sensitivity of Reward Inference to Misspecified Human Models

[1:40]
Understanding and Adopting Rational Behavior by Bellman Score Estimation

[1:50]
SMART: Self-supervised Multi-task pretrAining with contRol Transformers

[2:00]
Dichotomy of Control: Separating What You Can Control from What You Cannot

[2:10]
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Sign and Basis Invariant Networks for Spectral Graph Representation Learning

[1:10]
ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks

[1:20]
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task

[1:30]
QuAnt: Quantum Annealing with Learnt Couplings

[1:40]
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?

[1:50]
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks

[2:00]
The Lie Derivative for Measuring Learned Equivariance

[2:10]
Training language models to summarize narratives improves brain alignment

(ends 2:30 AM)

2:30 a.m.

(ends 4:30 AM)

Lunch

3:30 a.m.

4:30 a.m.

5:30 a.m.

Coffee Break

6 a.m.

Orals 6:00-7:20

[6:00]
Planning Goals for Exploration

[6:10]
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation

[6:20]
Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning

[6:30]
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware

[6:40]
Choreographer: Learning and Adapting Skills in Imagination

[6:50]
A CMDP-within-online framework for Meta-Safe Reinforcement Learning

[7:00]
Confidence-Conditioned Value Functions for Offline Reinforcement Learning

[7:10]
Extreme Q-Learning: MaxEnt RL without Entropy

(ends 7:30 AM)

Orals 6:00-7:10

[6:00]
Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle

[6:10]
SAM as an Optimal Relaxation of Bayes

[6:20]
Generative Augmented Flow Networks

[6:30]
A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation

[6:40]
Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation

[6:50]
GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints

[7:00]
Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Minimalistic Unsupervised Representation Learning with the Sparse Manifold Transform

[6:10]
AANG: Automating Auxiliary Learning

[6:20]
STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables

[6:30]
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts

[6:40]
When Source-Free Domain Adaptation Meets Learning with Noisy Labels

[6:50]
Towards Stable Test-time Adaptation in Dynamic Wild World

[7:00]
Proposal-Contrastive Pretraining for Object Detection from Fewer Data

[7:10]
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations

(ends 7:30 AM)

Orals 6:00-7:10

[6:00]
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

[6:10]
Flow Annealed Importance Sampling Bootstrap

[6:20]
Learning Controllable Adaptive Simulation for Multi-resolution Physics

[6:30]
Minimax Optimal Kernel Operator Learning via Multilevel Training

[6:40]
Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics

[6:50]
Pre-training via Denoising for Molecular Property Prediction

[7:00]
MARS: Meta-learning as Score Matching in the Function Space

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Transformers are Sample-Efficient World Models

[6:10]
Building a Subspace of Policies for Scalable Continual Learning

[6:20]
Neural Episodic Control with State Abstraction

[6:30]
Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection

[6:40]
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

[6:50]
Is Conditional Generative Modeling all you need for Decision Making?

[7:00]
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch

[7:10]
Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
CUDA: Curriculum of Data Augmentation for Long-tailed Recognition

[6:10]
One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks

[6:20]
Learning Label Encodings for Deep Regression

[6:30]
Multifactor Sequential Disentanglement via Structured Koopman Autoencoders

[6:40]
A Unified Algebraic Perspective on Lipschitz Neural Networks

[6:50]
From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data

[7:00]
Git Re-Basin: Merging Models modulo Permutation Symmetries

[7:10]
In-context Reinforcement Learning with Algorithm Distillation

(ends 7:30 AM)

6:30 a.m.

7:30 a.m.

Posters 7:30-9:30

(ends 9:30 AM)

8 a.m.

11 p.m.

(ends 9:00 AM)

11:30 p.m.

WED 3 MAY

12:30 a.m.

Coffee Break

1 a.m.

Orals 1:00-2:10

[1:00]
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

[1:10]
Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only

[1:20]
DocPrompting: Generating Code by Retrieving the Docs

[1:30]
View Synthesis with Sculpted Neural Points

[1:40]
VA-DepthNet: A Variational Approach to Single Image Depth Prediction

[1:50]
Visual Classification via Description from Large Language Models

[2:00]
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Hungry Hungry Hippos: Towards Language Modeling with State Space Models

[1:10]
Relative representations enable zero-shot latent space communication

[1:20]
ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion

[1:30]
Distilling Model Failures as Directions in Latent Space

[1:40]
Graph Neural Networks for Link Prediction with Subgraph Sketching

[1:50]
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

[2:00]
Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph

[2:10]
A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Progress measures for grokking via mechanistic interpretability

[1:10]
Localized Randomized Smoothing for Collective Robustness Certification

[1:20]
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes

[1:30]
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks

[1:40]
Model-based Causal Bayesian Optimization

[1:50]
Corrupted Image Modeling for Self-Supervised Visual Pre-Training

[2:00]
SimPer: Simple Self-Supervised Learning of Periodic Targets

[2:10]
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

(ends 2:30 AM)

Orals 1:00-2:10

[1:00]
DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity

[1:10]
Single-shot General Hyper-parameter Optimization for Federated Learning

[1:20]
Solving Constrained Variational Inequalities via a First-order Interior Point-based Method

[1:30]
FedExP: Speeding Up Federated Averaging via Extrapolation

[1:40]
LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence

[1:50]
Multi-Objective Online Learning

[2:00]
Continuous PDE Dynamics Forecasting with Implicit Neural Representations

(ends 2:30 AM)

Orals 1:00-2:20

[1:00]
Efficient recurrent architectures through activity sparsity and sparse back-propagation through time

[1:10]
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models

[1:20]
Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness

[1:30]
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent

[1:40]
A Primal-Dual Framework for Transformers and Neural Networks

[1:50]
Learning with Logical Constraints but without Shortcut Satisfaction

[2:00]
No Reason for No Supervision: Improved Generalization in Supervised Models

[2:10]
Generating Diverse Cooperative Agents by Learning Incompatible Policies

(ends 2:30 AM)

2:30 a.m.

Posters 2:30-4:30

Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching

Efficient recurrent architectures through activity sparsity and sparse back-propagation through time

(ends 4:30 AM)

Lunch

3:30 a.m.

4:30 a.m.

Invited Talk:

Jascha Sohl-Dickstein

(ends 5:30 AM)

6 a.m.

Orals 6:00-7:10

[6:00]
DaxBench: Benchmarking Deformable Object Manipulation with Differentiable Physics

[6:10]
Betty: An Automatic Differentiation Library for Multilevel Optimization

[6:20]
WikiWhy: Answering and Explaining Cause-and-Effect Questions

[6:30]
MEDFAIR: Benchmarking Fairness for Medical Imaging

[6:40]
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation

[6:50]
Confidential-PROFITT: Confidential PROof of FaIr Training of Trees

[7:00]
Disparate Impact in Differential Privacy from Gradient Misalignment

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Binding Language Models in Symbolic Languages

[6:10]
MeshDiffusion: Score-based Generative 3D Mesh Modeling

[6:20]
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation

[6:30]
AutoGT: Automated Graph Transformer Architecture Search

[6:40]
Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

[6:50]
LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation

[7:00]
Certified Training: Small Boxes are All You Need

[7:10]
Inequality phenomenon in $l_{\infty}$-adversarial training, and its unrealized threats

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Agree to Disagree: Diversity through Disagreement for Better Transferability

[6:10]
What learning algorithm is in-context learning? Investigations with linear models

[6:20]
Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation

[6:30]
Encoding Recurrence into Transformers

[6:40]
Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching

[6:50]
Simplified State Space Layers for Sequence Modeling

[7:00]
Relational Attention: Generalizing Transformers for Graph-Structured Tasks

[7:10]
Sparse Mixture-of-Experts are Domain Generalizable Learners

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Near-optimal Coresets for Robust Clustering

[6:10]
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games

[6:20]
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

[6:30]
Statistical Efficiency of Score Matching: The View from Isoperimetry

[6:40]
Subquadratic Algorithms for Kernel Matrices via Kernel Density Estimation

[6:50]
Depth Separation with Multilayer Mean-Field Networks

[7:00]
Learning with Stochastic Orders

[7:10]
Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities

(ends 7:30 AM)

Orals 6:00-6:50

[6:00]
Unsupervised Model Selection for Time Series Anomaly Detection

[6:10]
A Kernel Perspective of Skip Connections in Convolutional Networks

[6:20]
ReAct: Synergizing Reasoning and Acting in Language Models

[6:30]
A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet

[6:40]
Packed Ensembles for efficient uncertainty estimation

(ends 7:30 AM)

Orals 6:00-7:20

[6:00]
Language Modelling with Pixels

[6:10]
Parametrizing Product Shape Manifolds by Composite Networks

[6:20]
ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

[6:30]
Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer

[6:40]
Dual Algorithmic Reasoning

[6:50]
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

[7:00]
Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning

[7:10]
Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations

(ends 7:30 AM)

7:30 a.m.