Timezone: America/Los_Angeles
SUN 30 APR
10 p.m.
(ends 10:30 AM)
11:15 p.m.
Remarks:
(ends 11:30 PM)
11:30 p.m.
Invited Talk:
Sofia Crespo
(ends 12:30 AM)
MON 1 MAY
12:30 a.m.
Coffee Break
1 a.m.
Orals 1:00-2:20
[1:00] Quantifying Memorization Across Neural Language Models
[1:10] Human-Guided Fair Classification for Natural Language Processing
[1:20] Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
[1:30] Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
[1:40] UNICORN: A Unified Backdoor Trigger Inversion Framework
[1:50] Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries
[2:00] Learning to Estimate Shapley Values with Vision Transformers
[2:10] Provable Defense Against Geometric Transformations
(ends 2:30 AM)
Orals 1:00-2:10
[1:00] Phase2vec: dynamical systems embedding with a physics-informed convolutional network
[1:10] Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
[1:20] Compressing multidimensional weather and climate data into neural networks
[1:30] D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
[1:40] Conditional Antibody Design as 3D Equivariant Graph Translation
[1:50] Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
[2:00] CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations
(ends 2:30 AM)
Orals 1:00-2:10
[1:00] DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
[1:10] The In-Sample Softmax for Offline Reinforcement Learning
[1:20] Emergence of Maps in the Memories of Blind Navigation Agents
[1:30] Does Zero-Shot Reinforcement Learning Exist?
[1:40] Learning Soft Constraints From Constrained Expert Demonstrations
[1:50] Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
[2:00] VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
(ends 2:30 AM)
Orals 1:10-2:00
[1:10] Token Merging: Your ViT But Faster
[1:20] TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
[1:30] Learning Group Importance using the Differentiable Hypergeometric Distribution
[1:40] Neural Networks and the Chomsky Hierarchy
[1:50] Learning on Large-scale Text-attributed Graphs via Variational Inference
(ends 2:30 AM)
Orals 1:00-1:50
[1:00] A probabilistic framework for task-aligned intra- and inter-area neural manifold estimation
[1:10] Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
[1:20] Disentanglement with Biological Constraints: A Theory of Functional Cell Types
[1:30] Hebbian Deep Learning Without Feedback
[1:40] Domain Generalization via Heckman-type Selection Models
(ends 2:30 AM)
Orals 1:00-2:10
[1:00] Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
[1:10] Fisher-Legendre (FishLeg) optimization of deep neural networks
[1:20] Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
[1:30] Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
[1:40] NeRN: Learning Neural Representations for Neural Networks
[1:50] Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
[2:00] Continual Unsupervised Disentangling of Self-Organizing Representations
(ends 2:30 AM)
2:30 a.m.
Posters 2:30-4:30
(ends 4:30 AM)
Lunch
4:30 a.m.
Invited Talk:
Girmaw Abebe Tadesse
(ends 5:30 AM)
5:30 a.m.
Coffee Break
6 a.m.
Orals 6:00-7:00
[6:00] Neural Optimal Transport
[6:10] Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
[6:20] Effects of Graph Convolutions in Multi-layer Networks
[6:40] Modeling content creator incentives on algorithm-curated platforms
[6:50] Optimal Transport for Offline Imitation Learning
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] The Symmetric Generalized Eigenvalue Problem as a Nash Equilibrium
[6:10] Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
[6:20] Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives
[6:40] LAVA: Data Valuation without Pre-Specified Learning Algorithms
[7:00] Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
[7:10] Learning where and when to reason in neuro-symbolic inference
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Multi-skill Mobile Manipulation for Object Rearrangement
[6:10] The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
[6:20] A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
[6:30] Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
[6:40] Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
[6:50] Near-optimal Policy Identification in Active Reinforcement Learning
[7:00] BC-IRL: Learning Generalizable Reward Functions from Demonstrations
[7:10] Learning About Progress From Experts
(ends 7:30 AM)
Orals 6:00-7:00
[6:00] Diffusion Posterior Sampling for General Noisy Inverse Problems
[6:10] Prompt-to-Prompt Image Editing with Cross-Attention Control
[6:20] Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting
[6:30] Diffusion Models Already Have A Semantic Latent Space
[6:40] DreamFusion: Text-to-3D using 2D Diffusion
[6:50] Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
[6:20] Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
[6:30] Human Motion Diffusion Model
[6:40] NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning
[6:50] UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
[7:00] Mass-Editing Memory in a Transformer
[7:10] On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification
[6:10] Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception
[6:20] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
[6:30] ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
[6:40] Ask Me Anything: A simple strategy for prompting language models
[6:50] Code Translation with Compiler Representations
[7:00] Hidden Markov Transformer for Simultaneous Machine Translation
[7:10] Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
(ends 7:30 AM)
7:30 a.m.
Posters 7:30-9:30
(ends 9:30 AM)
9:30 a.m.
Remarks:
(ends 9:50 AM)
9:50 a.m.
Reception:
(ends 11:00 AM)
11 p.m.
(ends 9:00 AM)
11:30 p.m.
Invited Talk:
Masashi Sugiyama
(ends 12:30 AM)
TUE 2 MAY
12:30 a.m.
Coffee Break
1 a.m.
Orals 1:00-2:20
[1:00] On the duality between contrastive and non-contrastive self-supervised learning
[1:10] Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
[1:20] The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
[1:30] Self-supervised learning with rotation-invariant kernels
[1:40] DINO as a von Mises-Fisher mixture model
[1:50] Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent
[2:00] Efficient Discrete Multi Marginal Optimal Transport Regularization
[2:10] Sparsity-Constrained Optimal Transport
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Efficient Conditionally Invariant Representation Learning
[1:10] Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
[1:20] Omnigrok: Grokking Beyond Algorithmic Data
[1:30] Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
[1:40] Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
[1:50] Multi-lingual Evaluation of Code Generation Models
[2:00] Rethinking the Expressive Power of GNNs via Graph Biconnectivity
[2:10] Hyperbolic Deep Reinforcement Learning
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] The Role of ImageNet Classes in Fréchet Inception Distance
[1:10] Learning Diffusion Bridges on Constrained Domains
[1:20] Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
[1:30] Learning multi-scale local conditional probability models of images
[1:40] An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
[1:50] Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images
[2:00] Deterministic training of generative autoencoders using invertible layers
[2:10] 3D generation on ImageNet
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Adversarial Diversity in Hanabi
[1:10] Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
[1:20] Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
[1:30] On the Sensitivity of Reward Inference to Misspecified Human Models
[1:40] Understanding and Adopting Rational Behavior by Bellman Score Estimation
[1:50] SMART: Self-supervised Multi-task pretrAining with contRol Transformers
[2:00] Dichotomy of Control: Separating What You Can Control from What You Cannot
[2:10] Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Sign and Basis Invariant Networks for Spectral Graph Representation Learning
[1:10] ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks
[1:20] Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
[1:30] QuAnt: Quantum Annealing with Learnt Couplings
[1:40] Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
[1:50] The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
[2:00] The Lie Derivative for Measuring Learned Equivariance
[2:10] Training language models to summarize narratives improves brain alignment
(ends 2:30 AM)
2:30 a.m.
Posters 2:30-4:30
(ends 4:30 AM)
Affinity Poster Session:
(ends 4:30 AM)
Lunch
4:30 a.m.
Invited Talk:
Elaine Nsoesie
(ends 5:30 AM)
5:30 a.m.
Coffee Break
6 a.m.
Orals 6:00-7:20
[6:00] Planning Goals for Exploration
[6:10] Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
[6:20] Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
[6:30] Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
[6:40] Choreographer: Learning and Adapting Skills in Imagination
[6:50] A CMDP-within-online framework for Meta-Safe Reinforcement Learning
[7:00] Confidence-Conditioned Value Functions for Offline Reinforcement Learning
[7:10] Extreme Q-Learning: MaxEnt RL without Entropy
(ends 7:30 AM)
Orals 6:00-7:10
[6:00] Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle
[6:10] SAM as an Optimal Relaxation of Bayes
[6:20] Generative Augmented Flow Networks
[6:30] A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation
[6:40] Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation
[6:50] GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
[7:00] Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Minimalistic Unsupervised Representation Learning with the Sparse Manifold Transform
[6:10] AANG: Automating Auxiliary Learning
[6:20] STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
[6:30] Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
[6:40] When Source-Free Domain Adaptation Meets Learning with Noisy Labels
[6:50] Towards Stable Test-time Adaptation in Dynamic Wild World
[7:00] Proposal-Contrastive Pretraining for Object Detection from Fewer Data
[7:10] Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
(ends 7:30 AM)
Orals 6:00-7:10
[6:00] Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
[6:10] Flow Annealed Importance Sampling Bootstrap
[6:20] Learning Controllable Adaptive Simulation for Multi-resolution Physics
[6:30] Minimax Optimal Kernel Operator Learning via Multilevel Training
[6:40] Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics
[6:50] Pre-training via Denoising for Molecular Property Prediction
[7:00] MARS: Meta-learning as Score Matching in the Function Space
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Transformers are Sample-Efficient World Models
[6:10] Building a Subspace of Policies for Scalable Continual Learning
[6:20] Neural Episodic Control with State Abstraction
[6:30] Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
[6:40] Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
[6:50] Is Conditional Generative Modeling all you need for Decision Making?
[7:00] RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
[7:10] Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
[6:10] One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
[6:20] Learning Label Encodings for Deep Regression
[6:30] Multifactor Sequential Disentanglement via Structured Koopman Autoencoders
[6:40] A Unified Algebraic Perspective on Lipschitz Neural Networks
[6:50] From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
[7:00] Git Re-Basin: Merging Models modulo Permutation Symmetries
[7:10] In-context Reinforcement Learning with Algorithm Distillation
(ends 7:30 AM)
7:30 a.m.
Posters 7:30-9:30
(ends 9:30 AM)
11 p.m.
(ends 9:00 AM)
11:30 p.m.
Invited Talk:
Dilek Hakkani-Tur
(ends 12:30 AM)
WED 3 MAY
12:30 a.m.
Coffee Break
1 a.m.
Orals 1:00-2:10
[1:00] Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
[1:10] Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only
[1:20] DocPrompting: Generating Code by Retrieving the Docs
[1:30] View Synthesis with Sculpted Neural Points
[1:40] VA-DepthNet: A Variational Approach to Single Image Depth Prediction
[1:50] Visual Classification via Description from Large Language Models
[2:00] Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Hungry Hungry Hippos: Towards Language Modeling with State Space Models
[1:10] Relative representations enable zero-shot latent space communication
[1:20] ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
[1:30] Distilling Model Failures as Directions in Latent Space
[1:40] Graph Neural Networks for Link Prediction with Subgraph Sketching
[1:50] The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
[2:00] Revisiting Pruning at Initialization through the Lens of Ramanujan Graph
[2:10] A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Progress measures for grokking via mechanistic interpretability
[1:10] Localized Randomized Smoothing for Collective Robustness Certification
[1:20] Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes
[1:30] CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks
[1:40] Model-based Causal Bayesian Optimization
[1:50] Corrupted Image Modeling for Self-Supervised Visual Pre-Training
[2:00] SimPer: Simple Self-Supervised Learning of Periodic Targets
[2:10] Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
(ends 2:30 AM)
Orals 1:00-2:10
[1:00] DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
[1:10] Single-shot General Hyper-parameter Optimization for Federated Learning
[1:20] Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
[1:30] FedExP: Speeding Up Federated Averaging via Extrapolation
[1:40] LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
[1:50] Multi-Objective Online Learning
[2:00] Continuous PDE Dynamics Forecasting with Implicit Neural Representations
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
[1:10] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
[1:20] Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness
[1:30] Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
[1:40] A Primal-Dual Framework for Transformers and Neural Networks
[1:50] Learning with Logical Constraints but without Shortcut Satisfaction
[2:00] No Reason for No Supervision: Improved Generalization in Supervised Models
[2:10] Generating Diverse Cooperative Agents by Learning Incompatible Policies
(ends 2:30 AM)
2:30 a.m.
Posters 2:30-4:30
(ends 4:30 AM)
Lunch
5:30 a.m.
Remarks:
(ends 5:35 AM)
Coffee Break
6 a.m.
Orals 6:00-7:10
[6:00] DaxBench: Benchmarking Deformable Object Manipulation with Differentiable Physics
[6:10] Betty: An Automatic Differentiation Library for Multilevel Optimization
[6:20] WikiWhy: Answering and Explaining Cause-and-Effect Questions
[6:30] MEDFAIR: Benchmarking Fairness for Medical Imaging
[6:40] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
[6:50] Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
[7:00] Disparate Impact in Differential Privacy from Gradient Misalignment
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Binding Language Models in Symbolic Languages
[6:10] MeshDiffusion: Score-based Generative 3D Mesh Modeling
[6:20] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
[6:30] AutoGT: Automated Graph Transformer Architecture Search
[6:40] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
[6:50] LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation
[7:00] Certified Training: Small Boxes are All You Need
[7:10] Inequality phenomenon in ℓ∞-adversarial training, and its unrealized threats
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Agree to Disagree: Diversity through Disagreement for Better Transferability
[6:10] What learning algorithm is in-context learning? Investigations with linear models
[6:20] Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
[6:30] Encoding Recurrence into Transformers
[6:40] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
[6:50] Simplified State Space Layers for Sequence Modeling
[7:00] Relational Attention: Generalizing Transformers for Graph-Structured Tasks
[7:10] Sparse Mixture-of-Experts are Domain Generalizable Learners
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Near-optimal Coresets for Robust Clustering
[6:10] Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
[6:20] Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
[6:30] Statistical Efficiency of Score Matching: The View from Isoperimetry
[6:40] Subquadratic Algorithms for Kernel Matrices via Kernel Density Estimation
[6:50] Depth Separation with Multilayer Mean-Field Networks
[7:00] Learning with Stochastic Orders
[7:10] Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities
(ends 7:30 AM)
Orals 6:00-6:50
[6:00] Unsupervised Model Selection for Time Series Anomaly Detection
[6:10] A Kernel Perspective of Skip Connections in Convolutional Networks
[6:20] ReAct: Synergizing Reasoning and Acting in Language Models
[6:30] A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet
[6:40] Packed Ensembles for efficient uncertainty estimation
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Language Modelling with Pixels
[6:10] Parametrizing Product Shape Manifolds by Composite Networks
[6:20] ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
[6:30] Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer
[6:40] Dual Algorithmic Reasoning
[6:50] DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion
[7:00] Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning
[7:10] Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
(ends 7:30 AM)
Affinity Event:
(ends 8:00 AM)
7:30 a.m.
Posters 7:30-9:30
(ends 9:30 AM)
10 p.m.
(ends 9:00 AM)
11:20 p.m.
Affinity Workshop:
(ends 9:00 AM)
11:30 p.m.
Affinity Workshop:
(ends 8:00 AM)
THU 4 MAY
12:30 a.m.
Coffee Break
3 a.m.
Lunch
5:30 a.m.
Coffee Break
10 p.m.
(ends 3:00 AM)