Quantifying Memorization Across Neural Language Models
Human-Guided Fair Classification for Natural Language Processing
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
UNICORN: A Unified Backdoor Trigger Inversion Framework
Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries
Learning to Estimate Shapley Values with Vision Transformers
Provable Defense Against Geometric Transformations
Phase2vec: dynamical systems embedding with a physics-informed convolutional network
Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
Compressing multidimensional weather and climate data into neural networks
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
Conditional Antibody Design as 3D Equivariant Graph Translation
Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
The In-Sample Softmax for Offline Reinforcement Learning
Emergence of Maps in the Memories of Blind Navigation Agents
Does Zero-Shot Reinforcement Learning Exist?
Learning Soft Constraints From Constrained Expert Demonstrations
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Token Merging: Your ViT But Faster
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Learning Group Importance using the Differentiable Hypergeometric Distribution
Neural Networks and the Chomsky Hierarchy
Learning on Large-scale Text-attributed Graphs via Variational Inference
A probabilistic framework for task-aligned intra- and inter-area neural manifold estimation
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
Disentanglement with Biological Constraints: A Theory of Functional Cell Types
Hebbian Deep Learning Without Feedback
Domain Generalization via Heckman-type Selection Models
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Fisher-Legendre (FishLeg) optimization of deep neural networks
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
NeRN: Learning Neural Representations for Neural Networks
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
Continual Unsupervised Disentangling of Self-Organizing Representations
Girmaw Abebe Tadesse
Neural Optimal Transport
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Effects of Graph Convolutions in Multi-layer Networks
Modeling content creator incentives on algorithm-curated platforms
Optimal Transport for Offline Imitation Learning
The Symmetric Generalized Eigenvalue Problem as a Nash Equilibrium
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives
LAVA: Data Valuation without Pre-Specified Learning Algorithms
Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
Learning where and when to reason in neuro-symbolic inference
Multi-skill Mobile Manipulation for Object Rearrangement
The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
Near-optimal Policy Identification in Active Reinforcement Learning
BC-IRL: Learning Generalizable Reward Functions from Demonstrations
Learning About Progress From Experts
Diffusion Posterior Sampling for General Noisy Inverse Problems
Prompt-to-Prompt Image Editing with Cross-Attention Control
Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting
Diffusion Models Already Have A Semantic Latent Space
DreamFusion: Text-to-3D using 2D Diffusion
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Human Motion Diffusion Model
NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning
UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
Mass-Editing Memory in a Transformer
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification
Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
Ask Me Anything: A simple strategy for prompting language models
Code Translation with Compiler Representations
Hidden Markov Transformer for Simultaneous Machine Translation
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Masashi Sugiyama
On the duality between contrastive and non-contrastive self-supervised learning
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Self-supervised learning with rotation-invariant kernels
DINO as a von Mises-Fisher mixture model
Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent
Efficient Discrete Multi Marginal Optimal Transport Regularization
Sparsity-Constrained Optimal Transport
Efficient Conditionally Invariant Representation Learning
Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
Omnigrok: Grokking Beyond Algorithmic Data
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Multi-lingual Evaluation of Code Generation Models
Rethinking the Expressive Power of GNNs via Graph Biconnectivity
Hyperbolic Deep Reinforcement Learning
The Role of ImageNet Classes in Fréchet Inception Distance
Learning Diffusion Bridges on Constrained Domains
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Learning multi-scale local conditional probability models of images
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images
Deterministic training of generative autoencoders using invertible layers
3D generation on ImageNet
Adversarial Diversity in Hanabi
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
On the Sensitivity of Reward Inference to Misspecified Human Models
Understanding and Adopting Rational Behavior by Bellman Score Estimation
SMART: Self-supervised Multi-task pretrAining with contRol Transformers
Dichotomy of Control: Separating What You Can Control from What You Cannot
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
Sign and Basis Invariant Networks for Spectral Graph Representation Learning
ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
QuAnt: Quantum Annealing with Learnt Couplings
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
The Lie Derivative for Measuring Learned Equivariance
Training language models to summarize narratives improves brain alignment
Planning Goals for Exploration
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
Choreographer: Learning and Adapting Skills in Imagination
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Extreme Q-Learning: MaxEnt RL without Entropy
Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle
SAM as an Optimal Relaxation of Bayes
Generative Augmented Flow Networks
A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation
Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation
GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise
Minimalistic Unsupervised Representation Learning with the Sparse Manifold Transform
AANG: Automating Auxiliary Learning
STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
When Source-Free Domain Adaptation Meets Learning with Noisy Labels
Towards Stable Test-time Adaptation in Dynamic Wild World
Proposal-Contrastive Pretraining for Object Detection from Fewer Data
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Flow Annealed Importance Sampling Bootstrap
Learning Controllable Adaptive Simulation for Multi-resolution Physics
Minimax Optimal Kernel Operator Learning via Multilevel Training
Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics
Pre-training via Denoising for Molecular Property Prediction
MARS: Meta-learning as Score Matching in the Function Space
Transformers are Sample-Efficient World Models
Building a Subspace of Policies for Scalable Continual Learning
Neural Episodic Control with State Abstraction
Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Is Conditional Generative Modeling all you need for Decision Making?
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
Learning Label Encodings for Deep Regression
Multifactor Sequential Disentanglement via Structured Koopman Autoencoders
A Unified Algebraic Perspective on Lipschitz Neural Networks
From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
Git Re-Basin: Merging Models modulo Permutation Symmetries
In-context Reinforcement Learning with Algorithm Distillation
Posters 7:30-9:30
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only
DocPrompting: Generating Code by Retrieving the Docs
View Synthesis with Sculpted Neural Points
VA-DepthNet: A Variational Approach to Single Image Depth Prediction
Visual Classification via Description from Large Language Models
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Relative representations enable zero-shot latent space communication
ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
Distilling Model Failures as Directions in Latent Space
Graph Neural Networks for Link Prediction with Subgraph Sketching
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
Progress measures for grokking via mechanistic interpretability
Localized Randomized Smoothing for Collective Robustness Certification
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks
Model-based Causal Bayesian Optimization
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
SimPer: Simple Self-Supervised Learning of Periodic Targets
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
Single-shot General Hyper-parameter Optimization for Federated Learning
Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
FedExP: Speeding Up Federated Averaging via Extrapolation
LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
Multi-Objective Online Learning
Continuous PDE Dynamics Forecasting with Implicit Neural Representations
Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
A Primal-Dual Framework for Transformers and Neural Networks
Learning with Logical Constraints but without Shortcut Satisfaction
No Reason for No Supervision: Improved Generalization in Supervised Models
Generating Diverse Cooperative Agents by Learning Incompatible Policies
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching
Jascha Sohl-Dickstein
DaxBench: Benchmarking Deformable Object Manipulation with Differentiable Physics
Betty: An Automatic Differentiation Library for Multilevel Optimization
WikiWhy: Answering and Explaining Cause-and-Effect Questions
MEDFAIR: Benchmarking Fairness for Medical Imaging
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
Disparate Impact in Differential Privacy from Gradient Misalignment
Binding Language Models in Symbolic Languages
MeshDiffusion: Score-based Generative 3D Mesh Modeling
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
AutoGT: Automated Graph Transformer Architecture Search
Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation
Certified Training: Small Boxes are All You Need
Inequality phenomenon in $l_{\infty}$-adversarial training, and its unrealized threats
Agree to Disagree: Diversity through Disagreement for Better Transferability
What learning algorithm is in-context learning? Investigations with linear models
Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
Encoding Recurrence into Transformers
Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
Simplified State Space Layers for Sequence Modeling
Relational Attention: Generalizing Transformers for Graph-Structured Tasks
Sparse Mixture-of-Experts are Domain Generalizable Learners
Near-optimal Coresets for Robust Clustering
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Statistical Efficiency of Score Matching: The View from Isoperimetry
Subquadratic Algorithms for Kernel Matrices via Kernel Density Estimation
Depth Separation with Multilayer Mean-Field Networks
Learning with Stochastic Orders
Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities
Unsupervised Model Selection for Time Series Anomaly Detection
A Kernel Perspective of Skip Connections in Convolutional Networks
ReAct: Synergizing Reasoning and Acting in Language Models
A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet
Packed Ensembles for efficient uncertainty estimation
Language Modelling with Pixels
Parametrizing Product Shape Manifolds by Composite Networks
ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer
Dual Algorithmic Reasoning
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion
Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning
Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations