The schedule for 2023 is still being finalized. Check this page again closer to the conference start date.



1 a.m.
Orals 1:00-2:10
[1:00] A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet
[1:10] Token Merging: Your ViT But Faster
[1:20] TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
[1:30] Learning Group Importance using the Differentiable Hypergeometric Distribution
[1:40] Neural Networks and the Chomsky Hierarchy
[1:50] Learning on Large-scale Text-attributed Graphs via Variational Inference
[2:00] Image as Set of Points
(ends 2:30 AM)
Orals 1:00-2:30
[1:00] Flow Annealed Importance Sampling Bootstrap
[1:10] Dynamical systems embedding with a physics-informed convolutional network
[1:20] Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
[1:30] Compressing multidimensional weather and climate data into neural networks
[1:40] D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
[1:50] Conditional Antibody Design as 3D Equivariant Graph Translation
[2:00] Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
[2:10] CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations
[2:20] Continuous PDE Dynamics Forecasting with Implicit Neural Representations
(ends 2:30 AM)
Orals 1:00-2:30
[1:00] Localized Randomized Smoothing for Collective Robustness Certification
[1:10] Quantifying Memorization Across Neural Language Models
[1:20] Generating Intuitive Fairness Specifications for Natural Language Processing
[1:30] Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
[1:40] Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
[1:50] UNICORN: A Unified Backdoor Trigger Inversion Framework
[2:00] Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries
[2:10] Learning to Estimate Shapley Values with Vision Transformers
[2:20] Provable Defense Against Geometric Transformations
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Hyperbolic Deep Reinforcement Learning
[1:10] DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
[1:20] The In-Sample Softmax for Offline Reinforcement Learning
[1:30] Emergence of Maps in the Memories of Blind Navigation Agents
[1:40] Does Zero-Shot Reinforcement Learning Exist?
[1:50] Learning Soft Constraints From Constrained Expert Demonstrations
[2:00] Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
[2:10] Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
[1:10] Fisher-Legendre (FishLeg) optimization of deep neural networks
[1:20] Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
[1:30] Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
[1:40] NeRN: Learning Neural Representations for Neural Networks
[1:50] Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
[2:00] Continual Unsupervised Disentangling of Self-Organizing Representations
[2:10] Optimal Transport for Offline Imitation Learning
(ends 2:30 AM)
2:30 a.m.
(ends 4:30 AM)
6 a.m.
Orals 6:00-7:30
[6:00] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
[6:10] ReAct: Synergizing Reasoning and Acting in Language Models
[6:20] Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
[6:30] Human Motion Diffusion Model
[6:40] NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning
[6:50] PEER: A Collaborative Language Model
[7:00] UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
[7:10] Mass-Editing Memory in a Transformer
[7:20] On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
(ends 7:30 AM)
Orals 6:00-7:30
[6:00] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
[6:10] The Generalized Eigenvalue Problem as a Nash Equilibrium
[6:20] Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
[6:30] Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives
[6:40] Unsupervised Model Selection for Time Series Anomaly Detection
[6:50] LAVA: Data Valuation without Pre-Specified Learning Algorithms
[7:00] Packed Ensembles for efficient uncertainty estimation
[7:10] Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
[7:20] Learning where and when to reason in neuro-symbolic inference
(ends 7:30 AM)
Orals 6:00-7:00
[6:00] Diffusion Posterior Sampling for General Noisy Inverse Problems
[6:10] Prompt-to-Prompt Image Editing with Cross-Attention Control
[6:20] Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting
[6:30] Diffusion Models Already Have A Semantic Latent Space
[6:40] DreamFusion: Text-to-3D using 2D Diffusion
[6:50] Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
(ends 7:30 AM)
Orals 6:00-7:30
[6:00] Planning Goals for Exploration
[6:10] Multi-skill Mobile Manipulation for Object Rearrangement
[6:20] The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
[6:30] A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
[6:40] Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
[6:50] Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
[7:00] Near-optimal Policy Identification in Active Reinforcement Learning
[7:10] BC-IRL: Learning Generalizable Reward Functions from Demonstrations
[7:20] Learning About Progress From Experts
(ends 7:30 AM)
Orals 6:00-6:50
[6:00] Neural Optimal Transport
[6:10] Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
[6:20] Effects of Graph Convolutions in Multi-layer Networks
[6:30] A Kernel Perspective of Skip Connections in Convolutional Networks
[6:40] Modeling content creator incentives on algorithm-curated platforms
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification
[6:10] Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception
[6:20] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
[6:30] ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
[6:40] Ask Me Anything: A simple strategy for prompting language models
[6:50] Code Translation with Compiler Representations
[7:00] Hidden Markov Transformer for Simultaneous Machine Translation
[7:10] Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
(ends 7:30 AM)
7:30 a.m.
(ends 9:30 AM)
1 a.m.
Orals 1:00-2:30
[1:00] Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
[1:10] Adversarial Diversity in Hanabi
[1:20] Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
[1:30] Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
[1:40] On the Sensitivity of Reward Inference to Misspecified Human Models
[1:50] Understanding and Adopting Rational Behavior by Bellman Score Estimation
[2:00] SMART: Self-supervised Multi-task pretrAining with contRol Transformers
[2:10] Dichotomy of Control: Separating What You Can Control from What You Cannot
[2:20] Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
(ends 2:30 AM)
Orals 1:00-2:30
[1:00] AutoGT: Automated Graph Transformer Architecture Search
[1:10] Efficient Conditionally Invariant Representation Learning
[1:20] Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
[1:30] Omnigrok: Grokking Beyond Algorithmic Data
[1:40] Sparse MoE with Random Routing as the New Dropout: Training Bigger and Self-Scalable Models
[1:50] Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
[2:00] Multi-lingual Evaluation of Code Generation Models
[2:10] Rethinking the Expressive Power of GNNs via Graph Biconnectivity
[2:20] When and why Vision-Language Models behave like Bags-of-Words, and what to do about it?
(ends 2:30 AM)
Orals 1:00-2:30
[1:00] Score-based Generative 3D Mesh Modeling
[1:10] The Role of ImageNet Classes in Fréchet Inception Distance
[1:20] Learning Diffusion Bridges on Constrained Domains
[1:30] Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
[1:40] Learning multi-scale local conditional probability models of images
[1:50] An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
[2:00] Rarity Score: A New Metric to Evaluate the Uncommonness of Synthesized Images
[2:10] Closing the gap: Exact maximum likelihood training of generative autoencoders using invertible layers
[2:20] 3D generation on ImageNet
(ends 2:30 AM)
Orals 1:00-2:30
[1:00] On the duality between contrastive and non-contrastive self-supervised learning
[1:10] Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
[1:20] The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
[1:30] Self-supervised learning with rotation-invariant kernels
[1:40] DINO as a von Mises-Fisher mixture model
[1:50] Domain Generalization via Heckman-type Selection Models
[2:00] Gradient-based optimization is not necessary for generalization in neural networks
[2:10] Efficient Discrete Multi Marginal Optimal Transport Regularization
[2:20] Sparsity-Constrained Optimal Transport
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Sign and Basis Invariant Networks for Spectral Graph Representation Learning
[1:10] ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks
[1:20] Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
[1:30] QuAnt: Quantum Annealing with Learnt Couplings
[1:40] Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
[1:50] The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
[2:00] The Lie Derivative for Measuring Learned Equivariance
[2:10] Training language models to summarize narratives improves brain alignment
(ends 2:30 AM)
2:30 a.m.
(ends 4:30 AM)
6 a.m.
Orals 6:00-7:20
[6:00] Minimalistic Unsupervised Learning with the Sparse Manifold Transform
[6:10] AANG: Automating Auxiliary Learning
[6:20] STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
[6:30] Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
[6:40] When Source-Free Domain Adaptation Meets Learning with Noisy Labels
[6:50] Towards Stable Test-time Adaptation in Dynamic Wild World
[7:00] Proposal-Contrastive Pretraining for Object Detection from Fewer Data
[7:10] Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle
[6:10] SAM as an Optimal Relaxation of Bayes
[6:20] Generative Augmented Flow Networks
[6:30] A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation
[6:40] Domain-Indexing Variational Bayes for Domain Adaptation
[6:50] GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
[7:00] Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise
[7:10] Martingale Posterior Neural Processes
(ends 7:30 AM)
Orals 6:00-7:10
[6:00] Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
[6:10] Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
[6:20] Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
[6:30] Choreographer: Learning and Adapting Skills in Imagination
[6:40] A CMDP-within-online framework for Meta-Safe Reinforcement Learning
[6:50] Confidence-Conditioned Value Functions for Offline Reinforcement Learning
[7:00] Extreme Q-Learning: MaxEnt RL without Entropy
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Transformers are Sample-Efficient World Models
[6:10] Building a Subspace of Policies for Scalable Continual Learning
[6:20] Neural Episodic Control with State Abstraction
[6:30] Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
[6:40] Sparse Q-Learning: Offline Reinforcement Learning with Implicit Value Regularization
[6:50] Is Conditional Generative Modeling all you need for Decision Making?
[7:00] RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
[7:10] Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
(ends 7:30 AM)
Orals 6:00-7:00
[6:00] Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
[6:10] Learning Controllable Adaptive Simulation for Multi-resolution Physics
[6:20] Minimax Optimal Kernel Operator Learning via Multilevel Training
[6:30] Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics
[6:40] Pre-training via Denoising for Molecular Property Prediction
[6:50] MARS: Meta-learning as Score Matching in the Function Space
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
[6:10] One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
[6:20] Learning Label Encodings for Deep Regression
[6:30] Multifactor Sequential Disentanglement via Structured Koopman Autoencoders
[6:40] A Unified Algebraic Perspective on Lipschitz Neural Networks
[6:50] From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
[7:00] Git Re-Basin: Merging Models modulo Permutation Symmetries
[7:10] In-context Reinforcement Learning with Algorithm Distillation
(ends 7:30 AM)
7:30 a.m.
(ends 9:30 AM)
1 a.m.
Orals 1:00-2:00
[1:00] DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
[1:10] Single-shot General Hyper-parameter Optimization for Federated Learning
[1:20] Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
[1:30] FedExP: Speeding up Federated Averaging via Extrapolation
[1:40] LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
[1:50] Multi-Objective Online Learning
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Hungry Hungry Hippos: Towards Language Modeling with State Space Models
[1:10] Relative representations enable zero-shot latent space communication
[1:20] ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
[1:30] Distilling Model Failures as Directions in Latent Space
[1:40] Graph Neural Networks for Link Prediction with Subgraph Sketching
[1:50] The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
[2:10] A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
(ends 2:30 AM)
Orals 1:00-2:10
[1:00] Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
[1:10] Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only
[1:20] DocPrompting: Generating Code by Retrieving the Docs
[1:30] View Synthesis with Sculpted Neural Points
[1:40] VA-DepthNet: A Variational Approach to Single Image Depth Prediction
[1:50] Visual Classification via Description from Large Language Models
[2:00] Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
(ends 2:30 AM)
Orals 1:00-2:20
[1:00] Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
[1:10] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
[1:20] Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness
[1:30] Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
[1:40] A Primal-Dual Framework for Transformers and Neural Networks
[1:50] Learning with Logical Constraints but without Shortcut Satisfaction
[2:00] No Reason for No Supervision: Improved Generalization in Supervised Models
[2:10] Generating Diverse Cooperative Agents by Learning Incompatible Policies
(ends 2:30 AM)
2:30 a.m.
(ends 4:30 AM)
6 a.m.
Orals 6:00-7:30
[6:00] Near-optimal Coresets for Robust Clustering
[6:10] Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
[6:20] Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
[6:30] Statistical Efficiency of Score Matching: The View from Isoperimetry
[6:40] Sublinear Algorithms for Kernel Matrices via Kernel Density Estimation
[6:50] Depth Separation with Multilayer Mean-Field Networks
[7:00] Learning with Stochastic Orders
[7:10] Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities
[7:20] A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Agree to Disagree: Diversity through Disagreement for Better Transferability
[6:10] What learning algorithm is in-context learning? Investigations with linear models
[6:20] Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
[6:30] Encoding Recurrence into Transformers
[6:40] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
[6:50] Simplified State Space Layers for Sequence Modeling
[7:00] Relational Attention: Generalizing Transformers for Graph-Structured Tasks
[7:10] Sparse Mixture-of-Experts are Domain Generalizable Learners
(ends 7:30 AM)
Orals 6:00-6:40
[6:10] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
[6:20] Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
[6:30] Disparate Impact in Differential Privacy from Gradient Misalignment
(ends 7:30 AM)
Orals 6:00-6:50
[6:00] Binding Language Models in Symbolic Languages
[6:10] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
[6:20] Simple Yet Effective Graph Contrastive Learning for Recommendation
[6:30] Certified Training: Small Boxes are All You Need
[6:40] Inequality phenomenon in ℓ∞-adversarial training, and its unrealized threats
(ends 7:30 AM)
Orals 6:00-7:20
[6:00] Language Modelling with Pixels
[6:10] Parametrizing Product Shape Manifolds by Composite Networks
[6:20] ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
[6:30] Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer
[6:40] Dual Algorithmic Reasoning
[6:50] DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion
[7:00] Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning
[7:10] Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
(ends 7:30 AM)
7:30 a.m.
(ends 9:30 AM)