ICLR
2025
Papers
E(n) Equivariant Topological Neural Networks
Bootstrapped Model Predictive Control
Revealing the 3D Cosmic Web through Gravitationally Constrained Neural Fields
Durable Quantization Conditioned Misalignment Attack on Large Language Models
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
Sparse autoencoders reveal selective remapping of visual concepts during adaptation
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
Warm Diffusion: Recipe for Blur-Noise Mixture Diffusion Models
LR0.FM: LOW-RESOLUTION ZERO-SHOT CLASSIFICATION BENCHMARK FOR FOUNDATION MODELS
ComPC: Completing a 3D Point Cloud with 2D Diffusion Priors
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
Towards Generalization Bounds of GCNs for Adversarially Robust Node Classification
Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture
Learning Structured Representations by Embedding Class Hierarchy with Fast Optimal Transport
Periodic Materials Generation using Text-Guided Joint Diffusion Model
Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds
CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs
Bridging Compressed Image Latents and Multimodal Large Language Models
Adversarial Attacks on Data Attribution
Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density
Do vision models perceive objects like toddlers?
SAM 2: Segment Anything in Images and Videos
Differentiable and Learnable Wireless Simulation with Geometric Transformers
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
Global Convergence in Neural ODEs: Impact of Activation Functions
Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting
Denoising Task Difficulty-based Curriculum for Training Diffusion Models
Do LLMs have Consistent Values?
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Transformers Handle Endogeneity in In-Context Linear Regression
Towards a Unified and Verified Understanding of Group-Operation Networks
Neural Context Flows for Meta-Learning of Dynamical Systems
Guaranteed Generation from Large Language Models
PFGuard: A Generative Framework with Privacy and Fairness Safeguards
Boosting Ray Search Procedure of Hard-label Attacks with Transfer-based Priors
SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs
Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
X-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
Random-Set Neural Networks
DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation
Emergent Orientation Maps —— Mechanisms, Coding Efficiency and Robustness
PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark
Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control
PPT: Patch Order Do Matters In Time Series Pretext Task
Block-Attention for Efficient Prefilling
Hierarchical World Models as Visual Whole-Body Humanoid Controllers
CL-DiffPhyCon: Closed-loop Diffusion Control of Complex Physical Systems
Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation
Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?
Do WGANs succeed because they minimize the Wasserstein Distance? Lessons from Discrete Generators
Training-Free Dataset Pruning for Instance Segmentation
Theory on Mixture-of-Experts in Continual Learning
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
DON’T STOP ME NOW: EMBEDDING BASED SCHEDULING FOR LLMS
Lawma: The Power of Specialization for Legal Annotation
Start Smart: Leveraging Gradients For Enhancing Mask-based XAI Methods
AgentStudio: A Toolkit for Building General Virtual Agents
Revolutionizing EMCCD Denoising through a Novel Physics-Based Learning Framework for Noise Modeling
Semialgebraic Neural Networks: From roots to representations
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid
DeLLMa: Decision Making Under Uncertainty with Large Language Models
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
Local Steps Speed Up Local GD for Heterogeneous Distributed Logistic Regression
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Transformers are Universal In-context Learners
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
Bonsai: Gradient-free Graph Condensation for Node Classification
Global Convergence of Policy Gradient in Average Reward MDPs
Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics
Score-based free-form architectures for high-dimensional Fokker-Planck equations
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
On Bits and Bandits: Quantifying the Regret-Information Trade-off
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos
Joint Graph Rewiring and Feature Denoising via Spectral Resonance
MAGNet: Motif-Agnostic Generation of Molecules from Scaffolds
O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
Depth Any Video with Scalable Synthetic Data
Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting
CREAM: Consistency Regularized Self-Rewarding Language Models
Injective flows for star-like manifolds
Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
Model-Agnostic Knowledge Guided Correction for Improved Neural Surrogate Rollout
Bias Mitigation in Graph Diffusion Models
Bayesian Analysis of Combinatorial Gaussian Process Bandits
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION
Test-Time Ensemble via Linear Mode Connectivity: A Path to Better Adaptation
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
REVISITING MULTI-PERMUTATION EQUIVARIANCE THROUGH THE LENS OF IRREDUCIBLE REPRESENTATIONS
BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation
MANTRA: The Manifold Triangulations Assemblage
Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion
Cached Multi-Lora Composition for Multi-Concept Image Generation
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning
Adversarial Machine Unlearning
BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
Fast training and sampling of Restricted Boltzmann Machines
InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting
DynFrs: An Efficient Framework for Machine Unlearning in Random Forest
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation
Bridging the Gap between Database Search and De Novo Peptide Sequencing with SearchNovo
Diverse Preference Learning for Capabilities and Alignment
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
MoDeGPT: Modular Decomposition for Large Language Model Compression
Improving Instruction-Following in Language Models through Activation Steering
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Learning Spatiotemporal Dynamical Systems from Point Process Observations
PseDet: Revisiting the Power of Pseudo Label in Incremental Object Detection
Does Training with Synthetic Data Truly Protect Privacy?
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
NL-Eye: Abductive NLI For Images
UniDrive: Towards Universal Driving Perception Across Camera Configurations
Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video
Revisit the Open Nature of Open Vocabulary Semantic Segmentation
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
Tamper-Resistant Safeguards for Open-Weight LLMs
DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Learning from weak labelers as constraints
Deep Kernel Posterior Learning under Infinite Variance Prior Weights
The Pitfalls of Memorization: When Memorization Hurts Generalization
CREIMBO: Cross-Regional Ensemble Interactions in Multi-view Brain Observations
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
PharmacoMatch: Efficient 3D Pharmacophore Screening via Neural Subgraph Matching
Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models
Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
Coreset Spectral Clustering
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
A Statistical Framework for Ranking LLM-based Chatbots
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
CTSyn: A Foundation Model for Cross Tabular Data Generation
Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation
Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine
Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
Fully-inductive Node Classification on Arbitrary Graphs
Enhancing Vision-Language Model with Unmasked Token Alignment
Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation
When Attention Sink Emerges in Language Models: An Empirical View
MCNC: Manifold-Constrained Reparameterization for Neural Compression
Spectro-Riemannian Graph Neural Networks
Improved Convergence Rate for Diffusion Probabilistic Models
On Calibration of LLM-based Guard Models for Reliable Content Moderation
OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees
Monet: Mixture of Monosemantic Experts for Transformers
CoMRes: Semi-Supervised Time Series Forecasting Utilizing Consensus Promotion of Multi-Resolution
Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos
Optimizing Neural Network Representations of Boolean Networks
CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets
A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules
Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis
Universal generalization guarantees for Wasserstein distributionally robust models
SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting
Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
PETRA: Parallel End-to-end Training with Reversible Architectures
Sensitivity-Aware Amortized Bayesian Inference
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model
SPD Attack - Prevention of AI Powered Image Editing by Image Immunization
Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
Port-Hamiltonian Architectural Bias for Long-Range Propagation in Deep Graph Networks
Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building
Deep Networks Learn Features From Local Discontinuities in the Label Function
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Avoid Overclaims: Summary of Complexity Bounds for Algorithms in Minimization and Minimax Optimization
Towards more rigorous evaluations of language models
How do we interpret the outputs of a neural network trained on classification?
Generative Adversarial Ranking Nets
Adaptive teachers for amortized samplers
Generating Less Certain Adversarial Examples Improves Robust Generalization
Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm
Training LLMs over Neurally Compressed Text
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Imputation for prediction: beware of diminishing returns.
FreeVS: Generative View Synthesis on Free Driving Trajectory
Enhancing End-to-End Autonomous Driving with Latent World Model
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
ASTrA: Adversarial Self-supervised Training with Adaptive-Attacks
Revisiting Feature Prediction for Learning Visual Representations from Video
Regularized Proportional Fairness Mechanism for Resource Allocation Without Money
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Towards Unbiased Calibration using Meta-Regularization
AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
MMD-Regularized Unbalanced Optimal Transport
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
$F^3Set$: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Weak-to-Strong Generalization Through the Data-Centric Lens
Text4Seg: Reimagining Image Segmentation as Text Generation
SWEb: A Large Web Dataset for the Scandinavian Languages
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
Test-time Adaptation for Image Compression with Distribution Regularization
Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks
What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks
SigDiffusions: Score-Based Diffusion Models for Time Series via Log-Signature Embeddings
K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
On the Feature Learning in Diffusion Models
Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
BingoGuard: LLM Content Moderation Tools with Risk Levels
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Advantage-Guided Distillation for Preference Alignment in Small Language Models
SONICS: Synthetic Or Not - Identifying Counterfeit Songs
ConcreTizer: Model Inversion Attack via Occupancy Classification and Dispersion Control for 3D Point Cloud Restoration
TeaserGen: Generating Teasers for Long Documentaries
Affine Steerable Equivariant Layer for Canonicalization of Neural Networks
Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency
Fast Uncovering of Protein Sequence Diversity from Structure
Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery
Towards Synergistic Path-based Explanations for Knowledge Graph Completion: Exploration and Evaluation
Provably Robust Explainable Graph Neural Networks against Graph Perturbation Attacks
Fundamental Limitations on Subquadratic Alternatives to Transformers
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows
Dense Video Object Captioning from Disjoint Supervision
Personality Alignment of Large Language Models
On Rollouts in Model-Based Reinforcement Learning
MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
ReSi: A Comprehensive Benchmark for Representational Similarity Measures
Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Aligning Language Models with Demonstrated Feedback
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
Efficient Learning with Sine-Activated Low-Rank Matrices
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
CycleResearcher: Improving Automated Research via Automated Review
Population Transformer: Learning Population-level Representations of Neural Activity
Learning Transformer-based World Models with Contrastive Predictive Coding
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity
SecureGS: Boosting the Security and Fidelity of 3D Gaussian Splatting Steganography
ICLR: In-Context Learning of Representations
COFlowNet: Conservative Constraints on Flows Enable High-Quality Candidate Generation
Distribution-Free Data Uncertainty for Neural Network Regression
nGPT: Normalized Transformer with Representation Learning on the Hypersphere
Eliciting Human Preferences with Language Models
Exploring Local Memorization in Diffusion Models via Bright Ending Attention
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
No Need to Talk: Asynchronous Mixture of Language Models
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
From Few to Many: Self-Improving Many-Shot Reasoners Through Iterative Optimization and Generation
OptionZero: Planning with Learned Options
Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Valid Conformal Prediction for Dynamic GNNs
Private Mechanism Design via Quantile Estimation
TabWak: A Watermark for Tabular Diffusion Models
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Analytic DAG Constraints for Differentiable DAG Learning
Self-Normalized Resets for Plasticity in Continual Learning
Robustness of Quantum Algorithms for Nonconvex Optimization
Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints
An Exploration with Entropy Constrained 3D Gaussians for 2D Video Compression
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
NetMoE: Accelerating MoE Training through Dynamic Sample Placement
xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
Towards Auto-Regressive Next-Token Prediction: In-context Learning Emerges from Generalization
Factual Context Validation and Simplification: A Scalable Method to Enhance GPT Trustworthiness and Efficiency
Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks
Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance
Generalized Consistency Trajectory Models for Image Manipulation
Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
A Large-scale Dataset and Benchmark for Commuting Origin-Destination Flow Generation
Decoupled Finetuning for Domain Generalizable Semantic Segmentation
CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement
Training Free Guided Flow-Matching with Optimal Control
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization
Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
Towards Unified Human Motion-Language Understanding via Sparse Interpretable Characterization
ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning
Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation
AgentSquare: Automatic LLM Agent Search in Modular Design Space
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agent
TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
KLay: Accelerating Arithmetic Circuits for Neurosymbolic AI
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Beyond Mere Token Analysis: A Hypergraph Metric Space Framework for Defending Against Socially Engineered LLM Attacks
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations
Interaction Asymmetry: A General Principle for Learning Composable Abstractions
FLOPS: Forward Learning with OPtimal Sampling
Non-Stationary Dueling Bandits Under a Weighted Borda Criterion
$q$-exponential family for policy optimization
Programming Refusal with Conditional Activation Steering
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis
Regularization by Texts for Latent Diffusion Inverse Solvers
Fourier Sliced-Wasserstein Embedding for Multisets and Measures
HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
Wavelet-based Positional Representation for Long Context
A Generic Framework for Conformal Fairness
Block Verification Accelerates Speculative Decoding
Multi-Field Adaptive Retrieval
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Accelerating Training with Neuron Interaction and Nowcasting Networks
Repulsive Latent Score Distillation for Solving Inverse Problems
Can Knowledge Editing Really Correct Hallucinations?
Training-Free Diffusion Model Alignment with Sampling Demons
Distilling Dataset into Neural Field
Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality
Can We Ignore Labels in Out of Distribution Detection?
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Exploring a Principled Framework for Deep Subspace Clustering
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
Find A Winning Sign: Sign Is All We Need to Win the Lottery
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
MamKO: Mamba-based Koopman operator for modeling and predictive control
Making Transformer Decoders Better Differentiable Indexers
Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
Boundary constrained Gaussian processes for robust physics-informed machine learning of linear partial differential equations
McEval: Massively Multilingual Code Evaluation
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
Demystifying the Token Dynamics of Deep Selective State Space Models
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
Reward Learning from Multiple Feedback Types
MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers
Equivariant Neural Functional Networks for Transformers
Leave-One-Out Stable Conformal Prediction
RB-Modulation: Training-Free Stylization using Reference-Based Modulation
CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Preserving Deep Representations in One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation
Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators
What Makes Large Language Models Reason in (Multi-Turn) Code Generation?
GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
AFlow: Automating Agentic Workflow Generation
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Towards Automated Knowledge Integration From Human-Interpretable Representations
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Near-optimal Active Regression of Single-Index Models
When narrower is better: the narrow width limit of Bayesian parallel branching neural networks
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science
KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
Re-evaluating Open-ended Evaluation of Large Language Models
Differential Transformer
Transformer Block Coupling and its Correlation with Generalization in LLMs
On the self-verification limitations of large language models on reasoning and planning tasks
TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice
Post-hoc Reward Calibration: A Case Study on Length Bias
Efficient Interpolation between Extragradient and Proximal Methods for Weak MVIs
Generative Verifiers: Reward Modeling as Next-Token Prediction
Controlling Space and Time with Diffusion Models
Chain-of-Thought Provably Enables Learning the (Otherwise) Unlearnable
Training-free Camera Control for Video Generation
Multi-modal Learning: A Look Back and the Road Ahead
FOSP: Fine-tuning Offline Safe Policy through World Models
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping
Provable Uncertainty Decomposition via Higher-Order Calibration
Hot-pluggable Federated Learning: Bridging General and Personalized FL via Dynamic Selection
Discriminator-Guided Embodied Planning for LLM Agent
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
MAP: Multi-Human-Value Alignment Palette
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
When Graph Neural Networks Meet Dynamic Mode Decomposition
Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks
Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax
Accelerating neural network training: An analysis of the AlgoPerf competition
Hyper-Connections
Supervised and Semi-Supervised Diffusion Maps with Label-Driven Diffusion
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Ultra-Sparse Memory Network
Spectral Compressive Imaging via Unmixing-driven Subspace Diffusion Refinement
Can a Large Language Model be a Gaslighter?
Text-to-Image Rectified Flow as Plug-and-Play Priors
Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Simple yet Effective Incomplete Multi-view Clustering: Similarity-level Imputation and Intra-view Hybrid-group Prototype Construction
pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training
SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents
HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation
Multi-objective Differentiable Neural Architecture Search
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
ThinK: Thinner Key Cache by Query-Driven Pruning
Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
Improved Diffusion-based Generative Model with Better Adversarial Robustness
Physics-Informed Diffusion Models
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
DocMIA: Document-Level Membership Inference Attacks against DocVQA Models
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings
Efficient Neuron Segmentation in Electron Microscopy by Affinity-Guided Queries
STAR: Synthesis of Tailored Architectures
Bridging the Gap Between f-divergences and Bayes Hilbert Spaces
Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
Do You Keep an Eye on What I Ask? Mitigating Multimodal Hallucination via Attention-Guided Ensemble Decoding
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Does Safety Training of LLMs Generalize to Semantically Related Natural Prompts?
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
Realistic Evaluation of Deep Partial-Label Learning Algorithms
Selective Task Group Updates for Multi-Task Optimization
Field-DiT: Diffusion Transformer on Unified Video, 3D, and Game Field Generation
Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics
Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
Re-Thinking Inverse Graphics With Large Language Models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Action abstractions for amortized sampling
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
On the Identification of Temporal Causal Representation with Instantaneous Dependence
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation
Deep Learning Alternatives Of The Kolmogorov Superposition Theorem
The adaptive complexity of parallelized log-concave sampling
NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields
Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior
CViT: Continuous Vision Transformer for Operator Learning
Training-Free Activation Sparsity in Large Language Models
Synthetic continued pretraining
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Robustness Inspired Graph Backdoor Defense
Why In-Context Learning Models are Good Few-Shot Learners?
Computational Explorations of Total Variation Distance
What Are Good Positional Encodings for Directed Graphs?
Multi-Label Test-Time Adaptation with Bound Entropy Minimization
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
Isometric Regularization for Manifolds of Functional Data
Advancing Graph Generation through Beta Diffusion
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Equivariant Masked Position Prediction for Efficient Molecular Representation
ReCogLab: a framework testing relational reasoning & cognitive hypotheses on LLMs
Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models
Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization
Learning to Discretize Denoising Diffusion ODEs
Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
Improved Algorithms for Kernel Matrix-Vector Multiplication Under Sparsity Assumptions
Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images
Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
DICE: Data Influence Cascade in Decentralized Learning
Tree of Attributes Prompt Learning for Vision-Language Models
Adversarial Generative Flow Network for Solving Vehicle Routing Problems
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
PEARL: Towards Permutation-Resilient LLMs
ColPali: Efficient Document Retrieval with Vision Language Models
Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
Homomorphism Expressivity of Spectral Invariant Graph Neural Networks
Gaussian Mixture Counterfactual Generator
Kolmogorov-Arnold Transformer
DEPT: Decoupled Embeddings for Pre-training Language Models
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Advancing Prompt-Based Methods for Replay-Independent General Continual Learning
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Unlearning or Obfuscating? Jogging the Memory of Unlearned LLMs via Benign Relearning
SimulPL: Aligning Human Preferences in Simultaneous Machine Translation
SFS: Smarter Code Space Search improves LLM Inference Scaling
AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space
On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth
AgentRefine: Enhancing Agent Generalization through Refinement Tuning
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Approximating Full Conformal Prediction for Neural Network Regression with Gauss-Newton Influence
Efficiently Parameterized Neural Metriplectic Systems
Near-Exact Privacy Amplification for Matrix Mechanisms
Differentially Private Federated Learning with Time-Adaptive Privacy Spending
Catastrophic Failure of LLM Unlearning via Quantization
Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond
4K4DGen: Panoramic 4D Generation at 4K Resolution
TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
Beyond Random Augmentations: Pretraining with Hard Views
Improving Convergence Guarantees of Random Subspace Second-order Algorithm for Nonconvex Optimization
Effective Interplay between Sparsity and Quantization: From Theory to Practice
PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
BAMDP Shaping: a Unified Theoretical Framework for Intrinsic Motivation and Reward Shaping
What Matters in Learning from Large-Scale Datasets for Robot Manipulation
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG
Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models
Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
OmniRe: Omni Urban Scene Reconstruction
Identifiability for Gaussian Processes with Holomorphic Kernels
FairDen: Fair Density-Based Clustering
MixEval-X: Any-to-any Evaluations from Real-world Data Mixture
See It from My Perspective: How Language Affects Cultural Bias in Image Understanding
Rethinking Artistic Copyright Infringements In the Era Of Text-to-Image Generative Models
Personalized Visual Instruction Tuning
Improving Uncertainty Estimation through Semantically Diverse Language Generation
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
PRDP: Progressively Refined Differentiable Physics
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Context-aware Dynamic Pruning for Speech Foundation Models
Forewarned is Forearmed: Harnessing LLMs for Data Synthesis via Failure-induced Exploration
Neuralized Markov Random Field for Interaction-Aware Stochastic Human Trajectory Prediction
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
Exact Certification of (Graph) Neural Networks Against Label Poisoning
RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
Dynamics of Concept Learning and Compositional Generalization
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
System 1.x: Learning to Balance Fast and Slow Planning with Language Models
CR-CTC: Consistency regularization on CTC for improved speech recognition
Adaptive Batch Size for Privately Finding Second-Order Stationary Points
Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
JetFormer: An autoregressive generative model of raw images and text
SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
Optimal Flow Transport and its Entropic Regularization: a GPU-friendly Matrix Iterative Algorithm for Flow Balance Satisfaction
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
SelKD: Selective Knowledge Distillation via Optimal Transport Perspective
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Diffusion Transformers for Tabular Data Time Series Generation
Density estimation with LLMs: a geometric investigation of in-context learning trajectories
Preference Elicitation for Offline Reinforcement Learning
Revisiting In-context Learning Inference Circuit in Large Language Models
Retrieval Head Mechanistically Explains Long-Context Factuality
Adversarial Mixup Unlearning
Language Models Need Inductive Biases to Count Inductively
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
ImageFolder: Autoregressive Image Generation with Folded Tokens
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment
Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity
Learning system dynamics without forgetting
ReAttention: Training-Free Infinite Context with Finite Attention Scope
Learning Mask Invariant Mutual Information for Masked Image Modeling
Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach
Improving Unsupervised Constituency Parsing via Maximizing Semantic Information
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
Large Language Models Often Say One Thing and Do Another
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Eliminating Position Bias of Language Models: A Mechanistic Approach
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
ACTIVE: Offline Reinforcement Learning via Adaptive Imitation and In-sample $V$-Ensemble
Query-based Knowledge Transfer for Heterogeneous Learning Environments
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Advancing LLM Reasoning Generalists with Preference Trees
Efficient Biological Data Acquisition through Inference Set Design
How new data permeates LLM knowledge and how to dilute it
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Hierarchical Uncertainty Estimation for Learning-based Registration in Neuroimaging
Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery
Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Aligning Visual Contrastive learning models via Preference Optimization
AlphaEdit: Null-Space Constrained Model Editing for Language Models
NextBestPath: Efficient 3D Mapping of Unseen Environments
Large Language Models can Become Strong Self-Detoxifiers
Improving Long-Text Alignment for Text-to-Image Diffusion Models
AdaFisher: Adaptive Second Order Optimization via Fisher Information
ImProver: Agent-Based Automated Proof Optimization
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
SIMPL: Scalable and hassle-free optimisation of neural representations from behaviour
RocketEval: Efficient automated LLM evaluation via grading checklist
MuPT: A Generative Symbolic Music Pretrained Transformer
IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Can Reinforcement Learning Solve Asymmetric Combinatorial-Continuous Zero-Sum Games?
To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
Resolution Attack: Exploiting Image Compression to Deceive Deep Neural Networks
Cauchy-Schwarz Regularizers
Alchemy: Amplifying Theorem-Proving Capability Through Symbolic Mutation
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
A CLIP-Powered Framework for Robust and Generalizable Data Selection
Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
A new framework for evaluating model out-of-distribution generalisation for the biochemical domain
TODO: Enhancing LLM Alignment with Ternary Preferences
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
ContextGNN: Beyond Two-Tower Recommendation Systems
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
The Computational Complexity of Positive Non-Clashing Teaching in Graphs
A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
Data Unlearning in Diffusion Models
Strength Estimation and Human-Like Strength Adjustment in Games
Wavelet Diffusion Neural Operator
Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression
FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
Mentored Learning: Improving Generalization and Convergence of Student Learner
Going Beyond Static: Understanding Shifts with Time-Series Attribution
SysBench: Can LLMs Follow System Message?
Fast unsupervised ground metric learning with tree-Wasserstein distance
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
Exposure Bracketing Is All You Need For A High-Quality Image
Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
Nesterov acceleration in benignly non-convex landscapes
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning
Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research
Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks
CheapNet: Cross-attention on Hierarchical representations for Efficient protein-ligand binding Affinity Prediction
Addressing Label Shift in Distributed Learning via Entropy Regularization
Not-So-Optimal Transport Flows for 3D Point Cloud Generation
Decoupled Subgraph Federated Learning
Learning to Explore and Exploit with GNNs for Unsupervised Combinatorial Optimization
MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder
TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
Adaptive backtracking for faster optimization
To Code or Not To Code? Exploring Impact of Code in Pre-training
Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation
Rethinking Light Decoder-based Solvers for Vehicle Routing Problems
A transfer learning framework for weak to strong generalization
A Benchmark for Semantic Sensitive Information in LLMs Outputs
PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Models trained with unnormalized density functions: A need for a course correction
Predicate Hierarchies Improve Few-Shot State Classification
Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow
Bayesian Image Regression with Soft-thresholded Conditional Autoregressive Prior
LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Learning to engineer protein flexibility
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
LLM Unlearning via Loss Adjustment with Only Forget Data
Controlling Language and Diffusion Models by Transporting Activations
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Debiasing Federated Learning with Correlated Client Participation
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Optimization with Access to Auxiliary Information
Transformer Learns Optimal Variable Selection in Group-Sparse Classification
Improving Equivariant Networks with Probabilistic Symmetry Breaking
DarkBench: Benchmarking Dark Patterns in Large Language Models
Evidential Learning-based Certainty Estimation for Robust Dense Feature Matching
Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Understanding Optimization in Deep Learning with Central Flows
On the Role of Attention Heads in Large Language Model Safety
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
A Periodic Bayesian Flow for Material Generation
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
Towards Calibrated Deep Clustering Network
RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Non-myopic Generation of Language Models for Reasoning and Planning
Information Theoretic Text-to-Image Alignment
Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
Flow-based Variational Mutual Information: Fast and Flexible Approximations
RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models
TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Wasserstein Distances, Neuronal Entanglement, and Sparsity
TIPS: Text-Image Pretraining with Spatial awareness
Generalized Behavior Learning from Diverse Demonstrations
Fair Submodular Cover
Pareto Prompt Optimization
Fair Clustering in the Sliding Window Model
Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension ability
On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning
Learning-Augmented Search Data Structures
Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models
Provable weak-to-strong generalization via benign overfitting
Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
On the Price of Differential Privacy for Hierarchical Clustering
Anyprefer: An Agentic Framework for Preference Data Synthesis
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
Neural Functions for Learning Periodic Signal
Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models
GRAIN: Exact Graph Reconstruction from Gradients
Steering Protein Family Design through Profile Bayesian Flow
Ward: Provable RAG Dataset Inference via LLM Watermarks
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Automated Design of Agentic Systems
Language models scale reliably with over-training and on downstream tasks
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments
Prevalence of Negative Transfer in Continual Reinforcement Learning: Analyses and a Simple Baseline
BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
MagicPIG: LSH Sampling for Efficient LLM Generation
Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy
Scaling Long Context Training Data by Long-Distance Referrals
Should VLMs be Pre-trained with Image Data?
Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
Identification of Intermittent Temporal Latent Process
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning
Enhancing Federated Domain Adaptation with Multi-Domain Prototype-Based Federated Fine-Tuning
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
Large Language Models are Interpretable Learners
The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
Denoising Autoregressive Transformers for Scalable Text-to-Image Generation
Diffusing States and Matching Scores: A New Framework for Imitation Learning
DRoP: Distributionally Robust Data Pruning
Is Your Multimodal Language Model Oversensitive to Safe Queries?
Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
VCR: Pixel-Level Complex Reasoning by Restoring Occluded Text
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion
Point-based Instance Completion with Scene Constraints
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
SFESS: Score Function Estimators for $k$-Subset Sampling
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval
InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
Rethinking Shapley Value for Negative Interactions in Non-convex Games
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Halton Scheduler for Masked Generative Image Transformer
Normed Spaces for Graph Embedding
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
Towards Federated RLHF with Aggregated Client Preference for LLMs
Flow With What You Know
Intrinsic User-Centric Interpretability through Global Mixture of Experts
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Unsupervised Multiple Kernel Learning for Graphs via Ordinality Preservation
Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
AtomSurf: Surface Representation for Learning on Protein Structures
Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing
Is uniform expressivity too restrictive? Towards efficient expressivity of GNNs
Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
A Unifying Framework for Representation Learning
Backdooring Vision-Language Models with Out-Of-Distribution Data
Mechanistic Interpretability Meets Vision Language Models: Insights and Limitations
Union-over-Intersections: Object Detection beyond Winner-Takes-All
DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation
Video Action Differencing
ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences
Jamba: Hybrid Transformer-Mamba Language Models
Generating Freeform Endoskeletal Robots
Inverse Scaling: When Bigger Isn't Better
Geometry of Long-Tailed Representation Learning: Rebalancing Features for Skewed Distributions
Edge Prompt Tuning for Graph Neural Networks
Artificial Kuramoto Oscillatory Neurons
metabench - A Sparse Benchmark of Reasoning and Knowledge in Large Language Models
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Improved Sampling Of Diffusion Models In Fluid Dynamics With Tweedie's Formula
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Confidence Elicitation: A New Attack Vector for Large Language Models
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
E-Valuating Classifier Two-Sample Tests
Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs with Semantic Space
Discretization-invariance? On the Discretization Mismatch Errors in Neural Operators
Revisiting Large-Scale Non-convex Distributionally Robust Optimization
Robust-PIFu: Robust Pixel-aligned Implicit Function for 3D Human Digitalization from a Single Image
Improving Data Efficiency via Curating LLM-Driven Rating Systems
CONGO: Compressive Online Gradient Optimization
StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment
Soft Merging of Experts with Adaptive Routing
Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents
Asymmetric Factorized Bilinear Operation for Vision Transformer
Unifying Causal Representation Learning with the Invariance Principle
Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
MambaExtend: A Training-Free Approach to Improve Long Context Extension of Mamba
First-Person Fairness in Chatbots
Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses
NeurFlow: Interpreting Neural Networks through Neuron Groups and Functional Interactions
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing
PooDLe🐩: Pooled and dense self-supervised learning from naturalistic videos
Planning in Natural Language Improves LLM Search for Code Generation
L3Ms — Lagrange Large Language Models
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Sketching for Convex and Nonconvex Regularized Least Squares with Sharp Guarantees
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts
ML4TSPBench: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search
On Evaluating the Durability of Safeguards for Open-Weight LLMs
Scalable Extraction of Training Data from Aligned, Production Language Models
Composable Interventions for Language Models
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
Epistemic Monte Carlo Tree Search
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
PIN: Prolate Spheroidal Wave Function-based Implicit Neural Representations
Taming Overconfidence in LLMs: Reward Calibration in RLHF
h4rm3l: A Language for Composable Jailbreak Attack Synthesis
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Shared-AE: Automatic Identification of Shared Subspaces in High-dimensional Neural and Behavioral Activity
Operator Deep Smoothing for Implied Volatility
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning
Persistent Pre-training Poisoning of LLMs
Adding Conditional Control to Diffusion Models with Reinforcement Learning
GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks
Metalic: Meta-Learning In-Context with Protein Language Models
Unlocking Global Optimality in Bilevel Optimization: A Pilot Study
Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models
Fast and Accurate Blind Flexible Docking
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
AutoEval: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Group Downsampling with Equivariant Anti-aliasing
PersonalLLM: Tailoring LLMs to Individual Preferences
Mixture of Attentions For Speculative Decoding
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Towards counterfactual fairness through auxiliary variables
A Theory of Initialisation's Impact on Specialisation
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
Federated Residual Low-Rank Adaption of Large Language Models
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
One Hundred Neural Networks and Brains Watching Videos: Lessons from Alignment
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion
On the Importance of Language-driven Representation Learning for Heterogeneous Federated Learning
Lightweight Predictive 3D Gaussian Splats
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
Multi-Robot Motion Planning with Diffusion Models
DELTA: DENSE EFFICIENT LONG-RANGE 3D TRACKING FOR ANY VIDEO
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection
DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from a Single Demo
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Comparing noisy neural population dynamics using optimal transport distances
Dissecting Adversarial Robustness of Multimodal LM Agents
Minimal Impact ControlNet: Advancing Multi-ControlNet Integration
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
The Breakdown of Gaussian Universality in Classification of High-dimensional Linear Factor Mixtures
Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
Continuous Ensemble Weather Forecasting with Diffusion models
Scalable Mechanistic Neural Networks
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
Effective post-training embedding compression via temperature control in contrastive training
Collapsed Language Models Promote Fairness
Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional Data Assimilation with Sparse Observation Data
Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
HERO: Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning
On the Crucial Role of Initialization for Matrix Factorization
SSOLE: Rethinking Orthogonal Low-rank Embedding for Self-Supervised Learning
Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation
SELF-EVOLVED REWARD LEARNING FOR LLMS
MIND over Body: Adaptive Thinking using Dynamic Computation
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Faster, More Efficient RLHF through Off-Policy Asynchronous Learning
Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction
Language Guided Skill Discovery
Certifying Counterfactual Bias in LLMs
Unsupervised Model Tree Heritage Recovery
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
Mechanistic Permutability: Match Features Across Layers
Deep Linear Probe Generators for Weight Space Learning
Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models
Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
RuAG: Learned-rule-augmented Generation for Large Language Models
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
Mixture Compressor for Mixture-of-Experts LLMs Gains More
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Agents' Room: Narrative Generation through Multi-step Collaboration
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Separation Power of Equivariant Neural Networks
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector
RaSA: Rank-Sharing Low-Rank Adaptation
ToolGen: Unified Tool Retrieval and Calling via Generation
PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems
Do Large Language Models Truly Understand Geometric Structures?
Adam-mini: Use Fewer Learning Rates To Gain More
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
Handling Delay in Real-Time Reinforcement Learning
A Causal Lens for Learning Long-term Fair Policies
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
CameraCtrl: Enabling Camera Control for Video Diffusion Models
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers
Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling
Weighted Point Set Embedding for Multimodal Contrastive Learning Toward Optimal Similarity Metric
Explore Theory of Mind: program-guided adversarial data generation for theory of mind reasoning
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Shedding Light on Time Series Classification using Interpretability Gated Networks
Self-Play Preference Optimization for Language Model Alignment
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
Flow matching achieves almost minimax optimal convergence
Unlocking Point Processes through Point Set Diffusion
GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation
PIORF: Physics-Informed Ollivier-Ricci Flow for Long–Range Interactions in Mesh Graph Neural Networks
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning
GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
AI2TALE: An Innovative Information Theory-based Approach for Learning to Localize Phishing Attacks
MELODI: Exploring Memory Compression for Long Contexts
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
Selective Attention Improves Transformer
Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
TopoNets: High performing vision and language models with brain-like topography
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
Direct Distributional Optimization for Provable Alignment of Diffusion Models
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees
BodyGen: Advancing Towards Efficient Embodiment Co-Design
SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations
PICASO: Permutation-Invariant Context Composition with State Space Models
Gradient-Free Generation for Hard-Constrained Systems
Solving New Tasks by Adapting Internet Video Knowledge
ScImage: How good are multimodal large language models at scientific text-to-image generation?
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
From Search to Sampling: Generative Models for Robust Algorithmic Recourse
SymmetricDiffusers: Learning Discrete Diffusion Models over Finite Symmetric Groups
Intelligence at the Edge of Chaos
ParaSolver: A Hierarchical Parallel Integral Solver for Diffusion Models
Sensitivity Verification for Additive Decision Tree Ensembles
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
Language Agents Meet Causality -- Bridging LLMs and Causal World Models
Disentangling 3D Animal Pose Dynamics with Scrubbed Conditional Latent Variables
Data Center Cooling System Optimization Using Offline Reinforcement Learning
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
Image Watermarks are Removable using Controllable Regeneration from Clean Noise
Linear Representations of Political Perspective Emerge in Large Language Models
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction
Event-Driven Online Vertical Federated Learning
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap
On the Benefits of Attribute-Driven Graph Domain Adaptation
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Multi-Task Dense Predictions via Unleashing the Power of Diffusion
Transformers Struggle to Learn to Search
Towards Explaining the Power of Constant-depth Graph Neural Networks for Structured Linear Programming
Proxy Denoising for Source-Free Domain Adaptation
Test-time Alignment of Diffusion Models without Reward Over-optimization
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
Attention with Markov: A Curious Case of Single-layer Transformers
Chunk-Distilled Language Modeling
Efficient Reinforcement Learning with Large Language Model Priors
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
RegMix: Data Mixture as Regression for Language Model Pre-training
Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Accelerating Diffusion Transformers with Token-wise Feature Caching
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
LASER: A Neuro-Symbolic Framework for Learning Spatio-Temporal Scene Graphs with Weak Supervision
PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
SpaceGNN: Multi-Space Graph Neural Network for Node Anomaly Detection with Extremely Limited Labels
Can LLMs Solve Longer Math Word Problems Better?
Biologically Plausible Brain Graph Transformer
On Linear Representations and Pretraining Data Frequency in Language Models
Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning
Boltzmann Semantic Score: A Semantic Metric for Evaluating Large Vision Models Using Large Language Models
LICO: Large Language Models for In-Context Molecular Optimization
NetFormer: An interpretable model for recovering dynamical connectivity in neuronal population dynamics
Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing
DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle
Weighted-Reward Preference Optimization for Implicit Model Fusion
Reasoning with Latent Thoughts: On the Power of Looped Transformers
Simple Guidance Mechanisms for Discrete Diffusion Models
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations
Bilinear MLPs enable weight-based mechanistic interpretability
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Near, far: Patch-ordering enhances vision foundation models' scene understanding
Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
Faster Cascades via Speculative Decoding
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
Efficient stagewise pretraining via progressive subnetworks
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Better autoregressive regression with LLMs via regression-aware fine-tuning
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
Exploring the Camera Bias of Person Re-identification
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory
Sort-free Gaussian Splatting via Weighted Sum Rendering
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Active Task Disambiguation with LLMs
Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
Efficient and Context-Aware Label Propagation for Zero-/Few-Shot Training-Free Adaptation of Vision-Language Model
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data
Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
Learning Gain Map for Inverse Tone Mapping
Expected Sliced Transport Plans
Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance
Selective Label Enhancement Learning for Test-Time Adaptation
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
A Computational Framework for Modeling Emergence of Color Vision in the Human Brain
Neuron Platonic Intrinsic Representation From Dynamics Using Contrastive Learning
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
Spiking Vision Transformer with Saccadic Attention
How efficient is LLM-generated code? A rigorous & high-standard benchmark
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
PWM: Policy Learning with Multi-Task World Models
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs
EgoSim: Egocentric Exploration in Virtual Worlds with Multi-modal Conditioning
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
Rethinking and Improving Autoformalization: Towards a Faithful Metric and a Dependency Retrieval-based Approach
Simplifying, Stabilizing and Scaling Continuous-time Consistency Models
Feedback Favors the Generalization of Neural ODEs
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Ask, and it shall be given: On the Turing completeness of prompting
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression
IgGM: A Generative Model for Functional Antibody and Nanobody Design
MAST: model-agnostic sparsified training
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models
Causal Information Prioritization for Efficient Reinforcement Learning
Copyright-Protected Language Generation via Adaptive Model Fusion
Herald: A Natural Language Annotated Lean 4 Dataset
Quamba: A Post-Training Quantization Recipe for Selective State Space Models
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
A General Framework for Off-Policy Learning with Partially-Observed Reward
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning
Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression
Learning to Select Nodes in Branch and Bound with Sufficient Tree Representation
LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems
INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning
Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation
Probabilistic Geometric Principal Component Analysis with application to neural data
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Towards a Complete Logical Framework for GNN Expressiveness
Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning
DCT-CryptoNets: Scaling Private Inference in the Frequency Domain
FlashRNN: I/O-Aware Optimization of Traditional RNNs on modern hardware
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions
AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption
PaPaGei: Open Foundation Models for Optical Physiological Signals
Semantic Aware Representation Learning for Lifelong Learning
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Conformal Prediction Sets Can Cause Disparate Impact
GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack
Minimalistic Predictions for Online Class Constraint Scheduling
Linear Partial Gromov-Wasserstein Embedding
Forgetting Transformer: Softmax Attention with a Forget Gate
Energy-based Backdoor Defense Against Federated Graph Learning
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning
Partial Gromov-Wasserstein Metric
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Steering LLMs' Behavior with Concept Activation Vectors
InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization
BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Cancer Bone Metastasis Diagnosis and Prognosis
Compute-Optimal LLMs Provably Generalize Better with Scale
Ranking-aware adapter for text-driven image ordering with CLIP
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
VideoGLUE: Video General Understanding Evaluation of Foundation Models
Looped Transformers for Length Generalization
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
A Truncated Newton Method for Optimal Transport
Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information
KinFormer: Generalizable Dynamical Symbolic Regression for Catalytic Organic Reaction Kinetics
Diffusion-based Neural Network Weights Generation
A Simple Approach to Unifying Diffusion-based Conditional Generation
HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
Learning 3D Perception from Others' Predictions
A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
Learning Partial Graph Matching via Optimal Partial Transport
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Standardizing Structural Causal Models
Compositional Entailment Learning for Hyperbolic Vision-Language Models
COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
Revisiting Random Walks for Learning on Graphs
Beyond single neurons: population response geometry in digital twins of mouse visual cortex
MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
Machine Unlearning Fails to Remove Data Poisoning Attacks
The Hidden Cost of Waiting for Accurate Predictions
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
Certified Robustness Under Bounded Levenshtein Distance
Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations
How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
Generating Physical Dynamics under Priors
Emergence of meta-stable clustering in mean-field transformer models
Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Lean-STaR: Learning to Interleave Thinking and Proving
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting
Open-Source vs Close-Source: The Context Utilization Challenge
SBSC: Step-by-Step Coding for Improving Mathematical Olympiad Performance
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
Differentiable Optimization of Similarity Scores Between Models and Brains
Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
Language Model Alignment in Multilingual Trolley Problems
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
OS-ATLAS: Foundation Action Model for Generalist GUI Agents
Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
Diffusion Models are Evolutionary Algorithms
Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
Learning Structured Universe Graph with Outlier OOD Detection for Partial Matching
Not All Language Model Features Are One-Dimensionally Linear
Strong Model Collapse
Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness
Adaptive Gradient Clipping for Robust Federated Learning
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
Rethinking Graph Prompts: Unraveling the Power of Data Manipulation in Graph Neural Networks
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Robust System Identification: Finite-sample Guarantees and Connection to Regularization
OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes
Finding and Only Finding Differential Nash Equilibria by Both Pretending to be a Follower
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
RRM: Robust Reward Model Training Mitigates Reward Hacking
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Protecting against simultaneous data poisoning attacks
Robust Conformal Prediction with a Single Binary Certificate
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code
Improved Sampling Algorithms for Lévy-Itô Diffusion Models
LeanVec: Searching vectors faster by making them fit
Understanding Fairness Surrogate Functions in Algorithmic Fairness
On the Inherent Privacy Properties of Discrete Denoising Diffusion Models
Unlocking Guidance for Discrete State-Space Diffusion and Flow Models
Agree to Disagree: Demystifying Homogeneous Deep Ensembles through Distributional Equivalence
What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context
GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
A Statistical Approach for Controlled Training Data Detection
Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator
Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition
From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford Algebra and Convexity
Segment Any 3D Object with Language
Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
Forget the Data and Fine-Tuning! Just Fold the Network to Compress
LoRA Learns Less and Forgets Less
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
Hessian Free Efficient Single Loop Iterative Differentiation Methods for Bi-Level Optimization Problems
Transformer Encoder Satisfiability: Complexity and Impact on Formal Reasoning
Equivariant Symmetry Breaking Sets
Robustness Auditing for Linear Regression: To Singularity and Beyond
Reward Guided Latent Consistency Distillation
Linear Mode Connectivity in Differentiable Tree Ensembles
Efficient Cross-Episode Meta-RL
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Learning Regularized Graphon Mean-Field Games with Unknown Graphons
STAMP: Scalable Task- And Model-agnostic Collaborative Perception
Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
MGCFNN: A Neural MultiGrid Solver with Novel Fourier Neural Network for High Wave Number Helmholtz Equations
Measuring memorization in RLHF for code completion
HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction
A Curious Case of the Missing Measure: Better Scores and Worse Generation
PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization
ESE: Espresso Sentence Embeddings
cryoSPHERE: Single-Particle HEterogeneous REconstruction from cryo EM
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Understanding Model Calibration - A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)
“I Am the One and Only, Your Cyber BFF”: Understanding the Impact of GenAI Requires Understanding the Impact of Anthropomorphic AI
Restating the Proof of Linear Convergence for Linear GNNs
A Visual Dive into Conditional Flow Matching
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning
Positional Embeddings in Transformer Models: Evolution from Text to Vision Domains
Analysing The Spectral Biases in Generative Models
How to visualize training dynamics in neural networks
Flaws of ImageNet, Computer Vision's Favourite Dataset
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Building Blocks of Differentially Private Training
A primer on analytical learning dynamics of nonlinear neural networks
Robustness Reprogramming for Representation Learning
Vision-LSTM: xLSTM as Generic Vision Backbone
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
Fine-Tuning Token-Based Large Multimodal Models: What Works, What Doesn’t and What's Next
Test-time Adaptation for Regression by Subspace Alignment
3D Vision-Language Gaussian Splatting
Learning the Optimal Stopping for Early Classification within Finite Horizons via Sequential Probability Ratio Test
Lost in Prediction: Why Social Media Narratives Don't Help Macroeconomic Forecasting?
Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense
Holistically Evaluating the Environmental Impact of Creating Language Models
Watermark Anything With Localized Messages
HOPE for a Robust Parameterization of Long-memory State Space Models
GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting
Neural Eulerian Scene Flow Fields
CoInD: Enabling Logical Compositions in Diffusion Models
Probing the Latent Hierarchical Structure of Data via Diffusion Models
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Variational Bayesian Pseudo-Coreset
The Computational Complexity of Circuit Discovery for Inner Interpretability
Do Deep Neural Network Solutions Form a Star Domain?
Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation
Edge-aware Image Smoothing with Relative Wavelet Domain Representation
Toward Exploratory Inverse Constraint Inference with Generative Diffusion Verifiers
Uni-Sign: Toward Unified Sign Language Understanding at Scale
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
Mufu: Multilingual Fused Learning for Low-Resource Translation with LLM
Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs
ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels
Incremental Causal Effect for Time to Treatment Initialization
Image and Video Tokenization with Binary Spherical Quantization
A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
Monitoring Latent World States in Language Models with Propositional Probes
Factor Graph-based Interpretable Neural Networks
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Erasing Concept Combination from Text-to-Image Diffusion Model
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
On the Fourier analysis in the SO(3) space: the EquiLoPO Network
Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
Noise-conditioned Energy-based Annealed Rewards (NEAR): A Generative Framework for Imitation Learning from Observation
Generalized Principal-Agent Problem with a Learning Agent
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs
PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
Zero-cost Proxy for Adversarial Robustness Evaluation
Revisiting Mode Connectivity in Neural Networks with Bezier Surface
TexTailor: Customized Text-aligned Texturing via Effective Resampling
Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
STAFF: Speculative Coreset Selection for Task-Specific Fine-tuning
Linear Recurrences Accessible to Everyone
Classic but Everlasting: Traditional Gradient-Based Algorithms Converges Fast Even in Time-Varying Multi-Player Games
Aligned Better, Listen Better For Audio-Visual Large Language Models
Implicit Neural Surface Deformation with Explicit Velocity Fields
GenXD: Generating Any 3D and 4D Scenes
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation
Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control
Long-Sequence Recommendation Models Need Decoupled Embeddings
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation
Representational Similarity via Interpretable Visual Concepts
DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL
Long Context Compression with Activation Beacon
Singular Subspace Perturbation Bounds via Rectangular Random Matrix Diffusions
Capability Localization: Capabilities Can be Localized rather than Individual Knowledge
ProtPainter: Draw or Drag Protein via Topology-guided Diffusion
Natural Language Inference Improves Compositionality in Vision-Language Models
Policy Design in Long-run Welfare Dynamics
Bayesian Optimization via Continual Variational Last Layer Training
Federated $Q$-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost
Efficient and Robust Neural Combinatorial Optimization via Wasserstein-Based Coresets
REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Think while You Generate: Discrete Diffusion with Planned Denoising
Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage
Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction
Adaptive Camera Sensor for Vision Models
Differentially private learners for heterogeneous treatment effects
Neuroplastic Expansion in Deep Reinforcement Learning
Provable Convergence Bounds for Hybrid Dynamical Sampling and Optimization
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
L-WISE: Boosting human visual category learning through model-based image selection and enhancement
A General Framework for Producing Interpretable Semantic Text Embeddings
Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
What should a neuron aim for? Designing local objective functions based on information theory
EqNIO: Subequivariant Neural Inertial Odometry
A deep inverse-mapping model for a flapping robotic wing
General Scene Adaptation for Vision-and-Language Navigation
Intermediate Layer Classifiers for OOD generalization
How Gradient descent balances features: A dynamical analysis for two-layer neural networks
Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics
PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
Differentiable Causal Discovery for Latent Hierarchical Causal Models
SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen
Metamizer: A Versatile Neural Optimizer for Fast and Accurate Physics Simulations
Generative Representational Instruction Tuning
Language-Assisted Feature Transformation for Anomaly Detection
Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
Few-Class Arena: A Benchmark for Efficient Selection of Vision Models and Dataset Difficulty Measurement
Compositional simulation-based inference for time series
Algorithmic Stability Based Generalization Bounds for Adversarial Training
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
Lossy Compression with Pretrained Diffusion Models
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
From Commands to Prompts: LLM-based Semantic File System for AIOS
CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
Gradient correlation is a key ingredient to accelerate SGD with momentum
On a Connection Between Imitation Learning and RLHF
Long-Context Linear System Identification
RouteLLM: Learning to Route LLMs from Preference Data
Has the Deep Neural Network learned the Stochastic Process? An Evaluation Viewpoint
Solving Differential Equations with Constrained Learning
SelectFormer: Private and Practical Data Selection for Transformers
Conformal Language Model Reasoning with Coherent Factuality
Generating Graphs via Spectral Diffusion
HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
Mixture of In-Context Prompters for Tabular PFNs
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Streamlining Prediction in Bayesian Deep Learning
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
ParetoFlow: Guided Flows in Multi-Objective Optimization
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Trajectory attention for fine-grained video motion control
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
Large Language Models Assume People are More Rational than We Really are
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning
Federated Domain Generalization with Data-free On-server Matching Gradient
N-ForGOT: Towards Not-forgetting and Generalization of Open Temporal Graph Learning
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
Interpretable Causal Representation Learning for Biological Data in the Pathway Space
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
From Tokens to Words: On the Inner Lexicon of LLMs
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models
Group Distributionally Robust Dataset Distillation with Risk Minimization
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Procedural Synthesis of Synthesizable Molecules
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
Better Instruction-Following Through Minimum Bayes Risk
KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks
Revisiting a Design Choice in Gradient Temporal Difference Learning
FlowDec: A flow-based full-band general audio codec with high perceptual quality
LiFT: Learning to Fine-Tune via Bayesian Parameter Efficient Meta Fine-Tuning
LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
Doubly Optimal Policy Evaluation for Reinforcement Learning
ToolACE: Winning the Points of LLM Function Calling
On Speeding Up Language Model Evaluation
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
SPA-BENCH: A COMPREHENSIVE BENCHMARK FOR SMARTPHONE AGENT EVALUATION
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
DiffPuter: An EM-Driven Diffusion Model for Missing Data Imputation
Visually Consistent Hierarchical Image Classification
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning
ConMix: Contrastive Mixup at Representation Level for Long-tailed Deep Clustering
MaestroMotif: Skill Design from Artificial Intelligence Feedback
Simulating Human-like Daily Activities with Desire-driven Autonomy
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Extendable and Iterative Structure Learning Strategy for Bayesian Networks
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
OMG: Opacity Matters in Material Modeling with Gaussian Splatting
Input Space Mode Connectivity in Deep Neural Networks
Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
Machine Unlearning via Simulated Oracle Matching
On the Modeling Capabilities of Large Language Models for Sequential Decision Making
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Tailoring Mixup to Data for Calibration
Mixture-of-Agents Enhances Large Language Model Capabilities
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies
ALLaM: Large Language Models for Arabic and English
Elliptic Loss Regularization
An Intelligent Agentic System for Complex Image Restoration Problems
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation
On Quantizing Neural Representation for Variable-Rate Video Coding
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
FedTMOS: Efficient One-Shot Federated Learning with Tsetlin Machine
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere
FreDF: Learning to Forecast in the Frequency Domain
Training-Free Message Passing for Learning on Hypergraphs
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
Robust Simulation-Based Inference under Missing Data via Neural Processes
3DIS: Depth-Driven Decoupled Image Synthesis for Universal Multi-Instance Generation
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
AdaGrad under Anisotropic Smoothness
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming
kNN Attention Demystified: A Theoretical Exploration for Scalable Transformers
Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows
RankSHAP: Shapley Value Based Feature Attributions for Learning to Rank
Graph Assisted Offline-Online Deep Reinforcement Learning for Dynamic Workflow Scheduling
DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
Global Identifiability of Overcomplete Dictionary Learning via L1 and Volume Minimization
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits
DUALFormer: Dual Graph Transformer
CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning
On the Completeness of Invariant Geometric Deep Learning Models
Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
COME: Test-time Adaption by Conservatively Minimizing Entropy
Number Cookbook: Number Understanding of Language Models and How to Improve It
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
Proximal Mapping Loss: Understanding Loss Functions in Crowd Counting & Localization
Language-Image Models with 3D Understanding
Binary Losses for Density Ratio Estimation
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
Discovering Temporally Compositional Neural Manifolds with Switching Infinite GPFA
How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations
PnP-Flow: Plug-and-Play Image Restoration with Flow Matching
InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
Variational Search Distributions
Descent with Misaligned Gradients and Applications to Hidden Convexity
Can One Modality Model Synergize Training of Other Modality Models?
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding
Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
AutoBencher: Towards Declarative Benchmark Construction
Breaking Neural Network Scaling Laws with Modularity
GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
Captured by Captions: On Memorization and its Mitigation in CLIP Models
Steering Large Language Models between Code Execution and Textual Reasoning
Multi-Dimensional Conformal Prediction
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning
COPER: Correlation-based Permutations for Multi-View Clustering
Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
Risk-Controlling Model Selection via Guided Bayesian Optimization
Learning under Temporal Label Noise
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
KooNPro: A Variance-Aware Koopman Probabilistic Model Enhanced by Neural Process for Time Series Forecasting
Convex Formulations for Training Two-Layer ReLU Neural Networks
Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs
VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems
Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion
HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
Expressivity of Neural Networks with Random Weights and Learned Biases
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
Cross-Entropy Is All You Need To Invert the Data Generating Process
GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching
Regret-Optimal List Replicable Bandit Learning: Matching Upper and Lower Bounds
Privacy Auditing of Large Language Models
Physics-Informed Deep Inverse Operator Networks for Solving PDE Inverse Problems
Complementary Label Learning with Positive Label Guessing and Negative Label Enhancement
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
Approximation algorithms for combinatorial optimization with predictions
dEBORA: Efficient Bilevel Optimization-based low-Rank Adaptation
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Model merging with SVD to tie the Knots
Repurposing in AI: A Distinct Approach or an Extension of Creative Problem Solving?
Scaling up the Banded Matrix Factorization Mechanism for Large Scale Differentially Private ML
GeoLoRA: Geometric integration for parameter efficient fine-tuning
Curriculum-aware Training for Discriminating Molecular Property Prediction Models
Concept Bottleneck Large Language Models
Variational Diffusion Posterior Sampling with Midpoint Guidance
Preference Diffusion for Recommendation
MotherNet: Fast Training and Inference via Hyper-Network Transformers
MaskBit: Embedding-free Image Generation via Bit Tokens
NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions
R2Det: Exploring Relaxed Rotation Equivariance in 2D Object Detection
MTSAM: Multi-Task Fine-Tuning for Segment Anything Model
Exploring Learning Complexity for Efficient Downstream Dataset Pruning
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Pitfalls of Evidence-Based AI Policy
Layerwise Recurrent Router for Mixture-of-Experts
The Loss Landscape of Deep Linear Neural Networks: a Second-order Analysis
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
FedLWS: Federated Learning with Adaptive Layer-wise Weight Shrinking
UniGEM: A Unified Approach to Generation and Property Prediction for Molecules
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
High-dimension Prototype is a Better Incremental Object Detection Learner
Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach
RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver
Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications
Benign Overfitting in Out-of-Distribution Generalization of Linear Models
Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration
ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
Dynamic Negative Guidance of Diffusion Models
Divergence-enhanced Knowledge-guided Context Optimization for Visual-Language Prompt Tuning
Beyond Random Masking: When Dropout meets Graph Convolutional Networks
A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
Towards Foundation Models for Mixed Integer Linear Programming
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?
Bad-PFL: Exploiting Backdoor Attacks against Personalized Federated Learning
Graph Sparsification via Mixture of Graphs
Regretful Decisions under Label Noise
Minimax Optimal Reinforcement Learning with Quasi-Optimism
Strong Preferences Affect the Robustness of Preference Models and Value Alignment
Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization
Lasso Bandit with Compatibility Condition on Optimal Arm
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models
Bandit Learning in Matching Markets with Indifference
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
Real-time design of architectural structures with differentiable mechanics and neural networks
Reflective Gaussian Splatting
Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
Multi-session, multi-task neural decoding from distinct cell-types and brain regions
Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
Boosting Latent Diffusion with Perceptual Objectives
Examining Alignment of Large Language Models through Representative Heuristics: the case of political stereotypes
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Logically Consistent Language Models via Neuro-Symbolic Integration
Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster
Lie Algebra Canonicalization: Equivariant Neural Operators under arbitrary Lie Groups
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Entropy-based Activation Function Optimization: A Method on Searching Better Activation Functions
Learning Dynamics of LLM Finetuning
Learn Your Reference Model for Real Good Alignment
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Heavy-Tailed Diffusion Models
Causal Effect Estimation with Mixed Latent Confounders and Post-treatment Variables
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
AnoLLM: Large Language Models for Tabular Anomaly Detection
ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge
Graph Neural Ricci Flow: Evolving Feature from a Curvature Perspective
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Self-Updatable Large Language Models by Integrating Context into Model Parameters
AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
Object-Centric Pretraining via Target Encoder Bootstrapping
A Black Swan Hypothesis: The Role of Human Irrationality in AI Safety
MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark
Self-Boosting Large Language Models with Synthetic Preference Data
Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Stiefel Flow Matching for Moment-Constrained Structure Elucidation
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Optimal Brain Apoptosis
Efficient Off-Policy Learning for High-Dimensional Action Spaces
Interactive Adjustment for Human Trajectory Prediction with Individual Feedback
GDrag: Towards General-Purpose Interactive Editing with Anti-ambiguity Point Diffusion
A Stochastic Approach to the Subset Selection Problem via Mirror Descent
Diversity-Rewarded CFG Distillation
Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources
Forking Paths in Neural Text Generation
BOND: Aligning LLMs with Best-of-N Distillation
Progressive Compositionality in Text-to-Image Generative Models
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
GameGen-X: Interactive Open-world Game Video Generation
LoLCATs: On Low-Rank Linearizing of Large Language Models
Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation
ViSAGe: Video-to-Spatial Audio Generation
Quality Measures for Dynamic Graph Generative Models
TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics
Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization
Controllable Generation via Locally Constrained Resampling
Longhorn: State Space Models are Amortized Online Learners
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Power
ParFam -- (Neural Guided) Symbolic Regression via Continuous Global Optimization
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Causal Identification for Complex Functional Longitudinal Studies
Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures
ReMatching Dynamic Reconstruction Flow
VideoPhy: Evaluating Physical Commonsense for Video Generation
An Efficient Framework for Crediting Data Contributors of Diffusion Models
Interpreting Emergent Planning in Model-Free Reinforcement Learning
Inverse Constitutional AI: Compressing Preferences into Principles
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
When do GFlowNets learn the right distribution?
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget
OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation
Provence: efficient and robust context pruning for retrieval-augmented generation
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
Transformer Meets Twicing: Harnessing Unattended Residual Information
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Iterative Dual-RL: An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Towards Optimal Multi-draft Speculative Decoding
MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling
Controlled LLM Decoding via Discrete Auto-regressive Biasing
Controllable Unlearning for Image-to-Image Generative Models via $\epsilon$-Constrained Optimization
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing
Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models
Multi-Reward as Condition for Instruction-based Image Editing
Conditional Diffusion with Ordinal Regression: Longitudinal Data Generation for Neurodegenerative Disease Studies
Towards Marginal Fairness Sliced Wasserstein Barycenter
The Complexity of Two-Team Polymatrix Games with Independent Adversaries
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models
Mixture of Parrots: Experts improve memorization more than reasoning
OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents
Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
Tight Clusters Make Specialized Experts
Vision and Language Synergy for Rehearsal Free Continual Learning
Offline Hierarchical Reinforcement Learning via Inverse Optimization
WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning
Sparse Autoencoders Do Not Find Canonical Units of Analysis
Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning
Can Large Language Models Understand Symbolic Graphics Programs?
Beyond Sequence: Impact of Geometric Context for RNA Property Prediction
Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference
LocoVR: Multiuser Indoor Locomotion Dataset in Virtual Reality
Random Is All You Need: Random Noise Injection on Feature Statistics for Generalizable Deep Image Denoising
CAMEx: Curvature-aware Merging of Experts
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights
Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
The Crucial Role of Samplers in Online Direct Preference Optimization
Spherical Tree-Sliced Wasserstein Distance
MetaOOD: Automatic Selection of OOD Detection Models
Doubly robust identification of treatment effects from multiple environments
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Open-CK: A Large Multi-Physics Fields Coupling benchmarks in Combustion Kinetics
Efficient Inference for Large Language Model-based Generative Recommendation
Distance-Based Tree-Sliced Wasserstein Distance
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
Consistent Flow Distillation for Text-to-3D Generation
Grounding Multimodal Large Language Model in GUI World
Human-inspired Episodic Memory for Infinite Context LLMs
Implicit Search via Discrete Diffusion: A Study on Chess
Atomas: Hierarchical Adaptive Alignment on Molecule-Text for Unified Molecule Understanding and Generation
InstantPortrait: One-Step Portrait Editing via Diffusion Multi-Objective Distillation
XAIguiFormer: explainable artificial intelligence guided transformer for brain disorder identification
The Illustrated AlphaFold
Decentralized Optimization with Coupled Constraints
On Designing General and Expressive Quantum Graph Neural Networks with Applications to MILP Instance Representation
MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Construction
Multi-modal brain encoding models for multi-modal stimuli
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Learning Efficient Positional Encodings with Graph Neural Networks
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
Repetition Improves Language Model Embeddings
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Divergence-Regularized Discounted Aggregation: Equilibrium Finding in Multiplayer Partially Observable Stochastic Games
Noise Separation guided Candidate Label Reconstruction for Noisy Partial Label Learning
Sequential Stochastic Combinatorial Optimization Using Hierarchal Reinforcement Learning
Identifying latent state transitions in non-linear dynamical systems
A Theoretical Analysis of Self-Supervised Learning for Vision Transformers
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Continuous Exposure Learning for Low-light Image Enhancement using Neural ODEs
Toward Generalizing Visual Brain Decoding to Unseen Subjects
Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents
Fast Summation of Radial Kernels via QMC Slicing
Manifold Learning by Mixture Models of VAEs for Inverse Problems
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
Rapid Selection and Ordering of In-Context Demonstrations via Prompt Embedding Clustering
Inverse decision-making using neural amortized Bayesian actors
Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory
Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians
FLIP: Flow-Centric Generative Planning as General-Purpose Manipulation World Model
Understanding Constraint Inference in Safety-Critical Inverse Reinforcement Learning
TopoGaussian: Inferring Internal Topology Structures from Visual Clues
From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation
Rethinking the role of frames for SE(3)-invariant crystal structure modeling
Improving Language Model Distillation through Hidden State Matching
SleepSMC: Ubiquitous Sleep Staging via Supervised Multimodal Coordination
Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization
PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models
OPTAMI: Global Superlinear Convergence of High-order Methods
Boosting the visual interpretability of CLIP via adversarial fine-tuning
MuHBoost: Multi-Label Boosting For Practical Longitudinal Human Behavior Modeling
Conformal Structured Prediction
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Lightweight Neural App Control
Generating CAD Code with Vision-Language Models for 3D Designs
Towards Bridging Generalization and Expressivity of Graph Neural Networks
Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
qNBO: quasi-Newton Meets Bilevel Optimization
LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
Efficient and Trustworthy Causal Discovery with Latent Variables and Complex Relations
Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification
Learning Molecular Representation in a Cell
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Investigating Pattern Neurons in Urban Time Series Forecasting
On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations
Autoregressive Pretraining with Mamba in Vision
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint
Fine-tuning with Reserved Majority for Noise Reduction
Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
Non-Equilibrium Dynamics of Hybrid Continuous-Discrete Ground-State Sampling
Backtracking Improves Generation Safety
Do Mice Grok? Glimpses of Hidden Progress in Sensory Cortex
RGB-Event ISP: The Dataset and Benchmark
Scaling Laws for Precision
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
Hessian-Free Online Certified Unlearning
The KoLMogorov Test: Compression by Code Generation
TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident
Real2Code: Reconstruct Articulated Objects via Code Generation
ELICIT: LLM Augmentation Via External In-context Capability
Vertical Federated Learning with Missing Features During Training and Inference
DRL: Decomposed Representation Learning for Tabular Anomaly Detection
DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Towards Faster Decentralized Stochastic Optimization with Communication Compression
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents
Training Language Models to Self-Correct via Reinforcement Learning
Implicit In-context Learning
High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs
Unified Parameter-Efficient Unlearning for LLMs
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
A Tight Convergence Analysis of Inexact Stochastic Proximal Point Algorithm for Stochastic Composite Optimization Problems
SGD with memory: fundamental properties and stochastic acceleration
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Understanding Methods for Scalable MCTS
QA-Calibration of Language Model Confidence Scores
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
Residual Kernel Policy Network: Enhancing Stability and Robustness in RKHS-Based Reinforcement Learning
Fast Feedforward 3D Gaussian Splatting Compression
Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders
Faster Algorithms for Structured Linear and Kernel Support Vector Machines
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
Towards Neural Scaling Laws for Time Series Foundation Models
Sensitivity-Constrained Fourier Neural Operators for Forward and Inverse Problems in Parametric Differential Equations
Dynamic Low-Rank Sparse Adaptation for Large Language Models
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups
GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Reinforcement learning with combinatorial actions for coupled restless bandits
M$^3$PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
Decoupling Layout from Glyph in Online Chinese Handwriting Generation
EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
Enhancing Document Understanding with Group Position Embedding: A Novel Approach to Incorporate Layout Information
Weak to Strong Generalization for Large Language Models with Multi-capabilities
Continual Slow-and-Fast Adaptation of Latent Neural Dynamics (CoSFan): Meta-Learning What-How & When to Adapt
DEPfold: RNA Secondary Structure Prediction as Dependency Parsing.
Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection
Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data
Student-Informed Teacher Training
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Evaluating Large Language Models through Role-Guide and Self-Reflection: A Comparative Study
Black-Box Detection of Language Model Watermarks
Large Convolutional Model Tuning via Filter Subspace
BadRobot: Jailbreaking Embodied LLMs in the Physical World
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
Controllable Context Sensitivity and the Knob Behind It
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
DynaPrompt: Dynamic Test-Time Prompt Tuning
A Graph Enhanced Symbolic Discovery Framework For Efficient Logic Optimization
Scalable Decentralized Learning with Teleportation
ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts
Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
Improved Approximation Algorithms for $k$-Submodular Maximization via Multilinear Extension
RecFlow: An Industrial Full Flow Recommendation Dataset
Pyramidal Flow Matching for Efficient Video Generative Modeling
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Reveal Object in Lensless Photography via Region Gaze and Amplification
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior
TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
Taming Transformer Without Using Learning Rate Warmup
Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Uncovering Overfitting in Large Language Model Editing
Revealing and Mitigating Over-Attention in Knowledge Editing
Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching
Agent-Oriented Planning in Multi-Agent Systems
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
AutoUAD: Hyper-parameter Optimization for Unsupervised Anomaly Detection
On Large Language Model Continual Unlearning
Mastering Task Arithmetic: $\tau$Jp as a Key Indicator for Weight Disentanglement
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection
How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning
INFER: A Neural-symbolic Model For Extrapolation Reasoning on Temporal Knowledge Graph
MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval
MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
The Ramanujan Library - Automated Discovery on the Hypergraph of Integer Relations
Simplifying Deep Temporal Difference Learning
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Bisimulation Metric for Model Predictive Control
Regulatory DNA Sequence Design with Reinforcement Learning
Differentially private optimization for non-decomposable objective functions
DataGen: Unified Synthetic Dataset Generation via Large Language Models
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
Latent Bayesian Optimization via Autoregressive Normalizing Flows
MAESTRO: Masked Encoding Set Transformer with Self-Distillation
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
When does compositional structure yield compositional generalization? A kernel theory.
Differentiable Integer Linear Programming
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Discrete Copula Diffusion
GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision
Adaptive Retention & Correction: Test-Time Training for Continual Learning
In Search of Forgotten Domain Generalization
GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
EmbedLLM: Learning Compact Representations of Large Language Models
Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets
Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images
DELIFT: Data Efficient Language model Instruction Fine-Tuning
Graph-based Document Structure Analysis
UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP
ELBOing Stein: Variational Bayes with Stein Mixture Inference
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Global Well-posedness and Convergence Analysis of Score-based Generative Models via Sharp Lipschitz Estimates
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Discrete Distribution Networks
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
InstaRevive: One-Step Image Enhancement via Dynamic Score Matching
The Value of Sensory Information to a Robot
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation
Representative Guidance: Diffusion Model Sampling with Coherence
Grounding Video Models to Actions through Goal Conditioned Exploration
Safety Layers in Aligned Large Language Models: The Key to LLM Security
Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Online Preference Alignment for Language Models via Count-based Exploration
Rational Decision-Making Agent with Learning Internal Utility Judgment
MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
AutoCGP: Closed-Loop Concept-Guided Policies from Unlabeled Demonstrations
TFG-Flow: Training-free Guidance in Multimodal Generative Flow
Skill Expansion and Composition in Parameter Space
Distribution-Specific Agnostic Conditional Classification With Halfspaces
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
Interpreting the Second-Order Effects of Neurons in CLIP
Optimizing $(L_0, L_1)$-Smooth Functions by Gradient Methods
Weakly Supervised Video Scene Graph Generation via Natural Language Supervision
Mutual Effort for Efficiency: A Similarity-based Token Pruning for Vision Transformers in Self-Supervised Learning
Discrete Latent Plans via Semantic Skill Abstractions
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models
Unhackable Temporal Reward for Scalable Video MLLMs
Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
Anti-Exposure Bias in Diffusion Models
Data Pruning by Information Maximization
Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
Round and Round We Go! What makes Rotary Positional Encodings useful?
Remove Symmetries to Control Model Expressivity and Improve Optimization
CR2PQ: Continuous Relative Rotary Positional Query for Dense Visual Representation Learning
Atlas Gaussians Diffusion for 3D Generation
Data Shapley in One Training Run
CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark
HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting
ADAPT: Attentive Self-Distillation and Dual-Decoder Prediction Fusion for Continual Panoptic Segmentation
Systems with Switching Causal Relations: A Meta-Causal Perspective
Towards Improving Exploration through Sibling Augmented GFlowNets
Bayesian Regularization of Latent Representation
Flow: Modularized Agentic Workflow Automation
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
World Model on Million-Length Video And Language With Blockwise RingAttention
Efficient Top-m Data Values Identification for Data Selection
Learning Spatial-Semantic Features for Robust Video Object Segmentation
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Learning LLM-as-a-Judge for Preference Alignment
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
A Closer Look at Machine Unlearning for Large Language Models
RandLoRA: Full rank parameter-efficient fine-tuning of large models
Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
Scaling up Masked Diffusion Models on Text
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
Bootstrapping Language Models with DPO Implicit Rewards
Let Your Features Tell The Differences: Understanding Graph Convolution By Feature Splitting
Dreamweaver: Learning Compositional World Models from Pixels
HGM³: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
Do LLMs estimate uncertainty well in instruction-following?
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Neural Fluid Simulation on Geometric Surfaces
Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs
Searching for Optimal Solutions with LLMs via Bayesian Optimization
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering
Influence-Guided Diffusion for Dataset Distillation
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding & Reasoning Capabilities of CodeLLMs
OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs
Oracle efficient truncated statistics
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
Second-Order Min-Max Optimization with Lazy Hessians
Tell me about yourself: LLMs are aware of their learned behaviors
Balanced Ranking with Relative Centrality: A multi-core periphery perspective
IV-mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
Improving Reasoning Performance in Large Language Models via Representation Engineering
LOIRE: LifelOng learning on Incremental data via pre-trained language model gRowth Efficiently
A Simple yet Effective $\Delta\Delta G$ Predictor is An Unsupervised Antibody Optimizer and Explainer
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
RetroInText: A Multimodal Large Language Model Enhanced Framework for Retrosynthetic Planning via In-Context Representation Learning
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing
Generalizing Reasoning Problems to Longer Lengths
FreSh: Frequency Shifting for Accelerated Neural Representation Learning
Differentiable Rule Induction from Raw Sequence Inputs
How Does Critical Batch Size Scale in Pre-training?
On the expressiveness and spectral bias of KANs
Scaling Wearable Foundation Models
Disentangling Representations through Multi-task Learning
Accelerated training through iterative gradient propagation along the residual path
Matrix Product Sketching via Coordinated Sampling
Quantitative Approximation for Neural Operators in Nonlinear Parabolic Equations
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding
From Decoupling to Adaptive Transformation: a Wider Optimization Space for PTQ
Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach
Personalized Representation from Personalized Generation
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Structural-Entropy-Based Sample Selection for Efficient and Effective Learning
Specialized Foundation Models Struggle to Beat Supervised Baselines
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
FormalAlign: Automated Alignment Evaluation for Autoformalization
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
Storybooth: Training-Free Multi-Subject Consistency for Improved Visual Storytelling
Tuning Frequency Bias of State Space Models
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Can We Talk Models Into Seeing the World Differently?
Self-Attention-Based Contextual Modulation Improves Neural System Identification
CONDA: Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?
Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
Is Large-scale Pretraining the Secret to Good Domain Generalization?
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
Benchmarking Agentic Workflow Generation
MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation
Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies
Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later
Difference-of-submodular Bregman Divergence
LLMs Can Plan Only If We Tell Them
Towards Empowerment Gain through Causal Structure Learning in Model-Based Reinforcement Learning
Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Constraint-Conditioned Actor-Critic for Offline Safe Reinforcement Learning
Inverse Rendering using Multi-Bounce Path Tracing and Reservoir Sampling
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data
Stochastic Bandits Robust to Adversarial Attacks
miniCTX: Neural Theorem Proving with (Long-)Contexts
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
TRACE: Temporal Grounding Video LLM via Causal Event Modeling
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
Computing Circuits Optimization via Model-Based Circuit Genetic Evolution
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
Distilling Structural Representations into Protein Sequence Models
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
Learning Distributions of Complex Fluid Simulations with Diffusion Graph Networks
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
Benchmarking LLMs' Judgments with No Gold Standard
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Bayesian Experimental Design Via Contrastive Diffusions
DPLM-2: A Multimodal Diffusion Protein Language Model
Language Imbalance Driven Rewarding for Multilingual Self-improving
Accelerating Neural ODEs: A Variational Formulation-based Approach
A Non-Contrastive Learning Framework for Sequential Recommendation with Preference-Preserving Profile Generation
Improving Large Language Model Planning with Action Sequence Similarity
Accelerating Task Generalisation with Multi-Level Skill Hierarchies
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
A Differentiable Rank-Based Objective for Better Feature Learning
Rethinking Multiple-Instance Learning From Feature Space to Probability Space
Enhancing Prediction Performance through Influence Measure
Shallow diffusion networks provably learn hidden low-dimensional structure
Detecting Backdoor Samples in Contrastive Language Image Pretraining
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
Diffusion Feedback Helps CLIP See Better
MiniPLM: Knowledge Distillation for Pre-training Language Models
Selective Unlearning via Representation Erasure Using Domain Adversarial Training
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning
Projection Head is Secretly an Information Bottleneck
Self-Evolving Multi-Agent Collaboration Networks for Software Development
Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning
Learning and aligning single-neuron invariance manifolds in visual cortex
Do as We Do, Not as You Think: the Conformity of Large Language Models
Towards Learning High-Precision Least Squares Algorithms with Sequence Models
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
QERA: an Analytical Framework for Quantization Error Reconstruction
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Accessing Vision Foundation Models via ImageNet-1K
For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Aioli: A Unified Optimization Framework for Language Model Data Mixing
6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering
Can LLMs Understand Time Series Anomalies?
LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models
Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount
LiveBench: A Challenging, Contamination-Limited LLM Benchmark
Discovering Influential Neuron Path in Vision Transformers
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
Temporal Reasoning Transfer from Text to Video
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
Neuron based Personality Trait Induction in Large Language Models
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
An Evolved Universal Transformer Memory
Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing
Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models
Efficient and Accurate Explanation Estimation with Distribution Compression
Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
Learning Chaos In A Linear Way
Rethinking Reward Modeling in Preference-based Large Language Model Alignment
CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design
Chemistry-Inspired Diffusion with Non-Differentiable Guidance
Generating Likely Counterfactuals Using Sum-Product Networks
Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions
What's the Move? Hybrid Imitation Learning via Salient Points
Towards Out-of-Modal Generalization without Instance-level Modal Correspondence
Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
Laplace Sample Information: Data Informativeness Through a Bayesian Lens
CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation
Trusted Multi-View Classification via Evolutionary Multi-View Fusion
Systematic Relational Reasoning With Epistemic Graph Neural Networks
LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models
Chain-of-region: Visual Language Models Need Details for Diagram Analysis
Learned Reference-based Diffusion Sampler for multi-modal distributions
CoMotion: Concurrent Multi-person 3D Motion
Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry
OGBench: Benchmarking Offline Goal-Conditioned RL
TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
Explanations of GNN on Evolving Graphs via Axiomatic Layer edges
URLOST: Unsupervised Representation Learning without Stationarity or Topology
Boltzmann priors for Implicit Transfer Operators
On Scaling Up 3D Gaussian Splatting Training
Bayesian WeakS-to-Strong from Text Classification to Generation
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
When GNNs meet symmetry in ILPs: an orbit-based feature augmentation approach
Generative Flows on Synthetic Pathway for Drug Design
Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds
Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
Commit0: Library Generation from Scratch
FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models
TVNet: A Novel Time Series Analysis Method Based on Dynamic Convolution and 3D-Variation
Variance-Reducing Couplings for Random Features
Looking into User’s Long-term Interests through the Lens of Conservative Evidential Learning
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Neuron-based Multifractal Analysis of Neuron Interaction Dynamics in Large Models
QP-SNN: Quantized and Pruned Spiking Neural Networks
Visual Agents as Fast and Slow Thinkers
HelpSteer2-Preference: Complementing Ratings with Preferences
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
Centrality-guided Pre-training for Graph
Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs
Discovering Clone Negatives via Adaptive Contrastive Learning for Image-Text Matching
Active Learning for Continual Learning: Keeping the Past Alive in the Present
On the Adversarial Vulnerability of Label-Free Test-Time Adaptation
Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
Preble: Efficient Distributed Prompt Scheduling for LLM Serving
CipherPrune: Efficient and Scalable Private Transformer Inference
Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models
Learning to Help in Multi-Class Settings
An Engorgio Prompt Makes Large Language Model Babble on
Tracking objects that change in appearance with phase synchrony
AdaWM: Adaptive World Model based Planning for Autonomous Driving
Long-Short Decision Transformer: Bridging Global and Local Dependencies for Generalized Decision-Making
RESuM: A Rare Event Surrogate Model for Physics Detector Design
It Helps to Take a Second Opinion: Teaching Smaller LLMs To Deliberate Mutually via Selective Rationale Optimisation
Plastic Learning with Deep Fourier Features
Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data
Towards Understanding the Universality of Transformers for Next-Token Prediction
Zero-shot Imputation with Foundation Inference Models for Dynamical Systems
From Promise to Practice: Realizing High-performance Decentralized Training
Discrete Codebook World Models for Continuous Control
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression
Many-Objective Multi-Solution Transport
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
Differentially Private Steering for Large Language Model Alignment
Probabilistic Conformal Prediction with Approximate Conditional Validity
Learning from negative feedback, or positive feedback or both
Rethinking Graph Neural Networks From A Geometric Perspective Of Node Features
Formation of Representations in Neural Networks
A Watermark for Order-Agnostic Language Models
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Bridging the Data Provenance Gap Across Text, Speech, and Video
Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation
Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control
Understanding Virtual Nodes: Oversquashing and Node Heterogeneity
Robust Root Cause Diagnosis using In-Distribution Interventions
One for all and all for one: Efficient computation of partial Wasserstein distances on the line
Provably Safeguarding a Classifier from OOD and Adversarial Samples
Learning local equivariant representations for quantum operators
Circuit Transformer: A Transformer That Preserves Logical Equivalence
Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding
Neural Wave Equation for Irregularly Sampled Sequence Data
Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization
Diffusion State-Guided Projected Gradient for Inverse Problems
AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data
Learning Long Range Dependencies on Graphs via Random Walks
Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods
Meta-Continual Learning of Neural Fields
A Sanity Check for AI-generated Image Detection
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
EVA: Geometric Inverse Design for Fast Protein Motif-Scaffolding with Coupled Flow
Instant Policy: In-Context Imitation Learning via Graph Diffusion
BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation
Efficient Jailbreak Attack Sequences on Large Language Models via Multi-Armed Bandit-Based Context Switching
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Inverse Attention Agents for Multi-Agent Systems
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking
Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
Robotouille: An Asynchronous Planning Benchmark for LLM Agents
ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy
Uncertainty modeling for fine-tuned implicit functions
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
A Theoretical Framework for Partially-Observed Reward States in RLHF
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Unlocking the Potential of Model Calibration in Federated Learning
Endowing Visual Reprogramming with Adversarial Robustness
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL
No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models
AutoG: Towards automatic graph construction from tabular data
Understanding Long Videos with Multimodal Language Models
Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction
Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy
Immunogenicity Prediction with Dual Attention Enables Vaccine Target Selection
Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models
Uncertainty-Aware Decoding with Minimum Bayes Risk
Instance-dependent Early Stopping
GaussianAnything: Interactive Point Cloud Flow Matching for 3D Generation
ACES: Automatic Cohort Extraction System for Event-Stream Datasets
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Precise Parameter Localization for Textual Generation in Diffusion Models
Diffusion Models Are Real-Time Game Engines
Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure?
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Efficient Active Imitation Learning with Random Network Distillation
The Foundations of Tokenization: Statistical and Computational Concerns
Unlearning-based Neural Interpretations
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs
Safety Representations for Safer Policy Learning
Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
Generalization and Distributed Learning of GFlowNets
Composing Unbalanced Flows for Flexible Docking and Relaxation
Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models
Deep Distributed Optimization for Large-Scale Quadratic Programming
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios
Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
New Algorithms for the Learning-Augmented k-means Problem
Connectome Mapping: Shape-Memory Network via Interpretation of Contextual Semantic Information
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Universal Image Restoration Pre-training via Degradation Classification
InfoGS: Efficient Structure-Aware 3D Gaussians via Lightweight Information Shaping
Towards Continuous Reuse of Graph Models via Holistic Memory Diversification
Safety-Prioritizing Curricula for Constrained Reinforcement Learning
What to align in multimodal contrastive learning?
Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Filtered not Mixed: Filtering-Based Online Gating for Mixture of Large Language Models
Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
SegLLM: Multi-round Reasoning Segmentation with Large Language Models
Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Provable unlearning in topic modeling and downstream tasks
Training Free Exponential Context Extension via Cascading KV Cache
Sequential Controlled Langevin Diffusions
Do Contemporary Causal Inference Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark
Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors
MallowsPO: Fine-Tune Your LLM with Preference Dispersions
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
A Robust Method to Discover Causal or Anticausal Relation
Hybrid Regularization Improves Diffusion-based Inverse Problem Solving
Underdamped Diffusion Bridges with Applications to Sampling
CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
GSE: Group-wise Sparse and Explainable Adversarial Attacks
Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions
The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
ProteinBench: A Holistic Evaluation of Protein Foundation Models
Model Equality Testing: Which Model is this API Serving?
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Model-agnostic meta-learners for estimating heterogeneous treatment effects over time
Scale-aware Recognition in Satellite Images under Resource Constraints
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Adversarial Latent Feature Augmentation for Fairness
REvolve: Reward Evolution with Large Language Models using Human Feedback
Aligning Human Motion Generation with Human Perceptions
PEARL: Parallel Speculative Decoding with Adaptive Draft Length
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
A New Perspective on Shampoo's Preconditioner
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
No Preference Left Behind: Group Distributional Preference Optimization
Prototype antithesis for biological few-shot class-incremental learning
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning
Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations
Risk-Sensitive Diffusion: Robustly Optimizing Diffusion Models with Noisy Samples
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
A Geometric Framework for Understanding Memorization in Generative Models
Multi-Scale Fusion for Object Representation
Recovering Manifold Structure Using Ollivier Ricci Curvature
TopoLM: brain-like spatio-functional organization in a topographic language model
Towards General-Purpose Model-Free Reinforcement Learning
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Mitigate the Gap: Improving Cross-Modal Alignment in CLIP
Toward Understanding In-context vs. In-weight Learning
Error-quantified Conformal Inference for Time Series
Enhancing Language Model Agents using Diversity of Thoughts
VLMaterial: Procedural Material Generation with Large Vision-Language Models
A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models
Easing Training Process of Rectified Flow Models Via Lengthening Inter-Path Distance
Shape as Line Segments: Accurate and Flexible Implicit Surface Representation
Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits
Diffusion Models as Cartoonists: The Curious Case of High Density Regions
Concept Bottleneck Language Models For Protein Design
Sensor-Invariant Tactile Representation
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Multilevel Generative Samplers for Investigating Critical Phenomena
Reframing Structure-Based Drug Design Model Evaluation via Metrics Correlated to Practical Needs
GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning
Gaussian-Based Instance-Adaptive Intensity Modeling for Point-Supervised Facial Expression Spotting
ODE-based Smoothing Neural Network for Reinforcement Learning Tasks
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Redefining the task of Bioactivity Prediction
GROOT-2: Weakly Supervised Multimodal Instruction Following Agents
ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials
HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
How Far Are We from True Unlearnability?
NExUME: Adaptive Training and Inference for DNNs under Intermittent Power Environments
Geometry of Lightning Self-Attention: Identifiability and Dimension
Attributing Culture-Conditioned Generations to Pretraining Corpora
Long-time asymptotics of noisy SVGD outside the population limit
Spreading Out-of-Distribution Detection on Graphs
High-Dimensional Bayesian Optimisation with Gaussian Process Prior Variational Autoencoders
Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Meta-Dynamical State Space Models for Integrative Neural Data Analysis
Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
Scaling Optimal LR Across Token Horizons
Subgraph Federated Learning for Local Generalization
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
REBIND: Enhancing Ground-state Molecular Conformation Prediction via Force-Based Graph Rewiring
Locality Sensitive Avatars From Video
Does Spatial Cognition Emerge in Frontier Models?
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains
A Decade's Battle on Dataset Bias: Are We There Yet?
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Learning to Solve Differential Equation Constrained Optimization Problems
GameArena: Evaluating LLM Reasoning through Live Computer Games
REEF: Representation Encoding Fingerprints for Large Language Models
Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models
Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
Privacy-Aware Lifelong Learning
Mitigating Spurious Correlations in Zero-Shot Multimodal Models
LeanAgent: Lifelong Learning for Formal Theorem Proving
InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
Adversarial Training for Defense Against Label Poisoning Attacks
Learning Harmonized Representations for Speculative Sampling
Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving
Self-Supervised Diffusion Models for Electron-Aware Molecular Representation Learning
Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation
MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
Counterfactual Realizability
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Nonlinear Sequence Embedding by Monotone Variational Inequality
A Formal Framework for Understanding Length Generalization in Transformers
Surprising Effectiveness of pretraining Ternary Language Model at Scale
FIRING-Net: A filtered feature recycling network for speech enhancement
Generalization through variance: how noise shapes inductive biases in diffusion models
Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
ClawMachine: Learning to Fetch Visual Tokens for Referential Comprehension
Grokking at the Edge of Numerical Stability
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
Solving Video Inverse Problems Using Image Diffusion Models
Gumbel Counterfactual Generation From Language Models
Task Descriptors Help Transformers Learn Linear Models In-Context
Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Logical Consistency of Large Language Models in Fact-Checking
HShare: Fast LLM Decoding by Hierarchical Key-Value Sharing
P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks
Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
On the Relation between Trainability and Dequantization of Variational Quantum Learning Models
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
The Belief State Transformer
UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate
Spurious Forgetting in Continual Learning of Language Models
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
UTILITY: Utilizing Explainable Reinforcement Learning to Improve Reinforcement Learning
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Language Models Are Implicitly Continuous
MLPs Learn In-Context on Regression and Classification Tasks
Law of the Weakest Link: Cross Capabilities of Large Language Models
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling
T2V2: A Unified Non-Autoregressive Model for Speech Recognition and Synthesis via Multitask Learning
SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks
A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding
Neural Interactive Proofs
A Differentiable Metric for Discovering Groups and Unitary Representations
Model-Free Offline Reinforcement Learning with Enhanced Robustness
On Disentangled Training for Nonlinear Transform in Learned Image Compression
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
Multi-agent cooperation through learning-aware policy gradients
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Continuous Diffusion for Mixed-Type Tabular Data
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Exploring the Design Space of Visual Context Representation in Video MLLMs
Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks
Advantage Alignment Algorithms
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
PEAR: Primitive Enabled Adaptive Relabeling for Boosting Hierarchical Reinforcement Learning
Efficient Causal Decision Making with One-sided Feedback
Learning Hierarchical Polynomials of Multiple Nonlinear Features
A Transfer Attack to Image Watermarks
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence
Uncertainty Herding: One Active Learning Method for All Label Budgets
Matryoshka Multimodal Models
Combining Induction and Transduction for Abstract Reasoning
VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation
Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation
MorphoDiff: Cellular Morphology Painting with Diffusion Models
Accurate and Scalable Graph Neural Networks via Message Invariance
Teaching LLMs How to Learn with Contextual Fine-Tuning
Protein Language Model Fitness is a Matter of Preference
Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics
In Search of the Engram in LLMs: A Neuroscience Perspective on the Memory Functions in AI Models
Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
SoftCVI: Contrastive variational inference with self-generated soft labels
Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation
Execution-guided within-prompt search for programming-by-example
RA-TTA: Retrieval-Augmented Test-Time Adaptation for Vision-Language Models
Attention as a Hypernetwork
Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
Gaussian Ensemble Belief Propagation for Efficient Inference in High-Dimensional, Black-box Systems
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
Select before Act: Spatially Decoupled Action Repetition for Continuous Control
Divergence of Neural Tangent Kernel in Classification Problems
Structure Language Models for Protein Conformation Generation
Matérn Kernels for Tunable Implicit Surface Reconstruction
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
Designing Mechanical Meta-Materials by Learning Equivariant Flows
Convergence of Distributed Adaptive Optimization with Local Updates
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
Learning vector fields of differential equations on manifolds with geometrically constrained operator-valued kernels
CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search
An Online Learning Theory of Trading-Volume Maximization
Learning High-Degree Parities: The Crucial Role of the Initialization
Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
Reasoning Elicitation in Language Models via Counterfactual Feedback
Latent Action Pretraining from Videos
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Self-supervised contrastive learning performs non-linear system identification
Fengbo: a Clifford Neural Operator pipeline for 3D PDEs in Computational Fluid Dynamics
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback
SparsyFed: Sparse Adaptive Federated Learning
Decision Information Meets Large Language Models: The Future of Explainable Operations Research
TRENDy: Temporal Regression of Effective Nonlinear Dynamics
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers
Variational Best-of-N Alignment
Aligned LLMs Are Not Aligned Browser Agents
Adaptive Pruning of Pretrained Transformer via Differential Inclusions
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Training Robust Ensembles Requires Rethinking Lipschitz Continuity
Perm: A Parametric Representation for Multi-Style 3D Hair Modeling
ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
Re-Evaluating the Impact of Unseen-Class Unlabeled Data on Semi-Supervised Learning Model
Generalization in VAE and Diffusion Models: A Unified Information-Theoretic Analysis
Weighted Multi-Prompt Learning with Description-free Large Language Model Distillation
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
Three Mechanisms of Feature Learning in a Linear Network
DyCAST: Learning Dynamic Causal Structure from Time Series
Online Clustering with Nearly Optimal Consistency
Building Math Agents with Multi-Turn Iterative Preference Learning
GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data
Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection
Small Models are LLM Knowledge Triggers for Medical Tabular Prediction
Contextual Document Embeddings
Continuity-Preserving Convolutional Autoencoders for Learning Continuous Latent Dynamical Models from Images
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
Efficient Perplexity Bound and Ratio Matching in Discrete Diffusion Language Models
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
Learning to Discover Regulatory Elements for Gene Expression Prediction
Controllable Blur Data Augmentation Using 3D-Aware Motion Estimation
GANDALF: Generative AttentioN based Data Augmentation and predictive modeLing Framework for personalized cancer treatment
Learning-Augmented Frequent Directions
Topological Schrödinger Bridge Matching
Moral Alignment for LLM Agents
Reassessing How to Compare and Improve the Calibration of Machine Learning Models
F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
Motion Control of High-Dimensional Musculoskeletal Systems with Hierarchical Model-Based Planning
HELM: Hierarchical Encoding for mRNA Language Modeling
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Mitigating Memorization in Language Models
Zero-Shot Natural Language Explanations
SPDIM: Source-Free Unsupervised Conditional and Label Shift Adaptation in EEG
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off
Following the Human Thread in Social Navigation
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
Residual Stream Analysis with Multi-Layer SAEs
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints
AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
ContraDiff: Planning Towards High Return States via Contrastive Learning
Framer: Interactive Frame Interpolation
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders
DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs
Contrastive Learning from Synthetic Audio Doppelgängers
Nonlinear multiregion neural dynamics with parametric impulse response communication channels
Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On
POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition
3D-Spatial Multimodal Memory
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction
Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
RobustKV: Defending Large Language Models against Jailbreak Attacks via KV Eviction
TSC-Net: Prediction of Pedestrian Trajectories by Trajectory-Scene-Cell Classification
Beyond Circuit Connections: A Non-Message Passing Graph Transformer Approach for Quantum Error Mitigation
On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Finding Shared Decodable Concepts and their Negations in the Brain
An Asynchronous Bundle Method for Distributed Learning Problems
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
Agent Skill Acquisition for Large Language Models via CycleQD
GraphArena: Evaluating and Exploring Large Language Models on Graph Computation
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
CtD: Composition through Decomposition in Emergent Communication
Locally Connected Echo State Networks for Time Series Forecasting
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
LaMPlace: Learning to Optimize Cross-Stage Metrics in Macro Placement
Progressive Parameter Efficient Transfer Learning for Semantic Segmentation
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
To Tackle Adversarial Transferability: A Novel Ensemble Training Method with Fourier Transformation
Calibrating LLMs with Information-Theoretic Evidential Deep Learning
SMITE: Segment Me In TimE
PABBO: Preferential Amortized Black-Box Optimization
Federated Granger Causality Learning For Interdependent Clients With State Space Representation
ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design
Toward Efficient Multi-Agent Exploration With Trajectory Entropy Maximization
Improving Semantic Understanding in Speech Language Models via Brain-tuning
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
Scaling Laws for Adversarial Attacks on Language Model Activations and Tokens
NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks
Glad: A Streaming Scene Generator for Autonomous Driving
Scaling Large Language Model-based Multi-Agent Collaboration
Gaussian Differentially Private Human Faces Under a Face Radial Curve Representation
Simulating Training Dynamics to Reconstruct Training Data from Deep Neural Networks
Unsupervised Meta-Learning via In-Context Learning
The Directionality of Optimization Trajectories in Neural Networks
TopoDiffusionNet: A Topology-aware Diffusion Model
Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate
Self-Improving Robust Preference Optimization
Radar: Fast Long-Context Decoding for Any Transformer
A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
Conformalized Survival Analysis for General Right-Censored Data
Truncated Consistency Models
Exploiting Hidden Symmetry to Improve Objective Perturbation for DP linear learners with a nonsmooth L1-norm
Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Learning Dynamics of Deep Matrix Factorization Beyond the Edge of Stability
Learning to Steer Markovian Agents under Model Uncertainty
Federated Few-Shot Class-Incremental Learning
What Makes a Maze Look Like a Maze?
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Conditional Testing based on Localized Conformal $p$-values
Revisiting text-to-image evaluation with Gecko: on metrics, prompts, and human rating
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition
Problem-Parameter-Free Federated Learning
Rethinking Spiking Neural Networks from an Ensemble Learning Perspective
One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning
Training Large Language Models for Retrieval-Augmented Question Answering through Backtracking Correction
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
Multiplicative Logit Adjustment Approximates Neural-Collapse-Aware Decision Boundary Adjustment
Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization
Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression
SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models
SOAP: Improving and Stabilizing Shampoo using Adam for Language Modeling
Streamlining Redundant Layers to Compress Large Language Models
Multimodal Situational Safety
TASAR: Transfer-based Attack on Skeletal Action Recognition
Wasserstein-Regularized Conformal Prediction under General Distribution Shift
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Does Refusal Training in LLMs Generalize to the Past Tense?
JPEG Inspired Deep Learning
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
Robust Feature Learning for Multi-Index Models in High Dimensions
Stabilized Neural Prediction of Potential Outcomes in Continuous Time
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Injecting Universal Jailbreak Backdoors into LLMs in Minutes
Learning Continually by Spectral Regularization
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Deep Random Features for Scalable Interpolation of Spatiotemporal Data
The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning
Training Neural Networks as Recognizers of Formal Languages
CBQ: Cross-Block Quantization for Large Language Models
Consistency Checks for Language Model Forecasters
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
QPM: Discrete Optimization for Globally Interpretable Image Classification
Targeted Attack Improves Protection against Unauthorized Diffusion Customization
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings
Learning Geometric Reasoning Networks For Robot Task And Motion Planning
Rethinking Neural Multi-Objective Combinatorial Optimization via Neat Weight Embedding
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Bridging the Semantic Gap Between Text and Table: A Case Study on NL2SQL
Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling
Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
Understanding and Enhancing the Transferability of Jailbreaking Attacks
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Inner Information Analysis Algorithm for Deep Neural Network based on Community
Node Similarities under Random Projections: Limits and Pathological Cases
Towards Scalable Topological Regularizers
Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Efficient Sparse PCA via Block-Diagonalization
Bridging the Gap between Variational Inference and Stochastic Gradient MCMC in Function Space
Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity
The Effectiveness of Curvature-Based Rewiring and the Role of Hyperparameters in GNNs Revisited
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Human-Aligned Chess With a Bit of Search
Rethinking Self-Distillation: Label Averaging and Enhanced Soft Label Refinement with Partial Labels
Neural Exploratory Landscape Analysis for Meta-Black-Box-Optimization
Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective
Cut Your Losses in Large-Vocabulary Language Models
KinPFN: Bayesian Approximation of RNA Folding Kinetics using Prior-Data Fitted Networks
Scaling FP8 training to trillion-token LLMs
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
Selective induction Heads: How Transformers Select Causal Structures in Context
SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization
Attention layers provably solve single-location regression
ImDy: Human Inverse Dynamics from Imitated Observations
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Towards a learning theory of representation alignment
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization
Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-training of Deep Networks
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Estimating the Probabilities of Rare Outputs in Language Models
Probabilistic Language-Image Pre-Training
Physics-aligned field reconstruction with diffusion bridge
Progressive Compression with Universally Quantized Diffusion Models
Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding
eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels
Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations
Fitting Networks with a Cancellation Trick
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
How to Evaluate Reward Models for RLHF
GeoILP: A Synthetic Dataset to Guide Large-Scale Rule Induction
On-the-fly Preference Alignment via Principle-Guided Decoding
Towards Interpreting Visual Information Processing in Vision-Language Models
LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision
Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy
Horizon Generalization in Reinforcement Learning
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees
Learning Clustering-based Prototypes for Compositional Zero-Shot Learning
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
Denoising with a Joint-Embedding Predictive Architecture
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
The Superposition of Diffusion Models Using the Itô Density Estimator
In-context Time Series Predictor
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Locality-aware Gaussian Compression for Fast and High-quality Rendering
MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures
CryoFM: A Flow-based Foundation Model for Cryo-EM Densities
Calibrating Expressions of Certainty
Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
Generalizable Human Gaussians from Single-View Image
Tool-Planner: Task Planning with Clusters across Multiple Tools
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
Quantum-PEFT: Ultra parameter-efficient fine-tuning
Open-World Reinforcement Learning over Long Short-Term Imagination
Data Selection via Optimal Control for Language Models
Improving Deep Regression with Tightness
Gaussian Splatting Lucas-Kanade
FACTS: A Factored State-Space Framework for World Modelling
Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
IDInit: A Universal and Stable Initialization Method for Neural Network Training
Aligned Datasets Improve Detection of Latent Diffusion-Generated Images
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Learning Diagrams: A Graphical Language for Compositional Training Regimes
DeepTAGE: Deep Temporal-Aligned Gradient Enhancement for Optimizing Spiking Neural Networks
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Ensembling Diffusion Models via Adaptive Feature Aggregation
PAD: Personalized Alignment at Decoding-time
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Look Before You Leap: Universal Emergent Mechanism for Retrieval in Language Models
Language Representations Can be What Recommenders Need: Findings and Potentials
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation
DataMan: Data Manager for Pre-training Large Language Models
Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives
GraphRouter: A Graph-based Router for LLM Selections
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Multi-Resolution Decomposable Diffusion Model for Non-Stationary Time Series Anomaly Detection
How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
Single Teacher, Multiple Perspectives: Teacher Knowledge Augmentation for Enhanced Knowledge Distillation
Enhancing Learning with Label Differential Privacy by Vector Approximation
Looking Inward: Language Models Can Learn About Themselves by Introspection
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
A Meta-Learning Approach to Bayesian Causal Discovery
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Diffusion Bridge Implicit Models
Disentangled Representation Learning with the Gromov-Monge Gap
One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
Influence Functions for Scalable Data Attribution in Diffusion Models
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo
Action Sequence Augmentation for Action Anticipation
Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection
Iterative Substructure Extraction for Molecular Relational Learning with Interactive Graph Information Bottleneck
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning
A Coefficient Makes SVRG Effective
Recovery of Causal Graph Involving Latent Variables via Homologous Surrogates
What is Wrong with Perplexity for Long-context Language Modeling?
LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.
Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
S4M: S4 for multivariate time series forecasting with Missing values
Optimality of Matrix Mechanism on $\ell_p^p$-metric
Learning Graph Invariance by Harnessing Spuriosity
$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
Scalable Bayesian Learning with posteriors
Efficient Imitation under Misspecification
Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference
UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models
DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale
Simple ReFlow: Improved Techniques for Fast Flow Models
Vision Language Models are In-Context Value Learners
FIG: Flow with Interpolant Guidance for Linear Inverse Problems
ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
A Unified Theory of Quantum Neural Network Loss Landscapes
CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening
ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
Multi-domain Distribution Learning for De Novo Drug Design
Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness
Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
Exploring channel distinguishability in local neighborhoods of the model space in quantum neural networks
CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes
Integrative Decoding: Improving Factuality via Implicit Self-consistency
Zero-shot Model-based Reinforcement Learning using Large Language Models
Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition
Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency
GraphBridge: Towards Arbitrary Transfer Learning in GNNs
ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models
A Conditional Independence Test in the Presence of Discretization
Rethinking Visual Counterfactual Explanations Through Region Constraint
Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models
Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games
Teaching Human Behavior Improves Content Understanding Abilities Of VLMs
GMValuator: Similarity-based Data Valuation for Generative Models
An Auditing Test to Detect Behavioral Shift in Language Models
Measuring And Improving Persuasiveness Of Large Language Models
Can Textual Gradient Work in Federated Learning?
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
BANGS: Game-theoretic Node Selection for Graph Self-Training
Second Order Bounds for Contextual Bandits with Function Approximation
Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
Sharpness-Aware Black-Box Optimization
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach
Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
Probabilistic Neural Pruning via Sparsity Evolutionary Fokker-Planck-Kolmogorov Equation
GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering
Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic
SAGEPhos: Sage Bio-Coupled and Augmented Fusion for Phosphorylation Site Detection
Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
Improving Generalization and Robustness in SNNs Through Signed Rate Encoding and Sparse Encoding Attacks
Privately Counting Partially Ordered Data
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Causal Graphical Models for Vision-Language Compositional Understanding
Semantix: An Energy-guided Sampler for Semantic Style Transfer
Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization
Causal Representation Learning from Multimodal Biomedical Observations
Adversarial Search Engine Optimization for Large Language Models
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
TDDBench: A Benchmark for Training data detection
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
Modeling Complex System Dynamics with Flow Matching Across Time and Conditions
Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization
SymDiff: Equivariant Diffusion via Stochastic Symmetrisation
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Lipschitz Bandits in Optimal Space
Interpreting Language Reward Models via Contrastive Explanations
Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency
Improving Graph Neural Networks by Learning Continuous Edge Directions
EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation
Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion
The Optimization Landscape of SGD Across the Feature Learning Strength
Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
Intricacies of Feature Geometry in Large Language Models
Selective Aggregation for Low-Rank Adaptation in Federated Learning
REMEDY: Recipe Merging Dynamics in Large Vision-Language Models
The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
An Illustrated Guide to Automatic Sparse Differentiation
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
DINOv2: Learning Robust Visual Features without Supervision
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Risk-Sensitive Variational Actor-Critic: A Model-Based Approach
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction
Noisy Test-Time Adaptation in Vision-Language Models
DiffPC: Diffusion-based High Perceptual Fidelity Image Compression with Semantic Refinement
Optimizing importance weighting in the presence of sub-population shifts
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
Attribute-based Visual Reprogramming for Vision-Language Models
Competitive Fair Scheduling with Predictions
AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
ComLoRA: A Competitive Learning Approach for Enhancing LoRA
MMTEB: Massive Multilingual Text Embedding Benchmark
On the Convergence of No-Regret Dynamics in Information Retrieval Games with Proportional Ranking Functions
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Training on the Test Task Confounds Evaluation and Emergence
HeadMap: Locating and Enhancing Knowledge Circuits in LLMs
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Extending Mercer's expansion to indefinite and asymmetric kernels
AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
The AdEMAMix Optimizer: Better, Faster, Older
Preference Optimization for Reasoning with Pseudo Feedback
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
Quantifying Generalization Complexity for Large Language Models
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
Reconstructive Visual Instruction Tuning
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Indirect Gradient Matching for Adversarial Robust Distillation
Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context
Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning
Feedback Schrödinger Bridge Matching
MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs
DeeperForward: Enhanced Forward-Forward Training for Deeper and Better Performance
Open-Set Graph Anomaly Detection via Normal Structure Regularisation
Node-Time Conditional Prompt Learning in Dynamic Graphs
Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness
RMB: Comprehensively benchmarking reward models in LLM alignment
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model
Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives
Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
Generator Matching: Generative modeling with arbitrary Markov processes
Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
HiBug2: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging
ADIFF: Explaining audio difference using natural language
MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
RFMamba: Frequency-Aware State Space Model for RF-Based Human-Centric Perception
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Why Does the Effective Context Length of LLMs Fall Short?
HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
Enhancing Robust Fairness via Confusional Spectral Regularization
Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models
Sparse components distinguish visual pathways & their alignment to neural networks
Safety Alignment Should be Made More Than Just a Few Tokens Deep
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Hidden in the Noise: Two-Stage Robust Watermarking for Images
Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
gRNAde: Geometric Deep Learning for 3D RNA inverse design
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
Harnessing Webpage UIs for Text-Rich Visual Understanding
Is Your Video Language Model a Reliable Judge?
Coreset Selection via Reducible Loss in Continual Learning
Diffusion Policy Policy Optimization
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
Data-centric Prediction Explanation via Kernelized Stein Discrepancy
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Last Iterate Convergence of Incremental Methods as a Model of Forgetting
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
Adaptive Length Image Tokenization via Recurrent Allocation
From Tokens to Lattices: Emergent Lattice Structures in Language Models
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
Poisson-Dirac Neural Networks for Modeling Coupled Dynamical Systems across Domains
Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation
ThermalGaussian: Thermal 3D Gaussian Splatting
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Subtask-Aware Visual Reward Learning from Segmented Demonstrations
Multi-Label Node Classification with Label Influence Propagation
From Layers to States: A State Space Model Perspective to Deep Neural Network Layer Dynamics
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging
Transformers Provably Solve Parity Efficiently with Chain of Thought
SoftMatcha: A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches
Linear combinations of latents in generative models: subspaces and beyond
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics
YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus
Scale-Free Graph-Language Models
Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective
Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
The Unreasonable Ineffectiveness of the Deeper Layers
Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning
Neural Causal Graph for Interpretable and Intervenable Classification
MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
An Effective Manifold-based Optimization Method for Distributionally Robust Classification
CAX: Cellular Automata Accelerated in JAX
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
Generative Classifiers Avoid Shortcut Solutions
Physics of Language Models: Part 3.2, Knowledge Manipulation
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content
Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation
FaceShot: Bring Any Character into Life
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Track-On: Transformer-based Online Point Tracking with Memory
Can Watermarks be Used to Detect LLM IP Infringement For Free?
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Learning Graph Quantized Tokenizers
Support is All You Need for Certified VAE Training
Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
On the Byzantine-Resilience of Distillation-Based Federated Learning
Large (Vision) Language Models are Unsupervised In-Context Learners
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Stable Segment Anything Model
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models
Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
Learning Causal Alignment for Reliable Disease Diagnosis
A Generalist Hanabi Agent
Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning
Data Scaling Laws in Imitation Learning for Robotic Manipulation
CONTRA: Conformal Prediction Region via Normalizing Flow Transformation
Control-oriented Clustering of Visual Latent Representation
ADMM for Nonconvex Optimization under Minimal Continuity Assumption
Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction
Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD
TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
SOREL: A Stochastic Algorithm for Spectral Risks Minimization
Adaptive Shrinkage Estimation for Personalized Deep Kernel Regression in Modeling Brain Trajectories
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics
UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping
On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models
Rethinking Invariance in In-context Learning
Retri3D: 3D Neural Graphics Representation Retrieval
$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks
Provably Accurate Shapley Value Estimation via Leverage Score Sampling
Homomorphism Counts as Structural Encodings for Graph Learning
Multi-Perspective Data Augmentation for Few-shot Object Detection
Do LLMs "know" internally when they follow instructions?
Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Generalized Video Moment Retrieval
FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Fields
Prioritized Generative Replay
ILLUSION: Unveiling Truth with a Comprehensive Multi-Modal, Multi-Lingual Deepfake Dataset
Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization
Locality Alignment Improves Vision-Language Models
DEEM: Diffusion models serve as the eyes of large language models for image perception
Balancing Bias in Two-sided Markets for Fair Stable Matchings
ADMM for Structured Fractional Minimization
Breaking the Reclustering Barrier in Centroid-based Deep Clustering
A Large-scale Training Paradigm for Graph Generative Models
Balancing Act: Diversity and Consistency in Large Language Model Ensembles
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
TULIP: Token-length Upgraded CLIP
Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
Metric-Driven Attributions for Vision Transformers
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences
SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
Systematic Outliers in Large Language Models
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration
PaLD: Detection of Text Partially Written by Large Language Models
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits
Trajectory-Class-Aware Multi-Agent Reinforcement Learning
High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers
UIFace: Unleashing Inherent Model Capabilities to Enhance Intra-Class Diversity in Synthetic Face Recognition
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Online-to-Offline RL for Agent Alignment
Robust Transfer of Safety-Constrained Reinforcement Learning Agents
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
MADGEN: Mass-Spec attends to De Novo Molecular generation
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
Robust LLM safeguarding via refusal feature adversarial training
DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints
Adaptive Energy Alignment for Accelerating Test-Time Adaptation
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
Your Weak LLM is Secretly a Strong Teacher for Alignment
Energy-Based Diffusion Language Models for Text Generation
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
Graph Neural Networks Can (Often) Count Substructures
Offline Model-Based Optimization by Learning to Rank
C-CLIP: Multimodal Continual Learning for Vision-Language Model
Bundle Neural Network for message diffusion on graphs
Collaborative Discrete-Continuous Black-Box Prompt Learning for Language Models
Jailbreaking as a Reward Misspecification Problem
PiCO: Peer Review in LLMs based on Consistency Optimization
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Neural Spacetimes for DAG Representation Learning
MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences
Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
Tractable Multi-Agent Reinforcement Learning through Behavioral Economics
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series
Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary
Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Sparse Learning for State Space Models on Mobile
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
Quantum (Inspired) $D^2$-sampling with Applications
Perturbation-Restrained Sequential Model Editing
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
ElasticTok: Adaptive Tokenization for Image and Video
Learning from Imperfect Human Feedback: A Tale from Corruption-Robust Dueling
Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings
Single-agent Poisoning Attacks Suffice to Ruin Multi-Agent Learning
Autoregressive Video Generation without Vector Quantization
Monte Carlo Planning with Large Language Model for Text-Based Game Agents
GLOMA: Global Video Text Spotting with Morphological Association
SC-OmniGS: Self-Calibrating Omnidirectional Gaussian Splatting
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Precedence-Constrained Winter Value for Effective Graph Data Valuation
Dynamic Diffusion Transformer
A Theory for Token-Level Harmonization in Retrieval-Augmented Generation
Scaling and evaluating sparse autoencoders
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Learning the Complexity of Weakly Noisy Quantum States
ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs
Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology
Cross-Embodiment Dexterous Grasping with Reinforcement Learning
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
An Effective Theory of Bias Amplification
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
Dimension Agnostic Neural Processes
Capturing the Temporal Dependence of Training Data Influence
SeRA: Self-Reviewing and Alignment of LLMs using Implicit Reward Margins
$\sigma$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
ReGen: Generative Robot Simulation via Inverse Design
Score-based Self-supervised MRI Denoising
E(3)-equivariant models cannot learn chirality: Field-based molecular generation
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
Learning Splitting Heuristics in Divide-and-Conquer SAT Solvers with Reinforcement Learning
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Data Distillation for extrapolative protein design through exact preference optimization
SMT: Fine-Tuning Large Language Models with Sparse Matrices
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
Discriminating image representations with principal distortions
Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
On the Optimization and Generalization of Multi-head Attention
The impact of allocation strategies in subset learning on the expressive power of neural networks
EG4D: Explicit Generation of 4D Object without Score Distillation
How Much is Unseen Depends Chiefly on Information About the Seen
Prompting Fairness: Integrating Causality to Debias Large Language Models
Towards Certification of Uncertainty Calibration under Adversarial Attacks
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints
Unbounded: A Generative Infinite Game of Character Life Simulation
Reconciling Model Multiplicity for Downstream Decision Making
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure
Does SGD really happen in tiny subspaces?
Multimodal Quantitative Language for Generative Recommendation
UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows
OccProphet: Pushing the Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with an Observer-Forecaster-Refiner Framework
Logic-Logit: A Logic-Based Approach to Choice Modeling
Memory Efficient Transformer Adapter for Dense Predictions
Learning View-invariant World Models for Visual Robotic Manipulation
Long-tailed Adversarial Training with Self-Distillation
Scaling Laws for Downstream Task Performance in Machine Translation
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
Residual-MPPI: Online Policy Customization for Continuous Control
Decomposition Polyhedra of Piecewise Linear Functions
Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling
Neural Approximate Mirror Maps for Constrained Diffusion Models
MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation
Learning General-purpose Biomedical Volume Representations using Randomized Synthesis
Linear Multistep Solver Distillation for Fast Sampling of Diffusion Models
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Training-free LLM-generated Text Detection by Mining Token Probability Sequences
Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost
Clique Number Estimation via Differentiable Functions of Adjacency Matrix Permutations
Charting the Design Space of Neural Graph Representations for Subgraph Matching
Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models
DistillHGNN: A Knowledge Distillation Approach for High-Speed Hypergraph Neural Networks
Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Towards Unbiased Learning in Semi-Supervised Semantic Segmentation
In-Context Editing: Learning Knowledge from Self-Induced Distributions
PIED: Physics-Informed Experimental Design for Inverse Problems
Counterfactual Concept Bottleneck Models
Efficient Low-Bit Quantization with Adaptive Scales for Multi-Task Co-Training
Modeling dynamic social vision highlights gaps between deep learning and humans
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
LancBiO: Dynamic Lanczos-aided Bilevel Optimization via Krylov Subspace
LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases
Process Reward Model with Q-value Rankings
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods
FlashMask: Efficient and Rich Mask Extension of FlashAttention
Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct
Proteina: Scaling Flow-based Protein Structure Generative Models
Learning Successor Features with Distributed Hebbian Temporal Memory
ST-GCond: Self-supervised and Transferable Graph Dataset Condensation
A Solvable Attention for Neural Scaling Laws
Making Text Embedders Few-Shot Learners
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Learning to Communicate Through Implicit Communication Channels
PFDiff: Training-Free Acceleration of Diffusion Models Combining Past and Future Scores
Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Interpretable Compressed Descriptions For Image Generation
GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians
Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
Learning Evolving Tools for Large Language Models
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement
ZeroDiff: Solidified Visual-semantic Correlation in Zero-Shot Learning
Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires
Active Learning for Neural PDE Solvers
Beware of Calibration Data for Pruning Large Language Models
When Selection Meets Intervention: Additional Complexities in Causal Discovery
(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
Enhancing Uncertainty Estimation and Interpretability with Bayesian Non-negative Decision Layer
Language Models Learn to Mislead Humans via RLHF
Improved Training Technique for Latent Consistency Models
Endless Jailbreaks with Bijection Learning
UniRestore3D: A Scalable Framework For General Shape Restoration
Optimal Transport for Time Series Imputation
Context Steering: Controllable Personalization at Inference Time
Boosting Multiple Views for pretrained-based Continual Learning
OLMoE: Open Mixture-of-Experts Language Models
Deep Signature: Characterization of Large-Scale Molecular Dynamics
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Vector-ICL: In-context Learning with Continuous Vector Representations
MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks
Feature-Based Online Bilateral Trade
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
GOLD: Graph Out-of-Distribution Detection via Implicit Adversarial Latent Generation
Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos
Quality over Quantity in Attention Layers: When Adding More Heads Hurts
GPromptShield: Elevating Resilience in Graph Prompt Tuning Against Adversarial Attacks
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data
A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Pacmann: Efficient Private Approximate Nearest Neighbor Search
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
Causally Motivated Sycophancy Mitigation for Large Language Models
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
Point-SAM: Promptable 3D Segmentation Model for Point Clouds
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
ELFS: Label-Free Coreset Selection with Proxy Training Dynamics
DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training
SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
Integral Performance Approximation for Continuous-Time Reinforcement Learning Control
GOAL: A Generalist Combinatorial Optimization Agent Learner
MeshMask: Physics-Based Simulations with Masked Graph Neural Networks
Deep Kernel Relative Test for Machine-generated Text Detection
Efficient Dictionary Learning with Switch Sparse Autoencoders
Group Ligands Docking to Protein Pockets
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery
EvA: Erasing Spurious Correlations with Activations
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
Synthesizing Realistic fMRI: A Physiological Dynamics-Driven Hierarchical Diffusion Model for Efficient fMRI Acquisition
Dataset Ownership Verification in Contrastive Pre-trained Models
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Lines of Thought in Large Language Models
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
Understanding the Stability-based Generalization of Personalized Federated Learning
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
High-Quality Joint Image and Video Tokenization with Causal VAE
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
W-PCA Based Gradient-Free Proxy for Efficient Search of Lightweight Language Models
TTVD: Towards a Geometric Framework for Test-Time Adaptation Based on Voronoi Diagram
Forte: Finding Outliers with Representation Typicality Estimation
MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations
Mask in the Mirror: Implicit Sparsification
STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes
On Stochastic Contextual Bandits with Knapsacks in Small Budget Regime
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
PALMBENCH: A COMPREHENSIVE BENCHMARK OF COMPRESSED LARGE LANGUAGE MODELS ON MOBILE PLATFORMS
Gradient descent with generalized Newton’s method
Generative Monoculture in Large Language Models
Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
SpinQuant: LLM Quantization with Learned Rotations
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning
Local Patterns Generalize Better for Novel Anomalies
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Learn hybrid prototypes for multivariate time series anomaly detection
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Model Risk-sensitive Offline Reinforcement Learning
Multi-Accurate CATE is Robust to Unknown Covariate Shifts
DLEFT-MKC: Dynamic Late Fusion Multiple Kernel Clustering with Robust Tensor Learning via Min-Max Optimization
Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming
Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
A Second-Order Perspective on Model Compositionality and Incremental Learning
Structuring Benchmark into Knowledge Graphs to Assist Large Language Models in Retrieving and Designing Models
On the Computation of the Fisher Information in Continual Learning
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
Robust Function-Calling for On-Device Language Model via Function Masking
More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing
Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
Utility-Directed Conformal Prediction: A Decision-Aware Framework for Actionable Uncertainty Quantification
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
Inference Scaling for Long-Context Retrieval Augmented Generation
A3D: Does Diffusion Dream about 3D Alignment?
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Shh, don't say that! Domain Certification in LLMs
Memory Mosaics
Denoising Levy Probabilistic Models
Progressive Mixed-Precision Decoding for Efficient LLM Inference
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
T2V-Turbo-v2: Enhancing Video Model Post-Training through Data, Reward, and Conditional Guidance Design
CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Does Editing Provide Evidence for Localization?
NRGBoost: Energy-Based Generative Boosted Trees
MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Transformers
Generalizing Weisfeiler-Lehman Kernels to Subgraphs
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
On the Hölder Stability of Multiset and Graph Neural Networks
EIA: ENVIRONMENTAL INJECTION ATTACK ON GENERALIST WEB AGENTS FOR PRIVACY LEAKAGE
Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
Palu: KV-Cache Compression with Low-Rank Projection
Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning
Higher-Order Graphon Neural Networks: Approximation and Cut Distance
TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning
Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features
An Undetectable Watermark for Generative Image Models
PT-T2I/V: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Image/Video-Task
Counterfactual Generative Modeling with Variational Causal Inference
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
One Step Diffusion via Shortcut Models
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
In vivo cell-type and brain region classification via multimodal contrastive learning
Re-Imagining Multimodal Instruction Tuning: A Representation View
MuseGNN: Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Adversarially Robust Anomaly Detection through Spurious Negative Pair Mitigation
Improving Neural Optimal Transport via Displacement Interpolation
Restructuring Vector Quantization with the Rotation Trick
TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Episodic Novelty Through Temporal Distance
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models
Gated Delta Networks: Improving Mamba2 with Delta Rule
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
Graph Transformers Dream of Electric Flow
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
Linear Transformer Topological Masking with Graph Random Features
Fugatto 1: Foundational Generative Audio Transformer Opus 1
Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency
ControlAR: Controllable Image Generation with Autoregressive Models
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
Graph Neural Networks Gone Hogwild
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality
Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks
The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model
On the Expressive Power of Sparse Geometric MPNNs
On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
Presto! Distilling Steps and Layers for Accelerating Music Generation
Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
Human Simulacra: Benchmarking the Personification of Large Language Models
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Towards Hierarchical Rectified Flow
Why RoPE Struggles to Maintain Long-Term Decay in Long Sequences?
Efficient Evolutionary Search Over Chemical Space with Large Language Models
ReNovo: Retrieval-Based De Novo Mass Spectrometry Peptide Sequencing
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
Learning Equivariant Non-Local Electron Density Functionals
Contextualizing biological perturbation experiments through language
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
Hyperbolic Genome Embeddings
Estimation of single-cell and tissue perturbation effect in spatial transcriptomics via Spatial Causal Disentanglement
Time-to-Event Pretraining for 3D Medical Imaging
Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation
SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark
Can Transformers Do Enumerative Geometry?
No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs
Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving
Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation
Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs
Diffusion-based Decoupled Deterministic and Uncertain Framework for Probabilistic Multivariate Time Series Forecasting
Zero-shot forecasting of chaotic systems
TimeInf: Time Series Data Contribution via Influence Functions
Infinite-Resolution Integral Noise Warping for Diffusion Models
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Automated Proof Generation for Rust Code via Self-Evolution
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
Competing Large Language Models in Multi-Agent Gaming Environments
Animate Your Thoughts: Reconstruction of Dynamic Natural Vision from Human Brain Activity
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
Quantized Spike-driven Transformer
Range, not Independence, Drives Modularity in Biologically Inspired Representations
Associative memory and dead neurons
As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Masked Image Modeling Representations
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
Order-aware Interactive Segmentation
Pedestrian Motion Reconstruction: A Large-scale Benchmark via Mixed Reality Rendering with Multiple Perspectives and Modalities
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment
Generation and Comprehension Hand-in-Hand: Vision-guided Expression Diffusion for Boosting Referring Expression Generation and Comprehension
3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios
Sketch2Diagram: Generating Vector Diagrams from Hand-Drawn Sketches
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Latent Radiance Fields with 3D-aware 2D Representations
Which Tasks Should Be Compressed Together? A Causal Discovery Approach for Efficient Multi-Task Representation Compression
Measuring And Improving Engagement of Text-to-Image Generation Models
SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars
SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training
Learning Color Equivariant Representations
Towards Realistic Data Generation for Real-World Super-Resolution
Re-Aligning Language to Visual Objects with an Agentic Workflow
EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition
ProtoSnap: Prototype Alignment For Cuneiform Signs
On the Transfer of Object-Centric Representation Learning
CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
RESfM: Robust Deep Equivariant Structure from Motion
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents
BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics
Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises
BP-Modified Local Loss for Efficient Training of Deep Neural Networks
Regularizing Energy among Training Samples for Out-of-Distribution Generalization
Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
Progress or Regress? Self-Improvement Reversal in Post-training
Are Large Vision Language Models Good Game Players?
Neural Phylogeny: Fine-Tuning Relationship Detection among Neural Networks
Ensembles of Low-Rank Expert Adapters
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Uncovering Latent Memories in Large Language Models
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models
Pre-training of Foundation Adapters for LLM Fine-tuning
What's New in My Data? Novelty Exploration via Contrastive Generation
Self-Improvement in Language Models: The Sharpening Mechanism
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking
HaDeMiF: Hallucination Detection and Mitigation in Large Language Models
BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge
Improving Pretraining Data Using Perplexity Correlations
Gramian Multimodal Representation Learning and Alignment
Improving Neural Network Accuracy by Concurrently Training with a Twin Network
Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge
Robust Representation Consistency Model via Contrastive Denoising
SEBRA: Debiasing through Self-Guided Bias Ranking
Democratic Training Against Universal Adversarial Perturbations
Severing Spurious Correlations with Data Pruning
Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning
MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments
Morphing Tokens Draw Strong Masked Image Models
The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery
Bayesian Treatment of the Spectrum of the Empirical Kernel in (Sub)Linear-Width Neural Networks
A Rainbow in Deep Network Black Boxes
Prediction Risk and Estimation Risk of the Ridgeless Least Squares Estimator under General Assumptions on Regression Errors
Oscillatory State-Space Models
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Designing Concise ConvNets with Columnar Stages
Solving hidden monotone variational inequalities with surrogate losses
Sharpness-Aware Minimization: General Analysis and Improved Rates
Local convergence of simultaneous min-max algorithms to differential equilibrium on Riemannian manifold
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
Utilitarian Algorithm Configuration for Infinite Parameter Spaces
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
On the Almost Sure Convergence of the Stochastic Three Points Algorithm
Joint Gradient Balancing for Data Ordering in Finite-Sum Multi-Objective Optimization
Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning
Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy
Learning to Search from Demonstration Sequences
Policy Gradient with Kernel Quadrature
On Generalization Across Environments In Multi-Objective Reinforcement Learning
Expected Return Symmetries
Learning mirror maps in policy mirror descent
What Makes a Good Diffusion Planner for Decision Making?
Simple, Good, Fast: Self-Supervised World Models Free of Baggage
How to Find the Exact Pareto Front for Multi-Objective MDPs?
CBMA: Improving Conformal Prediction through Bayesian Model Averaging
Residual Deep Gaussian Processes on Manifolds
End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler
Diffusion On Syntax Trees For Program Synthesis
Benchmarking Predictive Coding Networks -- Made Simple
Kernel-based Optimally Weighted Conformal Time-Series Prediction
Connecting Federated ADMM to Bayes
Training One-Dimensional Graph Neural Networks is NP-Hard
On the Optimal Memorization Capacity of Transformers
State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
Efficient Online Pruning and Abstraction for Imperfect Information Extensive-Form Games
Strategic Classification With Externalities
ONLINE EPSILON NET & PIERCING SET FOR GEOMETRIC CONCEPTS
Bounds on $L_p$ Errors in Density Ratio Estimation via $f$-Divergence Loss Functions
Conservative Contextual Bandits: Beyond Linear Representations
Satisficing Regret Minimization in Bandits
Linear Bandits with Memory
ADAM Optimization with Adaptive Batch Selection
Do Stochastic, Feel Noiseless: Stable Stochastic Optimization via a Double Momentum Mechanism
Reexamining the Aleatoric and Epistemic Uncertainty Dichotomy
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
Generalizable Motion Planning via Operator Learning
On Minimizing Adversarial Counterfactual Error in Adversarial Reinforcement Learning
Topological Zigzag Spaghetti for Diffusion-based Generation and Prediction on Graphs
ADAM: An Embodied Causal Agent in Open-World Environments
Causal Discovery via Bayesian Optimization
Euler Characteristic Tools for Topological Data Analysis
KAN: Kolmogorov–Arnold Networks
Advancing Out-of-Distribution Detection via Local Neuroplasticity
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
An Information Criterion for Controlled Disentanglement of Multimodal Data
Test-time Adaptation for Cross-modal Retrieval with Query Shift
Neural networks on Symmetric Spaces of Noncompact Type
Boosting Methods for Interval-censored Data with Regression and Classification
Exact Community Recovery under Side Information: Optimality of Spectral Algorithms
Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions
Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection
Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions
Fine-tuning can cripple your foundation model; preserving features may be the solution
Do not write that jailbreak paper
Encryption-Friendly LLM Architecture
Image-level Memorization Detection via Inversion-based Inference Perturbation
Towards hyperparameter-free optimization with differential privacy
The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction
Salvage: Shapley-distribution Approximation Learning Via Attribution Guided Exploration for Explainable Image Classification
Mechanism and emergence of stacked attention heads in multi-layer transformers
Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
BBCaL: Black-box Backdoor Detection under the Causality Lens
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Language Models are Advanced Anonymizers
Towards Domain Adaptive Neural Contextual Bandits
ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
SAVA: Scalable Learning-Agnostic Data Valuation
$R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
Deep MMD Gradient Flow without adversarial training
Gyrogroup Batch Normalization
How Feature Learning Can Improve Neural Scaling Laws
Understanding Factual Recall in Transformers via Associative Memories
WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning
On the Benefits of Memory for Modeling Time-Dependent PDEs
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Consistency Models Made Easy
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
Let the Code LLM Edit Itself When You Edit the Code
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
Diffusion Models and Gaussian Flow Matching: Two Sides of the Same Coin
Elucidating the Preconditioning in Consistency Distillation
MatExpert: Decomposing Materials Discovery By Mimicking Human Experts
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
How many samples are needed to train a deep neural network?
Rationalizing and Augmenting Dynamic Graph Neural Networks
Multi-LLM-Agents Debate - Performance, Efficiency, and Scaling Challenges
AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models
RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning
Wayward Concepts In Large Multimodal Models
Small-to-Large Generalization: Training Data Influences Models Consistently Across Scale
Hymba: A Hybrid-head Architecture for Small Language Models
Decoupling Angles and Strength in Low-rank Adaptation
Exploring The Forgetting in Adversarial Training: A Novel Method for Enhancing Robustness
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
Real-Time Video Generation with Pyramid Attention Broadcast
BenTo: Benchmark Reduction with In-Context Transferability
Do LLM Agents Have Regret? A Case Study in Online Learning and Games
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
Can LLM Simulations Truly Reflect Humanity? A Deep Dive
Compute-Constrained Data Selection
Multi-objective antibody design with constrained preference optimization
Generative World Explorer
Learning Randomized Algorithms with Transformers
KBLaM: Knowledge Base augmented Language Model
Linear SCM Identification in the Presence of Confounders and Gaussian Noise
Transformer-Squared: Self-adaptive LLMs
No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data
StringLLM: Understanding the String Processing Capability of Large Language Models
Progressive distillation induces an implicit curriculum
LLMs' Potential Influences on Our Democracy: Challenges and Opportunities
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
Large Scale Knowledge Washing
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment
MGDA Converges under Generalized Smoothness, Provably