Downloads 2025
Number of events: 3799
- $\beta$-calibration of Language Model Confidence Scores for Generative QA
- $\forall$uto$\exists$$\lor\!\land$L: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
- $\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
- $InterLCM$: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration
- $k$NN Attention Demystified: A Theoretical Exploration for Scalable Transformers
- $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
- $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
- $q$-exponential family for policy optimization
- $R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
- $\sigma$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples
- $\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- $\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
- $\texttt{BirdSet}$: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics
- 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)
- 3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds
- 3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
- 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
- 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
- 3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery
- 3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
- 3D-Spatial Multimodal Memory
- 3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline
- 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
- 3D Vision-Language Gaussian Splatting
- 3rd ICLR Workshop on Machine Learning for Remote Sensing
- 4K4DGen: Panoramic 4D Generation at 4K Resolution
- 6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering
- 6D Object Pose Tracking in Internet Videos for Robotic Manipulation
- 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities
- A3D: Does Diffusion Dream about 3D Alignment?
- A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
- A Benchmark for Semantic Sensitive Information in LLMs Outputs
- A Black Swan Hypothesis: The Role of Human Irrationality in AI Safety
- A Causal Lens for Learning Long-term Fair Policies
- ACC-Debate: An Actor-Critic Approach to Multi-Agent Debate
- Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization
- Accelerated training through iterative gradient propagation along the residual path
- Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
- Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
- Accelerating Diffusion Transformers with Token-wise Feature Caching
- Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research
- Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
- Accelerating neural network training: An analysis of the AlgoPerf competition
- Accelerating Neural ODEs: A Variational Formulation-based Approach
- Accelerating Task Generalisation with Multi-Level Hierarchical Options
- Accelerating Training with Neuron Interaction and Nowcasting Networks
- Accessing Vision Foundation Models via ImageNet-1K
- Accurate and Scalable Graph Neural Networks via Message Invariance
- ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
- ACES: Automatic Cohort Extraction System for Event-Stream Datasets
- Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
- A CLIP-Powered Framework for Robust and Generalizable Data Selection
- A Closer Look at Machine Unlearning for Large Language Models
- A Coefficient Makes SVRG Effective
- A Computational Framework for Modeling Emergence of Color Vision in the Human Brain
- A Conditional Independence Test in the Presence of Discretization
- Action abstractions for amortized sampling
- ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints
- Action Sequence Augmentation for Action Anticipation
- Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes
- Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks
- Active Learning for Continual Learning: Keeping the Past Alive in the Present
- Active Learning for Neural PDE Solvers
- ACTIVE: Offline Reinforcement Learning via Adaptive Imitation and In-sample $V$-Ensemble
- Active Task Disambiguation with LLMs
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
- A Curious Case of the Missing Measure: Better Scores and Worse Generation
- AdaFisher: Adaptive Second Order Optimization via Fisher Information
- AdaGrad under Anisotropic Smoothness: A Fine-Grained Analysis
- AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
- Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
- ADAM: An Embodied Causal Agent in Open-World Environments
- AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
- Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
- Adam-mini: Use Fewer Learning Rates To Gain More
- ADAM Optimization with Adaptive Batch Selection
- ADAPT: Attentive Self-Distillation and Dual-Decoder Prediction Fusion for Continual Panoptic Segmentation
- Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
- Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards
- Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
- Adaptive backtracking for fast optimization
- Adaptive Batch Size for Privately Finding Second-Order Stationary Points
- Adaptive Camera Sensor for Vision Models
- Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
- Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
- Adaptive Energy Alignment for Accelerating Test-Time Adaptation
- Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
- Adaptive Pruning of Pretrained Transformer via Differential Inclusions
- Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
- Adaptive Retention & Correction for Continual Learning
- Adaptive Shrinkage Estimation for Personalized Deep Kernel Regression in Modeling Brain Trajectories
- Adaptive teachers for amortized samplers
- Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers
- AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
- AdaWM: Adaptive World Model based Planning for Autonomous Driving
- Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
- Adding Conditional Control to Diffusion Models with Reinforcement Learning
- Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
- Addressing Label Shift in Distributed Learning via Entropy Regularization
- A Decade's Battle on Dataset Bias: Are We There Yet?
- A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
- A deep inverse-mapping model for a flapping robotic wing
- ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
- A Differentiable Metric for Discovering Groups and Unitary Representations
- A Differentiable Rank-Based Objective for Better Feature Learning
- ADIFF: Explaining audio difference using natural language
- A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations
- Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
- ADMM for Nonconvex Optimization under Minimal Continuity Assumption
- ADMM for Structured Fractional Minimization
- Advances in Financial AI: Opportunities, Innovations, and Responsible AI
- Advancing Graph Generation through Beta Diffusion
- Advancing LLM Reasoning Generalists with Preference Trees
- Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
- Advancing Out-of-Distribution Detection via Local Neuroplasticity
- Advancing Prompt-Based Methods for Replay-Independent General Continual Learning
- Advantage Alignment Algorithms
- Advantage-Guided Distillation for Preference Alignment in Small Language Models
- Adversarial Attacks on Data Attribution
- Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
- Adversarial Generative Flow Network for Solving Vehicle Routing Problems
- Adversarial Latent Feature Augmentation for Fairness
- Adversarially Robust Anomaly Detection through Spurious Negative Pair Mitigation
- Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
- Adversarial Machine Unlearning
- Adversarial Mixup Unlearning
- Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
- Adversarial Policy Optimization for Preference-based Reinforcement Learning
- Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
- Adversarial Search Engine Optimization for Large Language Models
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
- Adversarial Training for Defense Against Label Poisoning Attacks
- Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness
- AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption
- AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models
- Affine Steerable Equivariant Layer for Canonicalization of Neural Networks
- AFlow: Automating Agentic Workflow Generation
- A Formal Framework for Understanding Length Generalization in Transformers
- A General Framework for Producing Interpretable Semantic Text Embeddings
- A Generalist Hanabi Agent
- A Generic Framework for Conformal Fairness
- AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
- Agent-Oriented Planning in Multi-Agent Systems
- AgentQuest: Benchmarking LLM and VLM Agents on Long-Horizon Interactive Tasks
- AgentRefine: Enhancing Agent Generalization through Refinement Tuning
- Agent S: An Open Agentic Framework that Uses Computers Like a Human
- Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
- Agent Skill Acquisition for Large Language Models via CycleQD
- Agents' Room: Narrative Generation through Multi-step Collaboration
- AgentStudio: A Toolkit for Building General Virtual Agents
- Agent-to-Sim: Learning Interactive Behavior Model from Casual Longitudinal Videos
- AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
- A Geometric Framework for Understanding Memorization in Generative Models
- A Graph Enhanced Symbolic Discovery Framework For Efficient Circuit Synthesis
- Agree to Disagree: Demystifying Homogeneous Deep Ensembles through Distributional Equivalence
- AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
- AI2TALE: An Innovative Information Theory-based Approach for Learning to Localize Phishing Attacks
- AI4MAT-ICLR-2025: ICLR 2025 Workshop on AI for Accelerated Materials Design
- AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
- AI for Nucleic Acids (AI4NA)
- AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements
- Aioli: A Unified Optimization Framework for Language Model Data Mixing
- AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
- Air Quality Prediction with Physics-Informed Dual Neural ODEs in Open Systems
- AI Sandbagging: Language Models can Strategically Underperform on Evaluations
- A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
- A Large-scale Dataset and Benchmark for Commuting Origin-Destination Flow Generation
- A Large-scale Training Paradigm for Graph Generative Models
- ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition
- Alchemy: Amplifying Theorem-Proving Capability Through Symbolic Mutation
- Algorithmic Phases of In-Context Learning
- Algorithmic Stability Based Generalization Bounds for Adversarial Training
- Aligned Better, Listen Better For Audio-Visual Large Language Models
- Aligned LLMs Are Not Aligned Browser Agents
- Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
- Aligning Human Motion Generation with Human Perceptions
- Aligning Language Models with Demonstrated Feedback
- Aligning Visual Contrastive learning models via Preference Optimization
- A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
- ALLaM: Large Language Models for Arabic and English
- Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits
- AlphaEdit: Null-Space Constrained Model Editing for Language Models
- Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data
- A Meta-Learning Approach to Bayesian Causal Discovery
- Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
- Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
- A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
- A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis
- ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
- AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies
- Analysing The Spectral Biases in Generative Models
- Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
- Analytic DAG Constraints for Differentiable DAG Learning
- Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models
- Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
- Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks
- An Asynchronous Bundle Method for Distributed Learning Problems
- An Auditing Test to Detect Behavioral Shift in Language Models
- AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
- An Effective Manifold-based Optimization Method for Distributionally Robust Classification
- An Effective Theory of Bias Amplification
- An Efficient Framework for Crediting Data Contributors of Diffusion Models
- An Empirical Analysis of Uncertainty in Large Language Model Evaluations
- An Engorgio Prompt Makes Large Language Model Babble on
- An Evolved Universal Transformer Memory
- A new framework for evaluating model out-of-distribution generalisation for the biochemical domain
- A New Perspective on Shampoo's Preconditioner
- An Exploration with Entropy Constrained 3D Gaussians for 2D Video Compression
- An Expressive Quantum-Driven Graph Learning Approach with Application to Mixed-integer Linear Programming
- An Illustrated Guide to Automatic Sparse Differentiation
- An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
- Animate-X: Universal Character Image Animation with Enhanced Motion Representation
- Animate Your Thoughts: Reconstruction of Dynamic Natural Vision from Human Brain Activity
- An Information Criterion for Controlled Disentanglement of Multimodal Data
- An Intelligent Agentic System for Complex Image Restoration Problems
- An Investigation of Conformal Isometry Hypothesis for Grid Cells
- AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
- AnoLLM: Large Language Models for Tabular Anomaly Detection
- A Non-Contrastive Learning Framework for Sequential Recommendation with Preference-Preserving Profile Generation
- An Online Learning Theory of Trading-Volume Maximization
- Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
- Anti-Exposure Bias in Diffusion Models via Prompt Learning
- An undetectable watermark for generative image models
- AnyPrefer: An Automatic Framework for Preference Data Synthesis
- Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning
- APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
- A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
- A Periodic Bayesian Flow for Material Generation
- API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
- A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
- Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming
- Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding
- Approximating Full Conformal Prediction for Neural Network Regression with Gauss-Newton Influence
- Approximation algorithms for combinatorial optimization with predictions
- A primer on analytical learning dynamics of nonlinear neural networks
- A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
- A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning
- A Rainbow in Deep Network Black Boxes
- ARB-LLM: Alternating Refined Binarizations for Large Language Models
- Are Large Vision Language Models Good Game Players?
- Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
- A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
- Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
- Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
- ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
- A Robust Method to Discover Causal or Anticausal Relation
- Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model
- Artificial Kuramoto Oscillatory Neurons
- A Sanity Check for AI-generated Image Detection
- A Second-Order Perspective on Model Compositionality and Incremental Learning
- A Simple Approach to Unifying Diffusion-based Conditional Generation
- A Simple Baseline for Multivariate Time Series Forecasting
- A Simple Diffusion Transformer on Unified Video, 3D, and Game Field Generation
- A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
- A Simple yet Effective $\Delta\Delta G$ Predictor is An Unsupervised Antibody Optimizer and Explainer
- A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
- Ask, and it shall be given: On the Turing completeness of prompting
- A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery
- As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters
- A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches
- A Solvable Attention for Neural Scaling Laws
- A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
- AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
- As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
- Associative memory and dead neurons
- A Statistical Approach for Controlled Training Data Detection
- A Statistical Framework for Ranking LLM-based Chatbots
- A Stochastic Approach to the Subset Selection Problem via Mirror Descent
- ASTrA: Adversarial Self-supervised Training with Adaptive-Attacks
- AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data
- Asymmetric Factorized Bilinear Operation for Vision Transformer
- Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
- Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
- A Theoretical Analysis of Self-Supervised Learning for Vision Transformers
- A Theoretical Framework for Partially-Observed Reward States in RLHF
- A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules
- A Theoretical Perspective: When and How Self-consuming Training Loops Generalize
- A Theory for Token-Level Harmonization in Retrieval-Augmented Generation
- A Theory of Initialisation's Impact on Specialisation
- A Tight Convergence Analysis of Inexact Stochastic Proximal Point Algorithm for Stochastic Composite Optimization Problems
- Atlas Gaussians Diffusion for 3D Generation
- Atomas: Hierarchical Adaptive Alignment on Molecule-Text for Unified Molecule Understanding and Generation
- AtomSurf: Surface Representation for Learning on Protein Structures
- A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention
- A Transfer Attack to Image Watermarks
- A transfer learning framework for weak to strong generalization
- A Truncated Newton Method for Optimal Transport
- Attention as a Hypernetwork
- Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
- Attention layers provably solve single-location regression
- Attention with Markov: A Curious Case of Single-layer Transformers
- AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
- Attribute-based Visual Reprogramming for Image Classification with CLIP
- Attributing Culture-Conditioned Generations to Pretraining Corpora
- Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
- AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
- A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations
- A Unified Theory of Quantum Neural Network Loss Landscapes
- A Unifying Framework for Representation Learning
- AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
- AutoBencher: Towards Declarative Benchmark Construction
- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
- Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
- AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
- Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval Augmented Generation
- AutoG: Towards automatic graph construction from tabular data
- Automated Design of Agentic Systems
- Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models
- Automated Proof Generation for Rust Code via Self-Evolution
- Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
- Autonomous agents from automatic reward modeling and planning
- Autoregressive Pretraining with Mamba in Vision
- Autoregressive Transformers are Zero-Shot Video Imitators
- Autoregressive Video Generation without Vector Quantization
- AutoUAD: Hyper-parameter Optimization for Unsupervised Anomaly Detection
- AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
- AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
- A Visual Dive into Conditional Flow Matching
- A Watermark for Order-Agnostic Language Models
- BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics
- Backdooring Vision-Language Models with Out-Of-Distribution Data
- Backtracking Improves Generation Safety
- BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge
- Bad-PFL: Exploiting Backdoor Attacks against Personalized Federated Learning
- Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations
- Balanced Ranking with Relative Centrality: A multi-core periphery perspective
- Balancing Act: Diversity and Consistency in Large Language Model Ensembles
- Balancing Bias in Two-sided Markets for Fair Stable Matchings
- BAMDP Shaping: a Unified Theoretical Framework for Intrinsic Motivation and Reward Shaping
- Bandit Learning in Matching Markets with Indifference
- BANGS: Game-theoretic Node Selection for Graph Self-Training
- Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
- Bayesian Analysis of Combinatorial Gaussian Process Bandits
- Bayesian Experimental Design Via Contrastive Diffusions
- Bayesian Image Regression with Soft-thresholded Conditional Autoregressive Prior
- Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences
- Bayesian Optimization via Continual Variational Last Layer Training
- Bayesian Regularization of Latent Representation
- Bayesian Treatment of the Spectrum of the Empirical Kernel in (Sub)Linear-Width Neural Networks
- Bayesian WeakS-to-Strong from Text Classification to Generation
- BBCaL: Black-box Backdoor Detection under the Causality Lens
- BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
- Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
- Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models
- Benchmarking Agentic Workflow Generation
- Benchmarking LLMs' Judgments with No Gold Standard
- Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
- Benchmarking Predictive Coding Networks -- Made Simple
- Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
- Benign Overfitting in Out-of-Distribution Generalization of Linear Models
- BenTo: Benchmark Reduction with In-Context Transferability
- Better autoregressive regression with LLMs
- Better Instruction-Following Through Minimum Bayes Risk
- Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
- Beware of Calibration Data for Pruning Large Language Models
- Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
- Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
- Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing
- Beyond Circuit Connections: A Non-Message Passing Graph Transformer Approach for Quantum Error Mitigation
- Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
- Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge
- Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
- Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
- Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?
- Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness
- Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
- Beyond Mere Token Analysis: A Hypergraph Metric Space Framework for Defending Against Socially Engineered LLM Attacks
- Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
- Beyond Random Augmentations: Pretraining with Hard Views
- Beyond Random Masking: When Dropout meets Graph Convolutional Networks
- Beyond Sequence: Impact of Geometric Context for RNA Property Prediction
- Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
- Beyond single neurons: population response geometry in digital twins of mouse visual cortex
- Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks
- Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension ability
- Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
- Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
- Bias Mitigation in Graph Diffusion Models
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling
- Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
- BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks
- BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
- Bilinear MLPs enable weight-based mechanistic interpretability
- BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
- Binary Losses for Density Ratio Estimation
- BingoGuard: LLM Content Moderation Tools with Risk Levels
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
- Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics
- Biologically Plausible Brain Graph Transformer
- Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
- BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
- Bisimulation Metric for Model Predictive Control
- BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments
- Black-Box Detection of Language Model Watermarks
- Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
- BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation
- BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
- Block-Attention for Efficient RAG
- Block Verification Accelerates Speculative Decoding
- BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
- BodyGen: Advancing Towards Efficient Embodiment Co-Design
- BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
- Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
- Boltzmann priors for Implicit Transfer Operators
- Boltzmann Semantic Score: A Semantic Metric for Evaluating Large Vision Models Using Large Language Models
- BOND: Aligning LLMs with Best-of-N Distillation
- BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Tumor Bone Metastasis Diagnosis and Prognosis
- Bonsai: Gradient-free Graph Distillation for Node Classification
- Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
- Boosting Latent Diffusion with Perceptual Objectives
- Boosting Methods for Interval-censored Data with Regression and Classification
- Boosting Multiple Views for pretrained-based Continual Learning
- Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
- Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games
- Boosting Ray Search Procedure of Hard-label Attacks with Transfer-based Priors
- Boosting the visual interpretability of CLIP via adversarial fine-tuning
- Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
- Bootstrapped Energy Based Models: What are they good for?
- Bootstrapped Model Predictive Control
- Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
- Bootstrapping Language Models with DPO Implicit Rewards
- Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
- Boundary constrained Gaussian processes for robust physics-informed machine learning of linear partial differential equations
- Bounds on $L_p$ Errors in Density Ratio Estimation via $f$-Divergence Loss Functions
- BP-Modified Local Loss for Efficient Training of Deep Neural Networks
- BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
- BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation
- Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration
- Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex
- Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
- BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
- BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications
- Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses
- Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator
- Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization
- Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate
- Breaking Neural Network Scaling Laws with Modularity
- Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
- Breaking the Reclustering Barrier in Centroid-based Deep Clustering
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
- Bridging Compressed Image Latents and Multimodal Large Language Models
- Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding
- Bridging General and Personalized Federated Learning through Selective Model Integration
- Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach
- Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation
- Bridging the Data Provenance Gap Across Text, Speech, and Video
- Bridging the Gap Between $f$-divergences and Bayes Hilbert Spaces
- Bridging the Gap between Database Search and \emph{De Novo} Peptide Sequencing with SearchNovo
- Bridging the Gap between Variational Inference and Stochastic Gradient MCMC in Function Space
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
- Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
- Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach
- Broaden your SCOPE! Efficient Conversation Planning for LLMs with Semantic Space
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
- BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP
- Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
- Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
- Building Blocks of Differentially Private Training
- Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
- Building Math Agents with Multi-Turn Iterative Preference Learning
- Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
- Bundle Neural Network for message diffusion on graphs
- Cached Multi-Lora Composition for Multi-Concept Image Generation
- C-Adapter: Adapting Deep Classifiers for Efficient Conformal Prediction Sets
- Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
- CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences
- Calibrating Expressions of Certainty
- Calibrating LLMs with Information-Theoretic Evidential Deep Learning
- CameraCtrl: Enabling Camera Control for Text-to-Video Generation
- CAMEx: Curvature-aware Merging of Experts
- Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
- Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective
- Can In-context Learning Really Generalize to Out-of-distribution Tasks?
- Can Knowledge Editing Really Correct Hallucinations?
- Can Large Language Models Understand Symbolic Graphics Programs?
- Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
- Can LLM Simulations Truly Reflect Humanity? A Deep Dive
- Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
- Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
- Can LLMs Solve Long Math Word Problems Better?
- Can LLMs Understand Time Series Anomalies?
- Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
- Can One Modality Model Synergize Training of Other Modality Models?
- Can Reinforcement Learning Solve Asymmetric Combinatorial-Continuous Zero-Sum Games?
- Can Textual Gradient Work in Federated Learning?
- Can Transformers Do Enumerative Geometry?
- Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
- Can Watermarked LLMs be Identified by Users via Crafted Prompts?
- Can Watermarks be Used to Detect LLM IP Infringement For Free?
- Can We Ignore Labels in Out of Distribution Detection?
- Can We Talk Models Into Seeing the World Differently?
- Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems
- Capability Localization: Capabilities Can be Localized rather than Individual Knowledge
- CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
- CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
- Captured by Captions: On Memorization and its Mitigation in CLIP Models
- Capturing the Temporal Dependence of Training Data Influence
- CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling
- CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search
- CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
- Catastrophic Failure of LLM Unlearning via Quantization
- CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
- CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
- Cauchy-Schwarz Regularizers
- Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning
- Causal Discovery via Bayesian Optimization
- Causal Effect Estimation with Mixed Latent Confounders and Post-treatment Variables
- Causal Graphical Models for Vision-Language Compositional Understanding
- Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference
- Causal Identification for Complex Functional Longitudinal Studies
- Causal Information Prioritization for Efficient Reinforcement Learning
- Causally Motivated Sycophancy Mitigation for Large Language Models
- Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference
- Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
- Causal Representation Learning from Multimodal Biological Observations
- CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
- CAX: Cellular Automata Accelerated in JAX
- CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
- CBMA: Improving Conformal Prediction through Bayesian Model Averaging
- CBQ: Cross-Block Quantization for Large Language Models
- CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
- C-CLIP: Multimodal Continual Learning for Vision-Language Model
- CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
- Centrality-guided Pre-training for Graph
- CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
- Certified Robustness Under Bounded Levenshtein Distance
- Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks
- CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement
- CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models
- CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding
- Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
- Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models
- Chain-of-region: Visual Language Models Need Details for Diagram Analysis
- Chain-of-Thought Provably Enables Learning the (Otherwise) Unlearnable
- CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
- Charting the Design Space of Neural Graph Representations for Subgraph Matching
- ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
- ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
- CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
- CheapNet: Cross-attention on Hierarchical representations for Efficient protein-ligand binding Affinity Prediction
- Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
- ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning
- Chemistry-Inspired Diffusion with Non-Differentiable Guidance
- CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
- ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
- Chunk-Distilled Language Modeling
- CipherPrune: Efficient and Scalable Private Transformer Inference
- CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design
- Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
- Circuit Transformer: A Transformer That Preserves Logical Equivalence
- CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
- CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
- CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
- ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
- Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations
- Classic but Everlasting: Traditional Gradient-Based Algorithms Converges Fast Even in Time-Varying Multi-Player Games
- ClawMachine: Learning to Fetch Visual Tokens for Referential Comprehension
- CL-DiffPhyCon: Closed-loop Diffusion Control of Complex Physical Systems
- CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models
- CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
- ClimaQA: An Automated Evaluation Framework for Climate Foundation Models
- CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
- Clique Number Estimation via Differentiable Functions of Adjacency Matrix Permutations
- CL-MFAP: A contrastive learning-based multimodal foundation model for antibiotic property prediction
- CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
- Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
- Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
- COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
- Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
- CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs
- CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark
- COFlowNet: Conservative Constraints on Flows Enable High-Quality Candidate Generation
- CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning
- CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
- Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
- CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
- Collaborative Discrete-Continuous Black-Box Prompt Learning for Language Models
- Collapsed Language Models Promote Fairness
- CoLoRA: A Competitive Learning Approach for Enhancing LoRA
- ColPali: Efficient Document Retrieval with Vision Language Models
- ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
- Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection
- Combining Induction and Transduction for Abstract Reasoning
- Combining Text-based and Drag-based Editing for Precise and Flexible Image Editing
- COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
- COME: Test-time Adaption by Conservatively Minimizing Entropy
- Commit0: Library Generation from Scratch
- Common Pitfalls of Margin-based Preference Optimization in Language Model Alignment
- CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets
- CoMotion: Concurrent Multi-person 3D Motion
- Comparing noisy neural population dynamics using optimal transport distances
- Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources
- ComPC: Completing a 3D Point Cloud with 2D Diffusion Priors
- Competing Large Language Models in Multi-Agent Gaming Environments
- Competitive Fair Scheduling with Predictions
- Complementary Label Learning with Positive Label Guessing and Negative Label Enhancement
- Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
- Composable Interventions for Language Models
- Composing Unbalanced Flows for Flexible Docking and Relaxation
- Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
- Compositional Entailment Learning for Hyperbolic Vision-Language Models
- Compositional simulation-based inference for time series
- Computational Explorations of Total Variation Distance
- Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models
- Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
- Compute-Constrained Data Selection
- Computing Circuits Optimization via Model-Based Circuit Genetic Evolution
- CoMRes: Semi-Supervised Time Series Forecasting Utilizing Consensus Promotion of Multi-Resolution
- Concept Bottleneck Language Models For Protein Design
- Concept Bottleneck Large Language Models
- Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate
- ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
- Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing
- Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation
- Conditional Diffusion with Ordinal Regression: Longitudinal Data Generation for Neurodegenerative Disease Studies
- Conditional Testing based on Localized Conformal $p$-values
- Confidence Elicitation: A New Attack Vector for Large Language Models
- ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks
- Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
- Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
- Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
- Conformalized Survival Analysis for General Right-Censored Data
- Conformal Language Model Reasoning with Coherent Factuality
- Conformal Prediction Sets Can Cause Disparate Impact
- Conformal Structured Prediction
- CONGO: Compressive Online Gradient Optimization
- ConMix: Contrastive Mixup at Representation Level for Long-tailed Deep Clustering
- Connecting Federated ADMM to Bayes
- Connectome Mapping: Shape-Memory Network via Interpretation of Contextual Semantic Information
- Conservative Contextual Bandits: Beyond Linear Representations
- Consistency Checks for Language Model Forecasters
- Consistency Models Made Easy
- Consistent Flow Distillation for Text-to-3D Generation
- Constraint-Conditioned Actor-Critic for Offline Safe Reinforcement Learning
- Constructing Confidence Intervals for Average Treatment Effects from Multiple Observational Datasets
- Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions
- Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series
- Context-aware Dynamic Pruning for Speech Foundation Models
- Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
- ContextGNN: Beyond Two-Tower Recommendation Systems
- Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
- Context Steering: Controllable Personalization at Inference Time
- Contextual Document Embeddings
- Contextualizing biological perturbation experiments through language
- Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding
- Continual Slow-and-Fast Adaptation of Latent Neural Dynamics (CoSFan): Meta-Learning What-How & When to Adapt
- Continuity-Preserving Convolutional Autoencoders for Learning Continuous Latent Dynamical Models from Images
- Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis
- Continuous Diffusion for Mixed-Type Tabular Data
- Continuous Ensemble Weather Forecasting with Diffusion models
- Continuous Exposure Learning for Low-light Image Enhancement using Neural ODEs
- CONTRA: Conformal Prediction Region via Normalizing Flow Transformation
- Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery
- ContraDiff: Planning Towards High Return States via Contrastive Learning
- ContraFusion: Contrastively Improving Compositional Understanding in Diffusion Models via Fine-Grained Negative Images
- Contrastive Learning from Synthetic Audio Doppelgängers
- ControlAR: Controllable Image Generation with Autoregressive Models
- Controllable Blur Data Augmentation Using 3D-Aware Motion Estimation
- Controllable Context Sensitivity and the Knob Behind It
- Controllable Generation via Locally Constrained Resampling
- Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
- Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
- Controllable Unlearning for Image-to-Image Generative Models via $\epsilon$-Constrained Optimization
- Controlled LLM Decoding via Discrete Auto-regressive Biasing
- Controlling Language and Diffusion Models by Transporting Activations
- Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density
- Control-oriented Clustering of Visual Latent Representation
- ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
- Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
- Convergence of Distributed Adaptive Optimization with Local Updates
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
- Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness
- Convex Formulations for Training Two-Layer ReLU Neural Networks
- COPER: Correlation-based Permutations for Multi-View Clustering
- Copyright-Protected Language Generation via Adaptive Model Fusion
- Coreset Selection via Reducible Loss in Continual Learning
- Coreset Spectral Clustering
- CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
- Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
- Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
- Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
- Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
- CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
- Counterfactual Concept Bottleneck Models
- Counterfactual Generative Modeling with Variational Causal Inference
- Counterfactual Realizability
- CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion
- CR2PQ: Continuous Relative Rotary Positional Query for Dense Visual Representation Learning
- CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
- CR-CTC: Consistency regularization on CTC for improved speech recognition
- CREAM: Consistency Regularized Self-Rewarding Language Models
- Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification
- Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
- CREIMBO: Cross-Regional Ensemble Interactions in Multi-view Brain Observations
- Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models
- Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
- Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits
- Cross-Embodiment Dexterous Grasping with Reinforcement Learning
- Cross-Entropy Is All You Need To Invert the Data Generating Process
- Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
- CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes
- Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
- CryoFM: A Flow-based Foundation Model for Cryo-EM Densities
- CryoGEN: Cryogenic Electron Tomography Reconstruction via Generative Energy Nets
- cryoSPHERE: Single-Particle HEterogeneous REconstruction from cryo EM
- CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
- CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
- CtD: Composition through Decomposition in Emergent Communication
- Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
- CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
- Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
- CTSyn: A Foundational Model for Cross Tabular Data Generation
- CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation
- CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning
- Curriculum-aware Training for Discriminating Molecular Property Prediction Models
- Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
- Cut Your Losses in Large-Vocabulary Language Models
- CViT: Continuous Vision Transformer for Operator Learning
- Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
- Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
- D2G: Debiased Learning with Distribution Guidance for Generalized Category Discovery
- DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
- DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
- DarkBench: Benchmarking Dark Patterns in Large Language Models
- DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
- Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
- Data-Augmented Phrase-Level Alignment for Mitigating Object Hallucination
- Data Center Cooling System Optimization Using Offline Reinforcement Learning
- Data-centric Prediction Explanation via Kernelized Stein Discrepancy
- Data Distillation for extrapolative protein design through exact preference optimization
- DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
- DataGen: Unified Synthetic Dataset Generation via Large Language Models
- DataMan: Data Manager for Pre-training Large Language Models
- Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
- Data Pruning by Information Maximization
- Data Scaling Laws in Imitation Learning for Robotic Manipulation
- Data Selection via Optimal Control for Language Models
- Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-training of Deep Networks
- Dataset Ownership Verification in Contrastive Pre-trained Models
- Data Shapley in One Training Run
- Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning
- Data Unlearning in Diffusion Models
- DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
- DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation
- DCT-CryptoNets: Scaling Private Inference in the Frequency Domain
- Debiasing Federated Learning with Correlated Client Participation
- Debiasing Mini-Batch Quadratics for Applications in Deep Learning
- dEBORA: Efficient Bilevel Optimization-based low-Rank Adaptation
- DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging
- Decentralized Optimization with Coupled Constraints
- Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba
- DECISION-FOCUSED UNCERTAINTY QUANTIFICATION
- Decision Information Meets Large Language Models: The Future of Explainable Operations Research
- Decision Tree Induction via Semantically-Aware Evolution
- Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies
- Decomposition Polyhedra of Piecewise Linear Functions
- Deconstructing Denoising Diffusion Models for Self-Supervised Learning
- Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
- Decoupled Finetuning for Domain Generalizable Semantic Segmentation
- Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs
- Decoupled Subgraph Federated Learning
- Decoupling Angles and Strength in Low-rank Adaptation
- Decoupling Layout from Glyph in Online Chinese Handwriting Generation
- DEEM: Diffusion models serve as the eyes of large language models for image perception
- Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
- Deep Distributed Optimization for Large-Scale Quadratic Programming
- DeeperForward: Enhanced Forward-Forward Training for Deeper and Better Performance
- DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale
- Deep Generative Model in Machine Learning: Theory, Principle and Efficacy
- Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs
- Deep Kernel Posterior Learning under Infinite Variance Prior Weights
- Deep Kernel Relative Test for Machine-generated Text Detection
- Deep Learning Alternatives Of The Kolmogorov Superposition Theorem
- Deep Linear Probe Generators for Weight Space Learning
- DeepLTL: Learning to Efficiently Satisfy Complex LTL Instructions
- Deep MMD Gradient Flow without adversarial training
- Deep Networks Learn Features From Local Discontinuities in the Label Function
- Deep Random Features for Scalable Interpolation of Spatiotemporal Data
- DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
- Deep Signature: Characterization of Large-Scale Molecular Dynamics
- DeepTAGE: Deep Temporal-Aligned Gradient Enhancement for Optimizing Spiking Neural Networks
- Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
- DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
- DELIFT: Data Efficient Language model Instruction Fine-Tuning
- DeLLMa: Decision Making Under Uncertainty with Large Language Models
- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
- DELTA: DENSE EFFICIENT LONG-RANGE 3D TRACKING FOR ANY VIDEO
- Democratic Training Against Universal Adversarial Perturbations
- Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
- Demystifying the Token Dynamics of Deep Selective State Space Models
- Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing
- DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training
- Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
- Denoising Autoregressive Transformers for Scalable Text-to-Image Generation
- Denoising Levy Probabilistic Models
- Denoising Task Difficulty-based Curriculum for Training Diffusion Models
- Denoising with a Joint-Embedding Predictive Architecture
- DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
- DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from One Demo
- Dense Video Object Captioning from Disjoint Supervision
- Density estimation with LLMs: a geometric investigation of in-context learning trajectories
- DEPfold: RNA Secondary Structure Prediction as Dependency Parsing
- DEPT: Decoupled Embeddings for Pre-training Language Models
- Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
- Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm
- Descent with Misaligned Gradients and Applications to Hidden Convexity
- Designing Concise ConvNets with Columnar Stages
- Designing Mechanical Meta-Materials by Learning Equivariant Flows
- Detecting Backdoor Samples in Contrastive Language Image Pretraining
- Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
- DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
- D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement
- DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
- DICE: Data Influence Cascade in Decentralized Learning
- DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
- Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
- Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
- Difference-of-submodular Bregman Divergence
- Differentiable and Learnable Wireless Simulation with Geometric Transformers
- Differentiable Causal Discovery for Latent Hierarchical Causal Models
- Differentiable Integer Linear Programming
- Differentiable Optimization of Similarity Scores Between Models and Brains
- Differential learning kinetics govern the transition from memorization to generalization during in-context learning
- Differentially Private Federated Learning with Time-Adaptive Privacy Spending
- Differentially private learners for heterogeneous treatment effects
- Differentially Private Steering for Large Language Model Alignment
- Differential Transformer
- Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
- DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector
- DiffPC: Diffusion-based High Perceptual Fidelity Image Compression with Semantic Refinement
- Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models
- Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
- DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
- Diffusing States and Matching Scores: A New Framework for Imitation Learning
- Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
- Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models
- Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
- Diffusion Attribution Score: Which Training Sample Determines Your Generation?
- Diffusion-based Decoupled Deterministic and Uncertain Framework for Probabilistic Multivariate Time Series Forecasting
- Diffusion-based Neural Network Weights Generation
- Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
- Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
- Diffusion Bridge Implicit Models
- Diffusion Feedback Helps CLIP See Better
- Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images
- DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing
- Diffusion Models and Gaussian Flow Matching: Two Sides of the Same Coin
- Diffusion Models are Evolutionary Algorithms
- Diffusion Models Are Real-Time Game Engines
- Diffusion Models as Cartoonists! The Curious Case of High Density Regions
- Diffusion Models for 4D Novel View Synthesis
- Diffusion On Syntax Trees For Program Synthesis
- Diffusion Policy Policy Optimization
- Diffusion State-Guided Projected Gradient for Inverse Problems
- Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data
- Diffusion Transformers for Tabular Data Time Series Generation
- Digi-Q: Transforming VLMs to Device-Control Agents via Value-Based Offline RL
- Dimension Agnostic Neural Processes
- DINOv2: Learning Robust Visual Features without Supervision
- Direct Distributional Optimization for Provable Alignment of Diffusion Models
- Direct Imitation Learning: RLHF Secretly Performs Imitation Learning
- Directional Gradient Projection for Robust Fine-tuning of Foundation Models
- Direct Post-Training Preference Alignment of Multi-Agent Motion Generation Model with Implicit Feedback from Demonstrations
- Discovering Clone Negatives via Adaptive Contrastive Learning for Image-Text Matching
- Discovering Influential Neuron Path in Vision Transformers
- Discovering Temporally Compositional Neural Manifolds with Switching Infinite GPFA
- Discrete Codebook World Models for Continuous Control
- Discrete Copula Diffusion
- Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation
- Discrete Distribution Networks
- Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control
- Discrete Latent Plans via Semantic Skill Abstractions
- Discretization-invariance? On the Discretization Mismatch Errors in Neural Operators
- Discriminating image representations with principal distortions
- Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
- Discriminator-Guided Embodied Planning for LLM Agent
- Disentangled Representation Learning with the Gromov-Monge Gap
- Disentangling Representations through Multi-task Learning
- DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation
- DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction
- DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
- Dissecting Adversarial Robustness of Multimodal LM Agents
- Distance-Based Tree-Sliced Wasserstein Distance
- Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
- DistillHGNN: A Knowledge Distillation Approach for High-Speed Hypergraph Neural Networks
- Distilling Dataset into Neural Field
- Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
- Distilling Structural Representations into Protein Sequence Models
- Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint
- Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
- Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation
- Distribution-free Data Uncertainty for Neural Network Regression
- Distribution-Specific Agnostic Conditional Classification With Halfspaces
- DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agent
- Divergence-enhanced Knowledge-guided Context Optimization for Visual-Language Prompt Tuning
- Divergence of Neural Tangent Kernel in Classification Problems
- Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning
- Diverse Preference Learning for Capabilities and Alignment
- Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
- Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
- DLEFT-MKC: Dynamic Late Fusion Multiple Kernel Clustering with Robust Tensor Learning via Min-Max Optimization
- Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models
- Do as We Do, Not as You Think: the Conformity of Large Language Models
- Dobi-SVD: Differential SVD for LLM Compression and Some New Perspectives
- DocMIA: Document-Level Membership Inference Attacks against DocVQA Models
- Do Contemporary CATE Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark
- DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models
- Do Deep Neural Network Solutions Form a Star Domain?
- Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
- Does Editing Provide Evidence for Localization?
- Does Refusal Training in LLMs Generalize to the Past Tense?
- Does Safety Training of LLMs Generalize to Semantically Related Natural Prompts?
- Does SGD really happen in tiny subspaces?
- Does Spatial Cognition Emerge in Frontier Models?
- Does Training with Synthetic Data Truly Protect Privacy?
- DoF: A Diffusion Factorization Framework for Offline Multi-Agent Decision Making
- Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
- Do Large Language Models Truly Understand Geometric Structures?
- Do LLM Agents Have Regret? A Case Study in Online Learning and Games
- Do LLMs estimate uncertainty well in instruction-following?
- Do LLMs have Consistent Values?
- Do LLMs ``know'' internally when they follow instructions?
- Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
- Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
- Do Mice Grok? Glimpses of Hidden Progress in Sensory Cortex
- Do not write that jailbreak paper
- Don't Cut Corners: Exact Conditions for Modularity in Biologically Inspired Representations
- Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
- DON’T STOP ME NOW: EMBEDDING BASED SCHEDULING FOR LLMS
- Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
- DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback
- Do Stochastic, Feel Noiseless: Stable Stochastic Optimization via a Double Momentum Mechanism
- DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
- Doubly Optimal Policy Evaluation for Reinforcement Learning
- Doubly robust identification of treatment effects from multiple environments
- Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
- Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
- Do vision models perceive objects like toddlers?
- Do WGANs succeed because they minimize the Wasserstein Distance? Lessons from Discrete Generators
- Do You Keep an Eye on What I Ask? Mitigating Multimodal Hallucination via Attention-Guided Ensemble Decoding
- DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle
- DPLM-2: A Multimodal Diffusion Protein Language Model
- DP-SGD for non-decomposable objective functions
- Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
- DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
- DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
- DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models
- Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
- Dreamweaver: Learning Compositional World Models from Pixels
- DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing
- DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
- DRL: Decomposed Representation Learning for Tabular Anomaly Detection
- DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints
- DRoP: Distributionally Robust Data Pruning
- Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
- DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
- DSI: Faster Inference of Large Language Models via Speculation Parallelism
- DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models
- DUALFormer: A Dual Graph Convolution and Attention Network for Node Classification
- Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
- Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
- DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity
- DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
- Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
- Durable Quantization Conditioned Misalignment Attack on Large Language Models
- DyCAST: Learning Dynamic Causal Structure from Time Series
- DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation
- DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
- Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
- DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes
- Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment
- Dynamic Diffusion Transformer
- Dynamic Discriminative Operations for Efficient Generative Inference of LLMs
- Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos
- Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
- Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
- Dynamic Low-Rank Sparse Adaptation for Large Language Models
- Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
- Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts
- Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
- Dynamic Multi-product Selection and Pricing under Preference Feedback
- Dynamic Negative Guidance of Diffusion Models
- Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense
- Dynamics of Concept Learning and Compositional Generalization
- Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
- DynaPrompt: Dynamic Test-Time Prompt Tuning
- DynFrs: An Efficient Framework for Machine Unlearning in Random Forest
- Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
- E(3)-equivariant models cannot learn chirality: Field-based molecular generation
- Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
- Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
- Easing Training Process of Rectified Flow Models Via Lengthening Inter-Path Distance
- ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials
- EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation
- EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
- ECHOPulse: ECG Controlled Echocardio-gram Video Generation
- EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation
- econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
- Edge-aware Image Smoothing with Relative Wavelet Domain Representation
- Edge Prompt Tuning for Graph Neural Networks
- EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
- EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
- EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
- Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
- Effective Interplay between Sparsity and Quantization: From Theory to Practice
- Effective post-training embedding compression via temperature control in contrastive training
- Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs
- Efficient Active Imitation Learning with Random Network Distillation
- Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
- Efficient and Accurate Explanation Estimation with Distribution Compression
- Efficient and Context-Aware Label Propagation for Zero-/Few-Shot Training-Free Adaptation of Vision-Language Model
- Efficient and Robust Neural Combinatorial Optimization via Wasserstein-Based Coresets
- Efficient and Trustworthy Causal Discovery with Latent Variables and Complex Relations
- Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
- Efficient Biological Data Acquisition through Inference Set Design
- Efficient Causal Decision Making with One-sided Feedback
- Efficient Cross-Episode Meta-RL
- Efficient Dictionary Learning with Switch Sparse Autoencoders
- Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
- Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning
- Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax
- Efficient Evolutionary Search Over Chemical Space with Large Language Models
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction
- Efficient Imitation under Misspecification
- Efficient Inference for Large Language Model-based Generative Recommendation
- Efficient Interpolation between Extragradient and Proximal Methods for Weak MVIs
- EFFICIENT JAILBREAK ATTACK SEQUENCES ON LARGE LANGUAGE MODELS VIA MULTI-ARMED BANDIT-BASED CONTEXT SWITCHING
- Efficient Learning with Sine-Activated Low-Rank Matrices
- Efficient Low-Bit Quantization with Adaptive Scales for Multi-Task Co-Training
- Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
- Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
- Efficiently Parameterized Neural Metriplectic Systems
- Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark
- Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
- Efficient Model Editing with Task-Localized Sparse Fine-tuning
- Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching
- Efficient Neuron Segmentation in Electron Microscopy by Affinity-Guided Queries
- Efficient Off-Policy Learning for High-Dimensional Action Spaces
- Efficient Online Pruning and Abstraction for Imperfect Information Extensive-Form Games
- Efficient Online Reinforcement Learning Fine-Tuning Should Not Retain Offline Data
- Efficient Perplexity Bound and Ratio Matching in Discrete Diffusion Language Models
- Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
- Efficient Reinforcement Learning with Large Language Model Priors
- Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping
- Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
- Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement
- Efficient Sparse PCA via Block-Diagonalization
- Efficient stagewise pretraining via progressive subnetworks
- Efficient Structure-Aware 3D Gaussians via Lightweight Information Shaping
- Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions
- EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition
- EG4D: Explicit Generation of 4D Object without Score Distillation
- EgoSim: Egocentric Exploration in Virtual Worlds with Multi-modal Conditioning
- EIA: ENVIRONMENTAL INJECTION ATTACK ON GENERALIST WEB AGENTS FOR PRIVACY LEAKAGE
- ElasticTok: Adaptive Tokenization for Image and Video
- ELBOing Stein: Variational Bayes with Stein Mixture Inference
- Eliciting Human Preferences with Language Models
- ELICIT: LLM Augmentation Via External In-context Capability
- Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
- Eliminating Position Bias of Language Models: A Mechanistic Approach
- Elliptic Loss Regularization
- Elucidating the Preconditioning in Consistency Distillation
- EmbedLLM: Learning Compact Representations of Large Language Models
- EmbodiedSAM: Online Segment Any 3D Thing in Real Time
- Emergence of a High-Dimensional Abstraction Phase in Language Transformers
- Emergence of meta-stable clustering in mean-field transformer models
- Emergent Orientation Maps —— Mechanisms, Coding Efficiency and Robustness
- Emergent Possibilities and Challenges in Deep Learning for Code
- Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
- EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment
- EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
- Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents
- Encryption-Friendly LLM Architecture
- Endless Jailbreaks with Bijection Learning
- Endowing Visual Reprogramming with Adversarial Robustness
- End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler
- End-to-End Rule Induction from Raw Sequence Inputs
- E(n) Equivariant Topological Neural Networks
- Energy-based Backdoor Defense Against Federated Graph Learning
- Energy-Based Diffusion Language Models for Text Generation
- Energy-Weighted Flow Matching for Offline Reinforcement Learning
- Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions
- Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary
- Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies
- Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
- Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds
- Enhancing Document Understanding with Group Position Embedding: A Novel Approach to Incorporate Layout Information
- Enhancing End-to-End Autonomous Driving with Latent World Model
- Enhancing Federated Domain Adaptation with Multi-Domain Prototype-Based Federated Fine-Tuning
- Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control
- Enhancing Language Model Agents using Diversity of Thoughts
- Enhancing Large Language Models' Situated Faithfulness to External Contexts
- Enhancing Learning with Label Differential Privacy by Vector Approximation
- Enhancing Multilingual Reasoning in LLMs: Insights from Cross-Linguistic Correlations and Optimal Data Proportions
- Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
- Enhancing Robust Fairness via Confusional Spectral Regularization
- Enhancing Software Agents with Monte Carlo Tree Search and Hindsight Feedback
- Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems
- Enhancing Training Robustness through Influence Measure
- Enhancing Uncertainty Estimation and Interpretability with Bayesian Non-negative Decision Layer
- Enhancing Vision-Language Model with Unmasked Token Alignment
- Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures
- Ensembles of Low-Rank Expert Adapters
- Ensembling Diffusion Models via Adaptive Feature Aggregation
- Entropic Distribution Matching for Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity
- Entropy-based Activation Function Optimization: A Method on Searching Better Activation Functions
- Episodic Memories Generation and Evaluation Benchmark for Large Language Models
- Episodic Novelty Through Temporal Distance
- Epistemic Monte Carlo Tree Search
- eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels
- EqNIO: Subequivariant Neural Inertial Odometry
- Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
- Equivariant Masked Position Prediction for Efficient Molecular Representation
- Equivariant Neural Functional Networks for Transformers
- Equivariant Symmetry Breaking Sets
- Erasing Concept Combination from Text-to-Image Diffusion Model
- Error-quantified Conformal Inference for Time Series
- ESE: Espresso Sentence Embeddings
- Estimating the Probabilities of Rare Outputs in Language Models
- Estimation of single-cell and tissue perturbation effect in spatial transcriptomics via Spatial Causal Disentanglement
- ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
- ET-SEED: EFFICIENT TRAJECTORY-LEVEL SE(3) EQUIVARIANT DIFFUSION POLICY
- Euler Characteristic Tools for Topological Data Analysis
- EvA: Erasing Spurious Correlations with Activations
- EVA: Geometric Inverse Design for Fast Protein Motif-Scaffolding with Coupled Flow
- Evaluating and Improving Large Language Models on Graph Computation
- E-Valuating Classifier Two-Sample Tests
- Evaluating Large Language Models through Role-Guide and Self-Reflection: A Comparative Study
- Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
- Event-Driven Online Vertical Federated Learning
- Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?
- Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
- Evidential Learning-based Certainty Estimation for Robust Dense Feature Matching
- Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
- Exact Certification of (Graph) Neural Networks Against Label Poisoning
- Exact Community Recovery under Side Information: Optimality of Spectral Algorithms
- Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks
- Examining Alignment of Large Language Models through Representative Heuristics: the case of political stereotypes
- Execution-guided within-prompt search for programming-by-example
- Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting
- Expected Return Symmetries
- Expected Sliced Transport Plans
- Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
- Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
- Explanations of GNN on Evolving Graphs via Axiomatic Layer edges
- EXPLOITING DISTRIBUTION CONSTRAINTS FOR SCALABLE AND EFFICIENT IMAGE RETRIEVAL
- Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
- Exploiting Hidden Symmetry to Improve Objective Perturbation for DP linear learners with a nonsmooth $\ell_1$-norm
- Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
- Exploratory Preference Optimization: Provably Sample-Efficient Exploration in RLHF with General Function Approximation
- Exploring a Principled Framework for Deep Subspace Clustering
- Exploring channel distinguishability in local neighborhoods of the model space in quantum neural networks
- Exploring Learning Complexity for Efficient Downstream Dataset Pruning
- Exploring Local Memorization in Diffusion Models via Bright Ending Attention
- Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
- Exploring the Camera bias of Person Re-identification
- Exploring the Design Space of Visual Context Representation in Video MLLMs
- Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
- Exploring The Forgetting in Adversarial Training: A Novel Method for Enhancing Robustness
- Exploring the Impact of Activation Functions in Training Neural ODEs
- Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality
- Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning
- Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
- Exposure Bracketing Is All You Need For A High-Quality Image
- Expressivity of Neural Networks with Random Weights and Learned Biases
- Extendable and Iterative Structure Learning for Bayesian Networks
- Extending Mercer's expansion to indefinite and asymmetric kernels
- Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory
- Faceshot: Bring any Character into Life
- Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
- Factor Graph-based Interpretable Neural Networks
- FACTS: A Factored State-Space Framework for World Modelling
- Factual Context Validation and Simplification: A Scalable Method to Enhance GPT Trustworthiness and Efficiency
- Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
- Fair Clustering in the Sliding Window Model
- FairDen: Fair Density-Based Clustering
- Fair Submodular Cover
- FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
- FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models
- Fantastic Copyrighted Beasts and How (Not) to Generate Them
- Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them
- Fast and Accurate Blind Flexible Docking
- Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage
- Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
- Fast Diversity-Preserving Reward Finetuning of Diffusion Models via Nabla-GFlowNets
- Faster Algorithms for Structured Linear and Kernel Support Vector Machines
- FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
- Faster Cascades via Speculative Decoding
- Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
- Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
- Faster, More Efficient RLHF through Off-Policy Asynchronous Learning
- Fast Feedforward 3D Gaussian Splatting Compression
- Fast Summation of Radial Kernels via QMC Slicing
- Fast training and sampling of Restricted Boltzmann Machines
- Fast Training of Sinusoidal Neural Fields via Scaling Initialization
- Fast unsupervised ground metric learning with tree-Wasserstein distance
- Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies
- Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
- Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks
- Feature-Based Online Bilateral Trade
- Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
- Federated $Q$-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost
- Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
- Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond
- Federated Domain Generalization with Data-free On-server Gradient Matching
- Federated Few-Shot Class-Incremental Learning
- Federated Granger Causality Learning For Interdependent Clients With State Space Representation
- Federated Residual Low-Rank Adaption of Large Language Models
- FedLWS: Federated Learning with Adaptive Layer-wise Weight Shrinking
- FedTMOS: Efficient One-Shot Federated Learning with Tsetlin Machine
- Feedback Favors the Generalization of Neural ODEs
- Feedback Schrödinger Bridge Matching
- Fengbo: a Clifford Neural Operator pipeline for 3D PDEs in Computational Fluid Dynamics
- Few-Class Arena: A Benchmark for Efficient Selection of Vision Models and Dataset Difficulty Measurement
- Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
- Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization
- F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
- Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
- FIG: Flow with Interpolant Guidance for Linear Inverse Problems
- Filtered not Mixed: Filtering-Based Online Gating for Mixture of Large Language Models
- Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment
- Find A Winning Sign: Sign Is All We Need to Win the Lottery
- Finding and Only Finding Differential Nash Equilibria by Both Pretending to be a Follower
- Finding Shared Decodable Concepts and their Negations in the Brain
- Fine-Grained Verifiers: Preference Modeling as Next-token in Vision-Language Alignment
- Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic
- Fine-tuning can cripple your foundation model; preserving features may be the solution
- Fine-tuning can Help Detect Pretraining Data from Large Language Models
- Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
- Fine-Tuning Token-Based Large Multimodal Models: What Works, What Doesn’t and What's Next
- Fine-tuning with Reserved Majority for Noise Reduction
- FIRING-Net: A filtered feature recycling network for speech enhancement
- First-Person Fairness in Chatbots
- Fitting Networks with a Cancellation Trick
- Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
- FlashMask: Efficient and Rich Mask Extension of FlashAttention
- FlashRNN: I/O-Aware Optimization of Traditional RNNs on modern hardware
- Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning
- Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
- Flaws of ImageNet, Computer Vision's Favourite Dataset
- FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
- FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
- FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning
- FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks
- FLOPS: Forward Learning with OPtimal Sampling
- Flow-based Variational Mutual Information: Fast and Flexible Approximations
- FlowDec: A flow-based full-band general audio codec with high perceptual quality
- Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors
- Flow matching achieves almost minimax optimal convergence
- Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting
- Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective
- Flow With What You Know
- Following the Human Thread in Social Navigation
- Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
- For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
- ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
- Forewarned is Forearmed: Harnessing LLMs for Data Synthesis via Failure-induced Exploration
- Forget the Data and Fine-Tuning! Just Fold the Network to Compress
- Forgetting Transformer: Softmax Attention with a Forget Gate
- Forking Paths in Neural Text Generation
- FormalAlign: Automated Alignment Evaluation for Autoformalization
- Formation of Representations in Neural Networks
- Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy
- Forte: Finding Outliers with Representation Typicality Estimation
- FOSP: Fine-tuning Offline Safe Policy through World Models
- Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
- Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
- Fourier Sliced-Wasserstein Embedding for Multisets and Measures
- Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models
- Framer: Interactive Frame Interpolation
- Frame-Voyager: Learning to Query Frames for Video Large Language Models
- FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling
- Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation
- FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Fields
- Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs
- FreeVS: Generative View Synthesis on Free Driving Trajectory
- FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
- Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
- FreSh: Frequency Shifting for Accelerated Neural Representation Learning
- From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
- From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
- From Attention to Activation: Unraveling the Enigmas of Large Language Models
- From Commands to Prompts: LLM-based Semantic File System
- From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford Algebra and Convexity
- From Decoupling to Adaptive Transformation: a Wider Optimization Space for PTQ
- From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
- From Few to Many: Enhancing Many-Shot In-Context Learning with Optimized Example Selection and Expansion
- From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks
- From Isolated Conversations to Hierachical Schemas: Dynamic Tree Memory Representation for LLMs
- From Layers to States: A State Space Model Perspective to Deep Neural Network Layer Dynamics
- From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
- From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering
- From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
- From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
- From Promise to Practice: Realizing High-performance Decentralized Training
- From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation
- From Search to Sampling: Generative Models for Robust Algorithmic Recourse
- From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
- From Tokens to Lattices: Emergent Lattice Structures in Language Models
- From Tokens to Words: On the Inner Lexicon of LLMs
- Frontiers in Probabilistic Inference: learning meets Sampling
- Fugatto 1: Foundational Generative Audio Transformer Opus 1
- Fully-inductive Node Classification on Arbitrary Graphs
- Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks
- Fundamental Limitations on Subquadratic Alternatives to Transformers
- Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
- Future-Guided Pretraining via Time-to-Event Supervision for 3D Medical Imaging
- GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation
- GameArena: Evaluating LLM Reasoning through Live Computer Games
- GameGen-$\mathbb{X}$: Interactive Open-world Game Video Generation
- GANDALF: Generative AttentioN based Data Augmentation and predictive modeLing Framework for personalized cancer treatment
- Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
- Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher
- Gated Delta Networks: Improving Mamba2 with Delta Rule
- GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
- Gaussian-Based Instance-Adaptive Intensity Modeling for Point-Supervised Facial Expression Spotting
- GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians
- Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection
- Gaussian Differentially Private Human Faces Under a Face Radial Curve Representation
- Gaussian Ensemble Belief Propagation for Efficient Inference in High-Dimensional, Black-box Systems
- Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
- Gaussian Mixture Counterfactual Generator
- Gaussian Splatting Lucas-Kanade
- GDrag: Towards General-Purpose Interactive Editing with Anti-ambiguity Point Diffusion
- GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment
- GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data
- General Framework for Off-Policy Learning with Partially-Observed Reward
- Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Power
- Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
- Generalizable Human Gaussians from Single-View Image
- Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
- Generalizable Motion Planning via Operator Learning
- Generalization and Distributed Learning of GFlowNets
- Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks
- Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging
- Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs
- Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors
- Generalization in VAE and Diffusion Models: A Unified Information-Theoretic Analysis
- Generalization of Transformers with In-Context Learning: An Empirical Study
- Generalization through variance: how noise shapes inductive biases in diffusion models
- Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
- Generalized Behavior Learning from Diverse Demonstrations
- Generalized Consistency Trajectory Models for Image Manipulation
- Generalized Principal-Agent Problem with a Learning Agent
- Generalized Video Moment Retrieval
- Generalizing Reasoning Problems to Longer Lengths
- Generalizing Weisfeiler-Lehman Kernels to Subgraphs
- General Scene Adaptation for Vision-and-Language Navigation
- Generating CAD Code with Vision-Language Models for 3D Designs
- Generating Freeform Endoskeletal Robots
- Generating Less Certain Adversarial Examples Improves Robust Generalization
- Generating Likely Counterfactuals Using Sum-Product Networks
- Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen
- Generating Physical Dynamics under Priors
- Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
- Generation and Comprehension Hand-in-Hand: Vision-guided Expression Diffusion for Boosting Referring Expression Generation and Comprehension
- Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
- Generative Adversarial Ranking Nets
- Generative Classifiers Avoid Shortcut Solutions
- Generative Flows on Synthetic Pathway for Drug Design
- Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
- Generative Models for Robot Learning
- Generative Monoculture in Large Language Models
- Generative Representational Instruction Tuning
- Generative Verifiers: Reward Modeling as Next-Token Prediction
- Generative World Explorer
- Generator Matching: Generative modeling with arbitrary Markov processes
- GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
- GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs
- GenXD: Generating Any 3D and 4D Scenes
- GeoILP: A Synthetic Dataset to Guide Large-Scale Rule Induction
- GeoLoRA: Geometric integration for parameter efficient fine-tuning
- Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture
- Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
- Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
- Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
- Geometry of Lightning Self-Attention: Identifiability and Dimension
- Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
- GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
- GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
- GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks
- GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation
- GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost
- GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering
- Glad: A Streaming Scene Generator for Autonomous Driving
- Glauber Generative Model: Discrete Diffusion Models via Binary Classification
- G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
- Global Convergence of Policy Gradient in Average Reward MDPs
- Global Identifiability of Overcomplete Dictionary Learning via L1 and Volume Minimization
- Global Well-posedness and Convergence Analysis of Score-based Generative Models via Sharp Lipschitz Estimates
- GLOMA: Global Video Text Spotting with Morphological Association
- GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
- GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning
- GMValuator: Similarity-based Data Valuation for Generative Models
- GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring
- Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models
- GOAL: A Generalist Combinatorial Optimization Agent Learner
- GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
- Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information
- Going Beyond Static: Understanding Shifts with Time-Series Attribution
- GOLD: Graph Out-of-Distribution Detection via Implicit Adversarial Latent Generation
- GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
- GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
- GOPS: Learning Generative Object Priors for Unsupervised 3D Instance Segmentation
- GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks
- GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning
- GPromptShield: Elevating Resilience in Graph Prompt Tuning Against Adversarial Attacks
- GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching
- GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
- Gradient correlation is needed to accelerate SGD with momentum
- Gradient descent with generalized Newton’s method
- Gradient-Free Generation for Hard-Constrained Systems
- GRAIN: Exact Graph Reconstruction from Gradients
- Gramian Multimodal Representation Learning and Alignment
- Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach
- Graph Assisted Offline-Online Deep Reinforcement Learning for Dynamic Workflow Scheduling
- Graph-based Document Structure Analysis
- GraphBridge: Towards Arbitrary Transfer Learning in GNNs
- GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
- Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting
- Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective
- Graph Neural Networks Can (Often) Count Substructures
- Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance
- Graph Neural Networks Gone Hogwild
- Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems
- Graph Neural Ricci Flow: Evolving Feature from a Curvature Perspective
- GraphRouter: A Graph-based Router for LLM Selections
- Graph Sparsification via Mixture of Graphs
- Graph Transformers Dream of Electric Flow
- GRASP: Generating Graphs via Spectral Diffusion
- GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
- GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
- Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention
- Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building
- GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
- gRNAde: Geometric Deep Learning for 3D RNA inverse design
- Grokking at the Edge of Numerical Stability
- GROOT-2: Weakly Supervised Multimodal Instruction Following Agents
- Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
- Grounding Continuous Representations in Geometry: Equivariant Neural Fields
- Grounding Video Models to Actions through Goal Conditioned Exploration
- Group Distributionally Robust Dataset Distillation with Risk Minimization
- Group Downsampling with Equivariant Anti-aliasing
- Group Ligands Docking to Protein Pockets
- Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
- GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack
- GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
- GSE: Group-wise Sparse and Explainable Adversarial Attacks
- GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting
- GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
- GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
- Guaranteed Generation from Large Language Models
- Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
- Gyrogroup Batch Normalization
- h4rm3l: A Language for Composable Jailbreak Attack Synthesis
- HADAMRNN: BINARY AND SPARSE TERNARY ORTHOGONAL RNNS
- HaDeMiF: Hallucination Detection and Mitigation in Large Language Models
- HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
- Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
- Halton Scheduler for Masked Generative Image Transformer
- HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation
- Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons
- HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
- Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
- Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
- Harnessing Webpage UIs for Text-Rich Visual Understanding
- HASARD: A Benchmark for Harnessing Safe Reinforcement Learning with Doom
- Has the Deep Neural Network learned the Stochastic Process? An Evaluation Viewpoint
- Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
- HeadMap: Locating and Enhancing Knowledge Circuits in LLMs
- Heavy-Tailed Diffusion Models
- HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
- HELM: Hierarchical Encoding for mRNA Language Modeling
- HelpSteer2-Preference: Complementing Ratings with Preferences
- Herald: A Natural Language Annotated Lean 4 Dataset
- Hessian Free Efficient Single Loop Iterative Differentiation Methods for Bi-Level Optimization Problems
- Hessian-Free Online Certified Unlearning
- HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment
- HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
- HGM³: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
- Hidden in the Noise: Two-Stage Robust Watermarking for Images
- Hierarchical Autoregressive Transformers for Tokenizer-Free Language Modelling
- Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs
- Hierarchical Uncertainty Estimation for Learning-based Registration in Neuroimaging
- Hierarchical World Models as Visual Whole-Body Humanoid Controllers
- High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
- High-Dimensional Bayesian Optimisation with Gaussian Process Prior Variational Autoencoders
- High-dimension Prototype is a Better Incremental Object Detection Learner
- High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
- Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
- High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
- High-Quality Joint Image and Video Tokenization with Causal VAE
- High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers.
- HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts
- HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models
- HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction
- HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts
- Holistically Evaluating the Environmental Impact of Creating Language Models
- Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
- Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings
- Homomorphism Counts as Structural Encodings for Graph Learning
- Homomorphism Expressivity of Spectral Invariant Graph Neural Networks
- HOPE for a Robust Parameterization of Long-memory State Space Models
- Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension
- How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
- How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
- How Does Critical Batch Size Scale in Pre-training?
- How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
- How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension
- How do students become teachers: A dynamical analysis for two-layer neural networks
- How do we interpret the outputs of a neural network trained on classification?
- How efficient is LLM-generated code? A rigorous & high-standard benchmark
- How Far Are We from True Unlearnability?
- How Feature Learning Can Improve Neural Scaling Laws
- How Learnable Grids Recover Fine Detail in Low Dimesions: A Neural Tangent Kernel Analysis of Multigrid Parameteric Encodings
- How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings
- How many samples are needed to train a deep neural network?
- How many tokens is an image worth?
- How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.
- How Much is Unseen Depends Chiefly on Information About the Seen
- How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning
- How new data pollutes LLM knowledge and how to dilute it
- How to Evaluate Reward Models for RLHF
- How to Find the Exact Pareto Front for Multi-Objective MDPs?
- How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations
- How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
- How to visualize training dynamics in neural networks
- How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
- HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
- HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting
- HShare: Fast LLM Decoding by Hierarchical Key-Value Sharing
- Human-Aligned Chess With a Bit of Search
- Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning
- Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
- Human-like Episodic Memory for Infinite Context LLMs
- Human Simulacra: Benchmarking the Personification of Large Language Models
- Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
- Hybrid Regularization Improves Diffusion-based Inverse Problem Solving
- Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
- Hymba: A Hybrid-head Architecture for Small Language Models
- Hyperbolic Genome Embeddings
- Hyper-Connections
- HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
- HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere
- HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction
- Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models
- I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
- I Can't Believe It's Not Better: Challenges in Applied Deep Learning
- ICLR 2025 Workshop on Bidirectional Human-AI Alignment
- ICLR 2025 Workshop on GenAI Watermarking: Going beyond safety in watermarking research and development
- ICLR 2025 Workshop on Human-AI Coevolution
- ICLR 2025 Workshop on Tackling Climate Change with Machine Learning: Data-Centric Approaches in ML for Climate Action
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models
- Identifiability for Gaussian Processes with Holomorphic Kernels
- Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning
- Identifying and Tuning Safety Neurons in Large Language Models
- Identifying latent state transitions in non-linear dynamical systems
- IDInit: A Universal and Stable Initialization Method for Neural Network Training
- IDIV: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
- IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION
- IgGM: A Generative Model for Functional Antibody and Nanobody Design
- IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
- ILLUSION: Unveiling Truth with a Comprehensive Multi-Modal, Multi-Lingual Deepfake Dataset
- Image and Video Tokenization with Binary Spherical Quantization
- ImageFolder: Autoregressive Image Generation with Folded Tokens
- Image-level memorization detection via inversion-based inference perturbation
- ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
- IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
- ImDy: Human Inverse Dynamics from Imitated Observations
- Immunogenicity Prediction with Dual Attention Enables Vaccine Target Selection
- Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression
- Implicit In-context Learning
- Implicit Neural Surface Deformation with Explicit Velocity Fields
- Implicit Search via Discrete Diffusion: A Study on Chess
- Improved Algorithms for Kernel Matrix-Vector Multiplication
- Improved Approximation Algorithms for $k$-Submodular Maximization via Multilinear Extension
- Improved Convergence Rate for Diffusion Probabilistic Models
- Improved Diffusion-based Generative Model with Better Adversarial Robustness
- Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
- Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
- Improved Sampling Algorithms for Lévy-Itô Diffusion Models
- Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
- Improved Training Technique for Latent Consistency Models
- ImProver: Agent-Based Automated Proof Optimization
- Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning
- Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach
- Improving Convergence Guarantees of Random Subspace Second-order Algorithm for Nonconvex Optimization
- Improving Data Efficiency via Curating LLM-Driven Rating Systems
- Improving Deep Regression with Tightness
- Improving Equivariant Networks with Probabilistic Symmetry Breaking
- Improving Generalization and Robustness in SNNs Through Signed Rate Encoding and Sparse Encoding Attacks
- Improving Graph Neural Networks by Learning Continuous Edge Directions
- Improving Instruction-Following in Language Models through Activation Steering
- Improving Large Language Model based Multi-Agent Framework through Dynamic Workflow Updating
- Improving Large Language Model Planning with Action Sequence Similarity
- Improving Long-Text Alignment for Text-to-Image Diffusion Models
- Improving Multi-modal Representations via Binding Space in Scale
- Improving Neural Network Accuracy by Concurrently Training with a Twin Network
- Improving Neural Optimal Transport via Displacement Interpolation
- Improving Pretraining Data Using Perplexity Correlations
- Improving Probabilistic Diffusion Models With Optimal Covariance Matching
- Improving Reasoning Performance in Large Language Models via Representation Engineering
- Improving Semantic Understanding in Speech Language Models via Brain-tuning
- Improving Sequence Level Distillation through Hidden State Matching
- Improving Text-to-Image Consistency via Automatic Prompt Optimization
- Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency
- Improving Uncertainty Estimation through Semantically Diverse Language Generation
- Improving Unsupervised Constituency Parsing via Maximizing Semantic Information
- ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language
- Imputation for prediction: beware of diminishing returns.
- INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
- In-Context Editing: Learning Knowledge from Self-Induced Distributions
- In-Context Learning of Representations
- In-context Time Series Predictor
- Incorporating Visual Correspondence into Diffusion Model for Visual Try-On
- Incremental Causal Effect for Time to Treatment Initialization
- Indirect Gradient Matching for Adversarial Robust Distillation
- INFER: A Neural-symbolic Model For Extrapolation Reasoning on Temporal Knowledge Graph
- Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
- Inference Optimal VLMs Need Only One Visual Token but Larger Models
- Inference Scaling for Long-Context Retrieval Augmented Generation
- Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
- Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models
- Infinite-Resolution Integral Noise Warping for Diffusion Models
- Influence Functions for Scalable Data Attribution in Diffusion Models
- Influence-Guided Diffusion for Dataset Distillation
- Information Theoretic Text-to-Image Alignment
- Injecting Universal Jailbreak Backdoors into LLMs in Minutes
- Injective flows for star-like manifolds
- Inner Information Analysis Algorithm for Deep Neural Network based on Community
- Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
- Input Space Mode Connectivity in Deep Neural Networks
- In Search of Forgotten Domain Generalization
- In Search of the Engram in LLMs: A Neuroscience Perspective on the Memory Functions in AI Models
- InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
- INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning
- Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct
- Instance-dependent Early Stopping
- Instant Policy: In-Context Imitation Learning via Graph Diffusion
- InstantPortrait: One-Step Portrait Editing via Diffusion Multi-Objective Distillation
- InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting
- InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
- InstaRevive: One-Step Image Enhancement via Dynamic Score Matching
- InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly
- InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems
- Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
- InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
- Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
- Integral Performance Approximation for Continuous-Time Reinforcement Learning Control
- Integrating Generative and Experimental Platforms for Biomolecular Design
- Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows
- Integrative Decoding: Improving Factuality via Implicit Self-consistency
- Intelligence at the Edge of Chaos
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
- Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
- Interaction Asymmetry: A General Principle for Learning Composable Abstractions
- Interactive Adjustment for Human Trajectory Prediction with Individual Feedback
- Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
- Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis
- Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
- InterMask: 3D Human Interaction Generation via Collaborative Masked Modelling
- Intermediate Layer Classifiers for OOD generalization
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
- Interpolating Autoregressive and Discrete Denoising Diffusion Language Models
- Interpretable Causal Representation Learning for Biological Data in the Pathway Space
- Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios
- Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology
- Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
- Interpreting Emergent Planning in Model-Free Reinforcement Learning
- Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition
- Interpreting Language Reward Models via Contrastive Explanations
- Interpreting the Second-Order Effects of Neurons in CLIP
- IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
- Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
- Intricacies of Feature Geometry in Large Language Models
- Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations
- Intrinsic User-Centric Interpretability through Global Mixture of Experts
- Invariance to Planning in Goal-Conditioned RL
- Invariant Graphon Networks: Approximation and Cut Distance
- Inverse Attention Agent in Multi-Agent System
- InverseBench: Benchmarking Plug-and-Play Diffusion Models for Scientific Inverse Problems
- Inverse Constitutional AI: Compressing Preferences into Principles
- Inverse decision-making using neural amortized Bayesian actors
- Inverse Rendering for Shape, Light, and Material Decomposition using Multi-Bounce Path Tracing and Reservoir Sampling
- Inverse Scaling: When Bigger Isn't Better
- InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization
- InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma
- Investigating Pattern Neurons in Urban Time Series Forecasting
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
- In vivo cell-type and brain region classification via multimodal contrastive learning
- IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
- Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
- Is Graph Convolution Always Beneficial For Every Feature?
- Is In-Context Learning Sufficient for Instruction Following in LLMs?
- Is Large-scale Pretraining the Secret to Good Domain Generalization?
- Isometric Regularization for Manifolds of Functional Data
- Is uniform expressivity too restrictive? Towards efficient expressivity of GNNs
- Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
- Is Your Multimodal Language Model Oversensitive to Safe Queries?
- Is Your Video Language Model a Reliable Judge?
- Iterative Dual-RL: An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
- Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
- Iterative Substructure Extraction for Molecular Relational Learning with Interactive Graph Information Bottleneck
- IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
- IterGen: Iterative Structured LLM Generation
- It Helps to Take a Second Opinion: Teaching Smaller LLMs To Deliberate Mutually via Selective Rationale Optimisation
- IV-mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
- Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
- Jailbreaking as a Reward Misspecification Problem
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
- Jamba: Hybrid Transformer-Mamba Language Models
- JetFormer: An autoregressive generative model of raw images and text
- Jogging the Memory of Unlearned LLMs Through Targeted Relearning Attacks
- Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
- Joint Gradient Balancing for Data Ordering in Finite-Sum Multi-Objective Optimization
- Joint Graph Rewiring and Feature Denoising via Spectral Resonance
- Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment
- JPEG Inspired Deep Learning
- JudgeBench: A Benchmark for Evaluating LLM-Based Judges
- Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
- Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
- Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
- KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks
- KAN: Kolmogorov–Arnold Networks
- KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
- KBLaM: Knowledge Base augmented Language Model
- Kernel-based Optimally Weighted Conformal Time-Series Prediction
- Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
- KinFormer: Generalizable Dynamical Symbolic Regression for catalytic organic Reaction Kinetics
- KinPFN: Bayesian Approximation of RNA Folding Kinetics using Prior-Data Fitted Networks
- KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models
- KLay: Accelerating Neurosymbolic AI
- Knowing Your Target : Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
- Knowledge And Capability Transfer Through Large Language Models' Parameters Fusing
- Knowledge Benchmark Graph: Assisting Large Language Models in Designing Models by Retrieving Benchmark Knowledge
- Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
- Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
- Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models
- Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
- Kolmogorov-Arnold Transformer
- KooNPro: A Variance-Aware Koopman Probabilistic Model Enhanced by Neural Processes for Time Series Forecasting
- KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
- Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
- L3Ms — Lagrange Large Language Models
- Label Correlation Biases Direct Time Series Forecast
- Label-Free Coreset Selection with Proxy Training Dynamics
- LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
- Lambda-Skip Connections: the architectural component that prevents Rank Collapse
- LaMPlace: Learning to Optimize Cross-Stage Metrics in Macro Placement
- LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
- LancBiO: Dynamic Lanczos-aided Bilevel Optimization via Krylov Subspace
- Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
- Language Agents Meet Causality -- Bridging LLMs and Causal World Models
- Language-Assisted Feature Transformation for Anomaly Detection
- Language Guided Skill Discovery
- Language-Image Models with 3D Understanding
- Language Imbalance Driven Rewarding for Multilingual Self-improving
- Language Model Non-Myopic Generation for Reasoning and Planning
- Language Models are Advanced Anonymizers
- Language Models Are Implicitly Continuous
- Language Models Can Articulate Their Implicit Goals
- Language Models Learn to Mislead Humans via RLHF
- Language Models Need Inductive Biases to Count Inductively
- Language models scale reliably with over-training and on downstream tasks
- Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
- Language Representations Can be What Recommenders Need: Findings and Potentials
- LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
- Laplace Sample Information: Data Informativeness Through a Bayesian Lens
- Large Convolutional Model Tuning via Filter Subspace
- Large Language Models are Interpretable Learners
- Large Language Models Assume People are More Rational than We Really are
- Large Language Models can be Strong Self-Detoxifiers
- Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
- Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
- Large Language Models Often Say One Thing and Do Another
- Larger Language Models Provably Generalize Better
- Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
- Large Scale Knowledge Washing
- Large (Vision) Language Models are Unsupervised In-Context Learners
- LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
- LASER: A Neuro-Symbolic Framework for Learning Spatio-Temporal Scene Graphs with Weak Supervision
- LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
- Lasso Bandit with Compatibility Condition on Optimal Arm
- Last Iterate Convergence of Incremental Methods as a Model of Forgetting
- Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
- Latent Action Pretraining from Videos
- Latent Bayesian Optimization via Autoregressive Normalizing Flows
- Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional Data Assimilation with Sparse Observation Data
- Latent Radiance Fields with 3D-aware 2D Representations
- Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning
- Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
- Lawma: The Power of Specialization for Legal Tasks
- Law of the Weakest Link: Cross Capabilities of Large Language Models
- LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
- Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
- Layerwise Recurrent Router for Mixture-of-Experts
- Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
- LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
- LeanAgent: Lifelong Learning for Formal Theorem Proving
- LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
- Lean-STaR: Learning to Interleave Thinking and Proving
- LeanVec: Searching vectors faster by making them fit
- Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
- Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments
- Learned Reference-based Diffusion Sampler for multi-modal distributions
- Learn hybrid prototypes for multivariate time series anomaly detection
- Learning 3D Perception from Others' Predictions
- Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory
- Learning and aligning single-neuron invariance manifolds in visual cortex
- Learning a Neural Solver for Parametric PDE to Enhance Physics-Informed Methods
- Learning-Augmented Frequent Directions
- Learning-Augmented Search Data Structures
- Learning Causal Alignment for Reliable Disease Diagnosis
- Learning Chaos In A Linear Way
- Learning Closed-Loop Concept-Guided Policies from Unlabeled Demonstrations
- Learning Clustering-based Prototypes for Compositional Zero-Shot Learning
- Learning Color Equivariant Representations
- Learning Conditionally Independent Marginals Enables Logical Compositions in Conditional Diffusion Models
- Learning Continually by Spectral Regularization
- Learning Diagrams: A Graphical Language for Compositional Training Regimes
- Learning Distributions of Complex Fluid Simulations with Diffusion Graph Networks
- Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
- Learning Dynamics of Deep Matrix Factorization Beyond the Edge of Stability
- Learning Dynamics of LLM Finetuning
- Learning Efficient Positional Encodings with Graph Neural Networks
- Learning Equivariant Non-Local Electron Density Functionals
- Learning Evolving Tools for Large Language Models
- Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
- Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
- Learning from Imperfect Human Feedback: A Tale from Corruption-Robust Dueling
- Learning from weak labelers as constraints
- Learning Gain Map for Inverse Tone Mapping
- Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation
- Learning General-purpose Biomedical Volume Representations using Randomized Synthesis
- Learning Geometric Reasoning Networks For Robot Task And Motion Planning
- Learning Graph Invariance by Harnessing Spuriosity
- Learning Graph Quantized Tokenizers for Transformers
- Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
- Learning Harmonized Representations for Speculative Sampling
- Learning Hierarchical Polynomials of Multiple Nonlinear Features
- Learning High-Degree Parities: The Crucial Role of the Initialization
- Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
- Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
- Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
- Learning LLM-as-a-Judge for Preference Alignment
- Learning local equivariant representations for quantum operators
- Learning Long Range Dependencies on Graphs via Random Walks
- Learning Mask Invariant Mutual Information for Masked Image Modeling
- Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025
- Learning mirror maps in policy mirror descent
- Learning Molecular Representation in a Cell
- Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
- Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives
- Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
- Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning
- Learning Partial Graph Matching via Optimal Partial Transport
- Learning Randomized Algorithms with Transformers
- Learning Regularized Graphon Mean-Field Games with Unknown Graphons
- Learning Representations of Intermittent Temporal Latent Process
- Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning
- Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
- Learning Spatial-Semantic Features for Robust Video Object Segmentation
- Learning Spatiotemporal Dynamical Systems from Point Process Observations
- Learning Splitting Heuristics in Divide-and-Conquer SAT Solvers with Reinforcement Learning
- Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
- Learning Structured Representations by Embedding Class Hierarchy with Fast Optimal Transport
- Learning Successor Features with Distributed Hebbian Temporal Memory
- Learning system dynamics without forgetting
- Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning
- Learning the Complexity of Weakly Noisy Quantum States
- Learning the Optimal Stopping for Early Classification within Finite Horizons via Sequential Probability Ratio Test
- Learning to Achieve Goals with Belief State Transformers
- Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
- Learning to Communicate Through Implicit Communication Channels
- Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents
- Learning to Discover Regulatory Elements for Gene Expression Prediction
- Learning to Discretize Denoising Diffusion ODEs
- Learning to engineer protein flexibility
- Learning to Explore and Exploit with GNNs for Unsupervised Combinatorial Optimization
- Learning to Help in Multi-Class Settings
- Learning to Permute with Discrete Diffusion
- Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
- Learning to Search from Demonstration Sequences
- Learning to Select Nodes in Branch and Bound with Sufficient Tree Representation
- Learning to Solve Differential Equation Constrained Optimization Problems
- Learning to Steer Markovian Agents under Model Uncertainty
- Learning Transformer-based World Models with Contrastive Predictive Coding
- Learning under Temporal Label Noise
- Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
- Learning vector fields of differential equations on manifolds with geometrically constrained operator-valued kernels
- Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer
- Learning View-invariant World Models for Visual Robotic Manipulation
- Learn Your Reference Model for Real Good Alignment
- Leave-One-Out Stable Conformal Prediction
- LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models
- LegoScale: One-stop PyTorch native solution for production ready LLM pre-training
- Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
- Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions
- Let the Code LLM Edit Itself When You Edit the Code
- Leveraging Discrete Structural Information for Molecule-Text Modeling
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction
- Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD
- Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
- Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
- Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
- LICO: Large Language Models for In-Context Molecular Optimization
- LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
- Lie Algebra Canonicalization: Equivariant Neural Operators under arbitrary Lie Groups
- LiFT: Learning to Fine-Tune via Bayesian Parameter Efficient Meta Fine-Tuning
- Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space
- Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models
- Lightweight Neural App Control
- Lightweight Predictive 3D Gaussian Splats
- Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
- Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data
- Linear Bandits with Memory
- Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
- Linear combinations of Gaussian latents in generative models: interpolation and beyond
- Linear Mode Connectivity in Differentiable Tree Ensembles
- Linear Multistep Solver Distillation for Fast Sampling of Diffusion Models
- Linear Partial Gromov-Wasserstein Embedding
- Linear Recursions for Everyone
- Linear Representations of Political Perspective Emerge in Large Language Models
- Linear SCM Identification in the Presence of Confounders and Gaussian Noise
- Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data
- Linear Transformer Topological Masking with Graph Random Features
- Lines of Thought in Large Language Models
- LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
- Lipschitz Bandits in Optimal Space
- LiveBench: A Challenging, Contamination-Free LLM Benchmark
- LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
- LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content
- LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
- LLaMA-Omni: Seamless Speech Interaction with Large Language Models
- LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
- LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
- LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
- LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
- LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
- LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases
- LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
- LLMs Can Plan Only If We Tell Them
- LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
- LLMs' Potential Influences on Our Democracy: Challenges and Opportunities
- LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
- LLM Table Reading: Bridging the Semantic Gap Between Text and Table
- LLM Unlearning via Loss Adjustment with Only Forget Data
- LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
- Local convergence of simultaneous min-max algorithms to differential equilibrium on Riemannian manifold
- Locality Alignment Improves Vision-Language Models
- Locality-aware Gaussian Compression for Fast and High-quality Rendering
- Locality Sensitive Avatars From Video
- Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation
- Locally Connected Echo State Networks for Time Series Forecasting
- LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
- Local Patterns Generalize Better for Novel Anomalies
- Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
- Local Steps Speed Up Local GD for Heterogeneous Distributed Logistic Regression
- LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression
- LocoVR: Multiuser Indoor Locomotion Dataset in Virtual Reality
- Logical Consistency of Large Language Models in Fact-Checking
- Logically Consistent Language Models via Neuro-Symbolic Integration
- Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
- Logic-Logit: A Logic-Based Approach to Choice Modeling
- LOIRE: LifelOng learning on Incremental data via pre-trained language model gRowth Efficiently
- LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
- LoLCATs: On Low-Rank Linearizing of Large Language Models
- Long Context Compression with Activation Beacon
- Long-Context Linear System Identification
- Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
- LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
- Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
- Longhorn: State Space Models are Amortized Online Learners
- LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
- LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
- LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
- Long-Sequence Recommendation Models Need Decoupled Embeddings
- Long-Short Decision Transformer: Bridging Global and Local Dependencies for Generalized Decision-Making
- Long-tailed Adversarial Training with Self-Distillation
- Long-time asymptotics of noisy SVGD outside the population limit
- LongVILA: Scaling Long-Context Visual Language Models for Long Videos
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
- Look Before You Leap: Universal Emergent Mechanism for Retrieval in Language Models
- Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
- Looking into User’s Long-term Interests through the Lens of Conservative Evidential Learning
- Looking Inward: Language Models Can Learn About Themselves by Introspection
- Looped Transformers for Length Generalization
- LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models
- LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
- LoRA Learns Less and Forgets Less
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
- LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation
- LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
- Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escaping, and Network Embedding
- Lossy Compression with Pretrained Diffusion Models
- Lost in Prediction: Why Social Media Narratives Don't Help Macroeconomic Forecasting?
- Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
- LR0.FM: LOW-RESOLUTION ZERO-SHOT CLASSIFICATION BENCHMARK FOR FOUNDATION MODELS
- LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision
- Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
- LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
- L-WISE: Boosting human image category learning through model-based image selection and enhancement
- M^3PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
- MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder
- Machine Learning for Genomics Explorations (MLGenX)
- Machine Learning Multiscale Processes
- Machine Unlearning Fails to Remove Data Poisoning Attacks
- Machine Unlearning via Simulated Oracle Matching
- MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
- MADGEN - Mass-Spec attends to De Novo Molecular generation
- MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
- MAESTRO: Masked Encoding Set Transformer with Self-Distillation
- MaestroMotif: Skill Design from Artificial Intelligence Feedback
- MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation
- MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
- MagicPIG: LSH Sampling for Efficient LLM Generation
- Magnetic Mirror Descent Self-play Preference Optimization
- MAGNet: Motif-Agnostic Generation of Molecules from Scaffolds
- Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
- MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval
- Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
- Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
- Making Text Embedders Few-Shot Learners
- Making Transformer Decoders Better Differentiable Indexers
- MallowsPO: Fine-Tune Your LLM with Preference Dispersions
- MambaExtend: A Training-Free Approach to Improve Long Context Extension of Mamba
- MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
- MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
- MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations
- MamKO: Mamba-based Koopman operator for modeling and predictive control
- Managing Diffuse Risks in the Safe Deployment of Untrusted Large Language Models
- Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling
- Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images
- Manifold Learning by Mixture Models of VAEs for Inverse Problems
- Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion
- ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
- MANTRA: The Manifold Triangulations Assemblage
- Many-Objective Multi-Solution Transport
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
- MAP: Multi-Human-Value Alignment Palette
- MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science
- MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
- MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
- MaskBit: Embedding-free Image Generation via Bit Tokens
- Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
- Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling
- Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos
- MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
- Mask in the Mirror: Implicit Sparsification
- Mastering Task Arithmetic: $\tau$Jp as a Key Indicator for Weight Disentanglement
- MAST: model-agnostic sparsified training
- Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation
- Matérn Kernels for Tunable Implicit Surface Reconstruction
- MatExpert: Decomposing Materials Discovery By Mimicking Human Experts
- MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
- MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
- Matrix Product Sketching via Coordinated Sampling
- MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
- Matryoshka Multimodal Models
- MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
- MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks
- Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
- McEval: Massively Multilingual Code Evaluation
- MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More
- MCNC: Manifold-Constrained Reparameterization for Neural Compression
- MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
- Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
- Measuring And Improving Persuasiveness Of Generative Models
- Measuring memorization in RLHF for code completion
- Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
- Mechanism and emergence of stacked attention heads in multi-layer transformers
- Mechanistic Interpretability Meets Vision Language Models: Insights and Limitations
- Mechanistic Permutability: Match Features Across Layers
- MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models
- Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets
- MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
- MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
- MELODI: Exploring Memory Compression for Long Contexts
- Memory-efficient Training of Large Language Models with Larger Mini-batches
- Memory Efficient Transformer Adapter for Dense Predictions
- Memory Mosaics
- Mentored Learning: Improving Generalization and Convergence of Student Learner
- Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
- MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
- MeshMask: Physics-Based Simulations with Masked Graph Neural Networks
- metabench - A Sparse Benchmark of Reasoning and Knowledge in Large Language Models
- Meta-Continual Learning of Neural Fields
- MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis
- Meta-Dynamical State Space Models for Integrative Neural Data Analysis
- Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
- Metalic: Meta-Learning In-Context with Protein Language Models
- MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences
- Metamizer: A Versatile Neural Optimizer for Fast and Accurate Physics Simulations
- MetaOOD: Automatic Selection of OOD Detection Models
- MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
- MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
- Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization
- MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction
- Metric-Driven Attributions for Vision Transformers
- MGCFNN: A Neural MultiGrid Solver with Novel Fourier Neural Network for High Wave Number Helmholtz Equations
- MGDA Converges under Generalized Smoothness, Provably
- MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction
- MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
- MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
- Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks
- MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Masked Image Modeling Representations
- Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data
- MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
- MIND over Body: Adaptive Thinking using Dynamic Computation
- MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
- MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
- Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
- Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning
- miniCTX: Neural Theorem Proving with (Long-)Contexts
- Minimal Impact ControlNet: Advancing Multi-ControlNet Integration
- Minimalistic Predictions for Online Class Constraint Scheduling
- Minimax Optimal Reinforcement Learning with Quasi-Optimism
- Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid
- Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models
- MiniPLM: Knowledge Distillation for Pre-training Language Models
- Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift
- Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models
- MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Reconstruction
- MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
- (Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
- Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
- Mitigate the Gap: Improving Cross-Modal Alignment in CLIP
- Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
- Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization
- Mitigating Memorization in Language Models
- Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
- Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
- Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
- Mitigating Robust Overfitting in Wasserstein Distributionally Robust Optimization
- Mitigating Spurious Correlations in Zero-Shot Multimodal Models
- Mitigating Spurious Correlations via Group-robust Sample Reweighting
- Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
- Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
- MixEval-X: Any-to-any Evaluations from Real-world Data Mixture
- Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
- MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures
- Mixture-of-Agents Enhances Large Language Model Capabilities
- Mixture of Attentions For Speculative Decoding
- Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
- Mixture of In-Context Prompters for Tabular PFNs
- Mixture of Parrots: Experts improve memorization more than reasoning
- MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering
- MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
- MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
- MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
- MLPs Learn In-Context on Regression and Classification Tasks
- MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
- MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
- MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
- MMD-Regularized Unbalanced Optimal Transport
- MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
- MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
- MMEgo: Towards Building Egocentric Multimodal LLMs
- MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
- MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
- MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
- MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
- MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
- MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
- MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
- MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
- MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
- MMTEB: Massive Multilingual Text Embedding Benchmark
- MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments
- Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
- MoDeGPT: Modular Decomposition for Large Language Model Compression
- Model aggregation: minimizing empirical variance outperforms minimizing empirical error
- Model-Agnostic Knowledge Guided Correction for Improved Neural Surrogate Rollout
- Model-agnostic meta-learners for estimating heterogeneous treatment effects over time
- Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
- Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
- Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
- Model Equality Testing: Which Model is this API Serving?
- Model-Free Offline Reinforcement Learning with Enhanced Robustness
- Modeling Complex System Dynamics with Flow Matching Across Time and Conditions
- Modeling dynamic social vision highlights gaps between deep learning and humans
- Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
- Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
- Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
- Model merging with SVD to tie the Knots
- Model Risk-sensitive Offline Reinforcement Learning
- MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos
- Modular, Collaborative and Decentralized Deep Learning
- MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
- MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
- MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling
- MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
- Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation
- Monet: Mixture of Monosemantic Experts for Transformers
- Monitoring Latent World States in Language Models with Propositional Probes
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
- Monte Carlo Planning with Large Language Model for Text-Based Games
- Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
- Moral Alignment for LLM Agents
- More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing
- More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
- MORPHING TOKENS DRAW STRONG MASKED IMAGE MODELS
- MorphoDiff: Cellular Morphology Painting with Diffusion Models
- MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
- MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
- MotherNet: Fast Training and Inference via Hyper-Network Transformers
- Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
- MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
- MotionClone: Training-Free Motion Cloning for Controllable Video Generation
- Motion Control of High-Dimensional Musculoskeletal System with Hierarchical Model-Based Planning
- MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
- MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
- mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
- MP-Mat: A 3D-and-Instance-Aware Matting Framework with Multiplane Representation
- MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations
- MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
- MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation
- MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers
- Mr.Steve: Instruction-Following Agents in Minecraft with What-Where-When Memory
- MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
- MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
- MTSAM: Multi-Task Fine-Tuning for Segment Anything Model
- MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
- Mufu: Multilingual Fused Learning for Low-Resource Translation with LLM
- MuHBoost: A Multi-Label Boosting Method For Practical Longitudinal Human Behavior Modeling
- MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
- Multi-Accurate CATE is Robust to Unknown Covariate Shifts
- Multi-agent cooperation through learning-aware policy gradients
- Multiagent Finetuning of Language Models
- Multi-Dimensional Conformal Prediction
- Multi-domain Distribution Learning for De Novo Drug Design
- Multi-Draft Speculative Sampling: Canonical Architectures and Theoretical Limits
- Multi-Field Adaptive Retrieval
- Multi-Label Node Classification with Label Influence Propagation
- Multi-Label Test-Time Adaptation with Bound Entropy Minimization
- Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning
- Multilevel Generative Samplers for Investigating Critical Phenomena
- Multi-LLM-Agents Debate - Performance, Efficiency, and Scaling Challenges
- Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
- Multi-modal brain encoding models for multi-modal stimuli
- Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation
- Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
- Multi-modal Learning: A Look Back and the Road Ahead
- Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine
- Multimodal Quantitative Language for Generative Recommendation
- Multimodal Situational Safety
- Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap
- Multi-objective antibody design with constrained preference optimization
- Multi-objective Differentiable Neural Architecture Search
- Multi-Perspective Data Augmentation for Few-shot Object Detection
- Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
- Multiplicative Logit Adjustment Approximates Neural-Collapse-Aware Decision Boundary Adjustment
- Multi-Resolution Decomposable Diffusion Model for Non-Stationary Time Series Anomaly Detection
- Multi-Reward as Condition for Instruction-based Image Editing
- Multi-Robot Motion Planning with Diffusion Models
- Multi-Scale Fusion for Object Representation
- Multi-session, multi-task neural decoding from distinct cell-types and brain regions
- Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
- Multi-Task Dense Predictions via Unleashing the Power of Diffusion
- Multiview Equivariance Improves 3D Understanding with Minimal Feature Finetuning
- MuPT: A Generative Symbolic Music Pretrained Transformer
- MUSE: Machine Unlearning Six-Way Evaluation for Language Models
- Mutual Effort for Efficiency: A Similarity-based Token Pruning for Vision Transformers in Self-Supervised Learning
- Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver
- MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
- NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
- Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability
- Natural Language Inference Improves Compositionality in Vision-Language Models
- NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
- Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence
- Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving
- ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
- NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
- Near-Exact Privacy Amplification for Matrix Mechanisms
- Near, far: Patch-ordering enhances vision foundation models' scene understanding
- Near-optimal Active Regression of Single-Index Models
- Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency
- Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
- Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs
- Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?
- NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
- Nesterov acceleration in benignly non-convex landscapes
- NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains
- NetFormer: An interpretable model for recovering dynamical connectivity in neuronal population dynamics
- NetMoE: Accelerating MoE Training through Dynamic Sample Placement
- Neural Approximate Mirror Maps for Constrained Diffusion Models
- Neural Causal Graph for Interpretable and Intervenable Classification
- Neural Context Flows for Meta-Learning of Dynamical Systems
- Neural Dueling Bandits: Principled Preference-Based Optimization with Non-Linear Reward Function
- Neural Exploratory Landscape Analysis
- Neural Fluid Simulation on Geometric Surfaces
- Neural Functions for Learning Periodic Signal
- Neural Interactive Proofs
- Neuralized Markov Random Field for Interaction-Aware Stochastic Human Trajectory Prediction
- Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion
- Neural networks on Symmetric Spaces of Noncompact Type
- Neural Network Weights as a New Data Modality
- Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
- Neural Phylogeny: Fine-Tuning Relationship Detection among Neural Networks
- NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields
- Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
- Neural Spacetimes for DAG Representation Learning
- Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL
- Neural Wave Equation for Irregularly Sampled Sequence Data
- NeurFlow: Interpreting Neural Networks through Critical Neuron Groups and Functional Interactions
- NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
- Neuron-based Multifractal Analysis of Neuron Interaction Dynamics in Large Models
- Neuron based Personality Trait Induction in Large Language Models
- Neuron Platonic Intrinsic Representation From Dynamics Using Contrastive Learning
- Neuroplastic Expansion in Deep Reinforcement Learning
- New Algorithms for the Learning-Augmented k-means Problem
- New Frontiers in Associative Memories
- Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing
- NextBestPath: Efficient 3D Mapping of Unseen Environments
- NEXT-MOL: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
- N-ForGOT: Towards Not-forgetting and Generalization of Open Temporal Graph Learning
- nGPT: Normalized Transformer with Representation Learning on the Hypersphere
- NL-Eye: Abductive NLI For Images
- NNsight and NDIF: Democratizing Access to Foundation Model Internals
- Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning
- Node Similarities under Random Projections: Limits and Pathological Cases
- Node-Time Conditional Prompt Learning in Dynamic Graphs
- No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs
- No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models
- Noise-conditioned Energy-based Annealed Rewards (NEAR): A Generative Framework for Imitation Learning from Observation
- Noise Separation guided Candidate Label Reconstruction for Noisy Partial Label Learning
- Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach
- Noisy Test-Time Adaptation in Vision-Language Models
- No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data
- Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
- Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
- Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
- No Need to Talk: Asynchronous Mixture of Language Models
- Non-Equilibrium Dynamics of Hybrid Continuous-Discrete Ground-State Sampling
- Nonlinear multiregion neural dynamics with parametric impulse response communication channels
- Nonlinear Sequence Embedding by Monotone Variational Inequality
- Non-Stationary Dueling Bandits Under a Weighted Borda Criterion
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
- No Preference Left Behind: Group Distributional Preference Optimization
- Normed Spaces for Graph Embedding
- Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
- Not All Language Model Features Are Linear
- Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
- Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
- Not-So-Optimal Transport Flows for 3D Point Cloud Generation
- Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
- NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
- NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
- NRGBoost: Energy-Based Generative Boosted Trees
- NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
- Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
- Number Cookbook: Number Understanding of Language Models and How to Improve It
- NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions
- NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
- OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
- OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
- ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
- OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
- Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy
- OccProphet: Pushing the Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with an Observer-Forecaster-Refiner Framework
- OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
- OCEBO: Object-Centric Pretraining by Target Encoder Bootstrapping
- ODE-based Smoothing Neural Network for Reinforcement Learning Tasks
- O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
- Offline Hierarchical Reinforcement Learning via Inverse Optimization
- Offline Model-Based Optimization by Learning to Rank
- Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics
- Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
- OGBench: Benchmarking Offline Goal-Conditioned RL
- OLMoE: Open Mixture-of-Experts Language Models
- OMG: Opacity Matters in Material Modeling with Gaussian Splatting
- OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code
- OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs
- Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
- OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation
- OmniRe: Omni Urban Scene Reconstruction
- OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
- OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities
- On Bits and Bandits: Quantifying the Regret-Information Trade-off
- On Calibration of LLM-based Guard Models for Reliable Content Moderation
- Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning
- On Disentangled Training for Nonlinear Transform in Learned Image Compression
- One for all and all for one: Efficient computation of partial Wasserstein distances on the line
- One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning
- One Hundred Neural Networks and Brains Watching Videos: Lessons from Alignment
- One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
- One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
- One slice is not enough: In search of stable conclusions in text-to-image evaluation
- One Step Diffusion via Shortcut Models
- On Evaluating the Durability of Safeguards for Open-Weight LLMs
- On Generalization Within Multi-Objective Reinforcement Learning Algorithms
- On Large Language Model Continual Unlearning
- On Linear Representations and Pretraining Data Frequency in Language Models
- Online Clustering with Nearly Optimal Consistency
- ONLINE EPSILON NET & PIERCING SET FOR GEOMETRIC CONCEPTS
- Online Neuro-Symbolic Predicate Invention for High-Level Planning
- Online Preference Alignment for Language Models via Count-based Exploration
- Online Reinforcement Learning in Non-Stationary Context-Driven Environments
- Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization
- Online-to-Offline RL for Agent Alignment
- On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL
- On Minimizing Adversarial Counterfactual Error in Adversarial Reinforcement Learning
- On Quantizing Neural Representation for Variable-Rate Video Coding
- On Rollouts in Model-Based Reinforcement Learning
- On Scaling Up 3D Gaussian Splatting Training
- On Speeding Up Language Model Evaluation
- On Statistical Rates of Conditional Diffusion Transformer: Approximation and Estimation
- On Stochastic Contextual Bandits with Knapsacks in Small Budget Regime
- On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
- On the Adversarial Vulnerability of Label-Free Test-Time Adaptation
- On the Almost Sure Convergence of the Stochastic Three Points Algorithm
- On the Benefits of Attribute-Driven Graph Domain Adaptation
- On the Benefits of Memory for Modeling Time-Dependent PDEs
- On the Byzantine-Resilience of Distillation-Based Federated Learning
- On the Completeness of Invariant Geometric Deep Learning Models
- On the Computation of the Fisher Information in Continual Learning
- On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
- On the Convergence of No-Regret Dynamics in Information Retrieval Games with Proportional Ranking Functions
- On the Crucial Role of Initialization for Matrix Factorization
- On the Effectiveness of Dataset Alignment for Fake Image Detection
- On the expressiveness and spectral bias of KANs
- On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth
- On the Expressive Power of Sparse Geometric MPNNs
- On the feature learning in diffusion models
- On-the-fly Preference Alignment via Principle-Guided Decoding
- On the Fourier analysis in the SO(3) space: the EquiLoPO Network
- On the Hölder Stability of Multiset and Graph Neural Networks
- On the Identification of Temporal Causal Representation with Instantaneous Dependence
- On the Importance of Language-driven Representation Learning for Heterogeneous Federated Learning
- On the Inherent Privacy Properties of Discrete Denoising Diffusion Models
- On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
- On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making
- On the Optimization and Generalization of Multi-head Attention
- On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
- On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models
- On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
- On the Price of Differential Privacy for Hierarchical Clustering
- On the Relation between Trainability and Dequantization of Variational Quantum Learning Models
- On the Role of Attention Heads in Large Language Model Safety
- On the self-verification limitations of large language models on reasoning and planning tasks
- On the Transfer of Object-Centric Representation Learning
- Open-CK: A Large Multi-Physics Fields Coupling benchmarks in Combustion Kinetics
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents
- OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
- OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees
- Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
- Open Science for Foundation Models
- Open-Set Graph Anomaly Detection via Normal Structure Regularisation
- OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
- Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation
- Open-World Reinforcement Learning over Long Short-Term Imagination
- Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
- OPTAMI: Global Superlinear Convergence of High-order Methods
- OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
- Optimal Brain Apoptosis
- Optimal Flow Transport and its Entropic Regularization: a GPU-friendly Matrix Iterative Algorithm for Flow Balance Satisfaction
- Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
- Optimality of Matrix Mechanism on $\ell_p^p$-metric
- Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
- Optimal Memorization Capacity of Transformers
- Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
- Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
- Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization
- Optimal Transport for Time Series Imputation
- Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
- Optimization-Biased Hypernetworks for Generalizable Policy Generation
- Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling
- Optimization with Access to Auxiliary Information
- Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference
- Optimizing $(L_0, L_1)$-Smooth Functions by Gradient Methods
- Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
- Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
- Optimizing importance weighting in the presence of sub-population shifts
- Optimizing Neural Network Representations of Boolean Networks
- Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
- OptionZero: Planning with Learned Options
- Oracle efficient truncated statistics
- Orchestrating Heterogeneous Architectures for Fast Inference of Mixture-of-Experts Models
- Order-aware Interactive Segmentation
- ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
- Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
- OS-ATLAS: Foundation Action Model for Generalist GUI Agents
- OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning
- Oscillatory State-Space Models
- OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents
- OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
- Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection
- Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization
- Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
- Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions
- Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control
- OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
- OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
- PABBO: Preferential Amortized Black-Box Optimization
- PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
- Pacmann: Efficient Private Approximate Nearest Neighbor Search
- PAD: Personalized Alignment at Decoding-time
- PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
- Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
- Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy
- PaLD: Detection of Text Partially Written by Large Language Models
- PALMBENCH: A COMPREHENSIVE BENCHMARK OF COMPRESSED LARGE LANGUAGE MODELS ON MOBILE PLATFORMS
- PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
- Palu: KV-Cache Compression with Low-Rank Projection
- Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
- PaPaGei: Open Foundation Models for Optical Physiological Signals
- Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization
- Parameter-Efficient and Stable Singular Value Adaptation for Pre-Trained Models
- Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo
- PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction
- ParaSolver: A Hierarchical Parallel Integral Solver for Diffusion Models
- ParetoFlow: Guided Flows in Multi-Objective Optimization
- Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
- Pareto Prompt Optimization
- ParFam -- (Neural Guided) Symbolic Regression via Continuous Global Optimization
- Partial Gromov-Wasserstein Metric
- Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior
- PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
- Patch-Level Training for Large Language Models
- PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
- PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans
- PEARL: Parallel Speculative Decoding with Adaptive Draft Length
- PEARL: Towards Permutation-Resilient LLMs
- PEAR: Primitive Enabled Adaptive Relabeling for Boosting Hierarchical Reinforcement Learning
- Pedestrian Motion Reconstruction: A Large-scale Benchmark via Mixed Reality Rendering with Multiple Perspectives and Modalities
- Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators
- Periodic Materials Generation using Text-Guided Joint Diffusion Model
- PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
- Perm: A Parametric Representation for Multi-Style 3D Hair Modeling
- Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
- Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
- Persistent Pre-training Poisoning of LLMs
- Personality Alignment of Large Language Models
- Perturbation-Restrained Sequential Model Editing
- PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
- PETRA: Parallel End-to-end Training with Reversible Architectures
- PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future
- PFGuard: A Generative Framework with Privacy and Fairness Safeguards
- PharmacoMatch: Efficient 3D Pharmacophore Screening via Neural Subgraph Matching
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
- PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis
- PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
- PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders
- PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems
- Physics-aligned field reconstruction with diffusion bridge
- Physics-Informed Deep Inverse Operator Networks for Solving PDE Inverse Problems
- Physics-Informed Diffusion Models
- Physics-Informed Neural Predictor
- Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
- Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
- Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
- Physics of Language Models: Part 3.2, Knowledge Manipulation
- Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
- Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs
- PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark
- PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
- PICASO: Permutation-Invariant Composition of Document States
- PiCO: Peer Review in LLMs based on Consistency Optimization
- PIED: Physics-Informed Experimental Design for Inverse Problems
- PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
- PIN: Prolate Spheroidal Wave Function-based Implicit Neural Representations
- PIORF: Physics-Informed Ollivier-Ricci Flow for Long–Range Interactions in Mesh Graph Neural Networks
- Pitfalls of Evidence-Based AI Policy
- PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
- Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
- Planning in Natural Language Improves LLM Search for Code Generation
- Plastic Learning with Deep Fourier Features
- PLENCH: Realistic Evaluation of Deep Partial-Label Learning Algorithms
- Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory
- pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
- PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations
- PnP-Flow: Plug-and-Play Image Restoration with Flow Matching
- POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation
- Point-based Instance Completion with Scene Constraints
- Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
- PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
- Point-SAM: Promptable 3D Segmentation Model for Point Clouds
- Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
- Poisson-Dirac Neural Networks for Modeling Coupled Dynamical Systems across Domains
- PolaFormer: Polarity-aware Linear Attention for Vision Transformers
- Policy-aware Reward Modeling with Uncertainty-Gradient based Data Augmentation
- Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
- Policy Design in Long-run Welfare Dynamics
- Policy Gradient with Kernel Quadrature
- Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy
- PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
- PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization
- Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
- PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
- Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
- PooDLe🐩: Pooled and dense self-supervised learning from naturalistic videos
- Population Transformer: Learning Population-level Representations of Neural Activity
- Port-Hamiltonian Architectural Bias for Long-Range Propagation in Deep Graph Networks
- PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
- Positional Embeddings in Transformer Models: Evolution from Text to Vision Domains
- Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation
- PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
- PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing
- Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
- Post-hoc Reward Calibration: A Case Study on Length Bias
- POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition
- PPT: Patch Order Do Matters In Time Series Pretext Task
- PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
- Preble: Efficient Distributed Prompt Scheduling for LLM Serving
- Precedence-Constrained Winter Value for Effective Graph Data Valuation
- Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs
- Precise Parameter Localization for Textual Generation in Diffusion Models
- Predicate Hierarchies Improve Few-Shot State Classification
- Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
- Prediction Risk and Estimation Risk of the Ridgeless Least Squares Estimator under General Assumptions on Regression Errors
- Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
- Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function
- Preference Diffusion for Recommendation
- Preference Elicitation for Offline Reinforcement Learning
- Preference Optimization as Probabilistic Inference
- Preference Optimization for Reasoning with Pseudo Feedback
- Preserving Deep Representations in One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
- Presto! Distilling Steps and Layers for Accelerating Music Generation
- Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild
- Pre-training of Foundation Adapters for LLM Fine-tuning
- Prevalence of Negative Transfer in Continual Reinforcement Learning: Analyses and a Simple Baseline
- Prioritized Generative Replay
- PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models
- Privacy Auditing of Large Language Models
- Privacy-Aware Lifelong Learning
- Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
- Privately Counting Partially Ordered Data
- Private Mechanism Design via Quantile Estimation
- Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
- Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
- ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs
- Probabilistic Conformal Prediction with Approximate Conditional Validity
- Probabilistic Geometric Principal Component Analysis with application to neural data
- Probabilistic Language-Image Pre-Training
- Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
- Probabilistic Neural Pruning via Sparsity Evolutionary Fokker-Planck-Kolmogorov Equation
- Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
- Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
- Probing the Latent Hierarchical Structure of Data via Diffusion Models
- Problem-Parameter-Free Federated Learning
- Procedural Synthesis of Synthesizable Molecules
- Process Reward Model with Q-value Rankings
- ProFI-Painter: Text-Guided Prompt-Faithful Image Inpainting with Diffusion Models
- Programming Refusal with Conditional Activation Steering
- Progressive Compression with Universally Quantized Diffusion Models
- Progressive distillation induces an implicit curriculum
- Progressively Refined Differentiable Physics
- Progressive Mixed-Precision Decoding for Efficient LLM Inference
- Progressive Parameter Efficient Transfer Learning for Semantic Segmentation
- Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
- Progress or Regress? Self-Improvement Reversal in Post-training
- Projection Head is Secretly an Information Bottleneck
- Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
- Prompting Fairness: Integrating Causality to Debias Large Language Models
- Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
- Proposal: ICLR 2025 Workshop on Building Trust in Language Models and Applications
- ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
- Protecting against simultaneous data poisoning attacks
- Proteina: Scaling Flow-based Protein Structure Generative Models
- ProteinBench: A Holistic Evaluation of Protein Foundation Models
- Protein Language Model Fitness is a Matter of Preference
- ProtoSnap: Prototype Alignment For Cuneiform Signs
- Prototype antithesis for biological few-shot class-incremental learning
- ProtPainter: Draw or Drag Protein via Topology-guided Diffusion
- Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling
- Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
- Provable Convergence Bounds for Hybrid Dynamical Sampling and Optimization
- Provable Uncertainty Decomposition via Higher-Order Calibration
- Provable unlearning in topic modeling and downstream tasks
- Provable weak-to-strong generalization via benign overfitting
- Provably Accurate Shapley Value Estimation via Leverage Score Sampling
- Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning
- Provably Robust Explainable Graph Neural Networks against Graph Perturbation Attacks
- Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach
- Provence: efficient and robust context pruning for retrieval-augmented generation
- Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning
- Proximal Mapping Loss: Understanding Loss Functions in Crowd Counting & Localization
- Proxy Denoising for Source-Free Domain Adaptation
- PseDet: Revisiting the Power of Pseudo Label in Incremental Object Detection
- P-SPIKESSM: HARNESSING PROBABILISTIC SPIKING STATE SPACE MODELS FOR LONG-RANGE DEPENDENCY TASKS
- Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount
- Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
- Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
- PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
- PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition
- PWM: Policy Learning with Multi-Task World Models
- Pyramidal Flow Matching for Efficient Video Generative Modeling
- Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
- QERA: an Analytical Framework for Quantization Error Reconstruction
- Qihoo-T2X: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Any-Task
- Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
- QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
- qNBO: quasi-Newton Meets Bilevel Optimization
- QPM: Discrete Optimization for Globally Interpretable Image Classification
- QP-SNN: Quantized and Pruned Spiking Neural Networks
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
- QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation
- Quality Measures for Dynamic Graph Generative Models
- Quamba: A Post-Training Quantization Recipe for Selective State Space Models
- Quantifying Generalization Complexity for Large Language Models
- Quantify Uncertainty and Hallucination in Foundation Models: The Next Frontier in Reliable AI
- Quantitative Approximation for Neural Operators in Nonlinear Parabolic Equations
- Quantitative Certification of Bias in Large Language Models
- Quantized Spike-driven Transformer
- Quantum (Inspired) $D^2$-sampling with Applications
- Quantum-PEFT: Ultra parameter-efficient fine-tuning
- Query-based Knowledge Transfer for Heterogeneous Learning Environments
- Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
- R2Det: Exploring Relaxed Rotation Equivariance in 2D Object Detection
- Radar: Fast Long-Context Decoding for Any Transformer
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
- RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
- RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression
- RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
- RandLoRA: Full rank parameter-efficient fine-tuning of large models
- Random Is All You Need: Random Noise Injection on Feature Statistics for Generalizable Deep Image Denoising
- Random-Set Neural Networks
- Ranking-aware adapter for text-driven image ordering with CLIP
- RankSHAP: Shapley Value Based Feature Attributions for Learning to Rank
- Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning
- RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
- Rapid Selection and Ordering of In-Context Demonstrations via Prompt Embedding Clustering
- Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure?
- Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
- RaSA: Rank-Sharing Low-Rank Adaptation
- Rational Decision-Making Agent with Learning Internal Utility Judgment
- Rationalizing and Augmenting Dynamic Graph Neural Networks
- RA-TTA: Retrieval-Augmented Test-Time Adaptation for Vision-Language Models
- RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
- RB-Modulation: Training-Free Personalization using Stochastic Optimal Control
- RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
- Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
- Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation
- Real2Code: Reconstruct Articulated Objects via Code Generation
- Re-Aligning Language to Visual Objects with an Agentic Workflow
- Real-time design of architectural structures with differentiable mechanics and neural networks
- Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models
- Real-Time Video Generation with Pyramid Attention Broadcast
- Reasoning Elicitation in Language Models via Counterfactual Feedback
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval
- Reasoning of Large Language Models over Knowledge Graphs with Super-Relations
- Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?
- Reassessing How to Compare and Improve the Calibration of Machine Learning Models
- ReAttention: Training-Free Infinite Context with Finite Attention Scope
- REBIND: Enhancing Ground-state Molecular Conformation Prediction via Force-Based Graph Rewiring
- RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
- RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation
- RecFlow: An Industrial Full Flow Recommendation Dataset
- Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
- ReCogLab: a framework testing relational reasoning, cognitive hypotheses on LLMs
- Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
- Reconciling Model Multiplicity for Downstream Decision Making
- Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
- Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency
- Reconstructive Visual Instruction Tuning
- Recovering Manifold Structure Using Ollivier Ricci Curvature
- Recovery of Causal Graph Involving Latent Variables via Homologous Surrogates
- Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
- ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
- Redefining the task of Bioactivity Prediction
- Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
- REEF: Representation Encoding Fingerprints for Large Language Models
- Re-evaluating Open-ended Evaluation of Large Language Models
- Re-Evaluating the Impact of Unseen-Class Unlabeled Data on Semi-Supervised Learning Model
- Reexamining the Aleatoric and Epistemic Uncertainty Dichotomy
- RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
- Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
- REFINE: Inversion-Free Backdoor Defense via Model Reprogramming
- Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
- Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
- Reflective Gaussian Splatting
- Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
- Reframing Structure-Based Drug Design Model Evaluation via Metrics Correlated to Practical Needs
- ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
- ReGen: Generative Robot Simulation via Inverse Design
- REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
- RegMix: Data Mixture as Regression for Language Model Pre-training
- Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
- Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator
- Regretful Decisions under Label Noise
- Regret-Optimal List Replicable Bandit Learning: Matching Upper and Lower Bounds
- Regularization by Texts for Latent Diffusion Inverse Solvers
- Regularized Proportional Fairness Mechanism for Resource Allocation Without Money
- Regularizing Energy among Training Samples for Out-of-Distribution Generalization
- Regulatory DNA Sequence Design with Reinforcement Learning
- Re-Imagining Multimodal Instruction Tuning: A Representation View
- Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics
- Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
- Reinforcement learning with combinatorial actions for coupled restless bandits
- Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features
- Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems
- Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
- RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
- Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
- Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery
- RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
- ReMatching Dynamic Reconstruction Flow
- REMEDY: Recipe Merging Dynamics in Large Vision-Language Models
- ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
- Remove Symmetries to Control Model Expressivity
- ReNovo: Retrieval-Based \emph{De Novo} Mass Spectrometry Peptide Sequencing
- Repetition Improves Language Model Embeddings
- RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
- Representational Similarity via Interpretable Visual Concepts
- Representation Learning for Long Tail Recognition via Feature Space Re-Construction
- Representative Guidance: Diffusion Model Sampling with Consistency
- Repulsive Latent Score Distillation for Solving Inverse Problems
- Repurposing in AI: A Distinct Approach or an Extension of Creative Problem Solving?
- ReSi: A Comprehensive Benchmark for Representational Similarity Measures
- Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs
- Residual Deep Gaussian Processes on Manifolds
- Residual Kernel Policy Network: Enhancing Stability and Robustness in RKHS-Based Reinforcement Learning
- Residual-MPPI: Online Policy Customization for Continuous Control
- Residual Stream Analysis with Multi-Layer SAEs
- Restating the Proof of Linear Convergence for Linear GNNs
- Restructuring Vector Quantization with the Rotation Trick
- Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
- RESuM: A Rare Event Surrogate Model for Physics Detector Design
- Rethinking and improving autoformalization: towards a faithful metric and a Dependency Retrieval-based approach
- Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives
- Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance
- Rethinking Copyright Infringements In the Era Of Text-to-Image Generative Models
- Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior
- Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
- Rethinking Fair Representation Learning for Performance-Sensitive Tasks
- Rethinking Graph Neural Networks From A Geometric Perspective Of Node Features
- Rethinking Graph Prompts: Unraveling the Power of Data Manipulation in Graph Neural Networks
- Rethinking Invariance in In-context Learning
- Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off
- Re-Thinking Inverse Graphics With Large Language Models
- Rethinking Light Decoder-based Solvers for Vehicle Routing Problems
- Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
- Rethinking Multiple-Instance Learning From Feature Space to Probability Space
- Rethinking Neural Multi-Objective Combinatorial Optimization via Neat Weight Embedding
- Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
- Rethinking Reward Modeling in Preference-based Large Language Model Alignment
- Rethinking Self-Distillation: Label Averaging and Enhanced Soft Label Refinement with Partial Labels
- Rethinking Shapley Value for Negative Interactions in Non-convex Games
- Rethinking Sparse Scaling through the Lens of Average Active Parameter Count
- Rethinking Spiking Neural Networks from an Ensemble Learning Perspective
- Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
- Rethinking the role of frames for SE(3)-invariant crystal structure modeling
- Rethinking Visual Counterfactual Explanations Through Region Constraint
- Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
- Retri3D: 3D Neural Graphics Representation Retrieval
- Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization
- Retrieval Head Mechanistically Explains Long-Context Factuality
- RetroInText: A Multimodal Large Language Model Enhanced Framework for Retrosynthetic Planning via In-Context Representation Learning
- Revamping Diffusion Guidance for Conditional and Unconditional Generation
- Revealing and Mitigating Over-Attention in Knowledge Editing
- Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)
- Revealing the 3D Cosmic Web through Gravitationally Constrained Neural Fields
- Reveal Object in Lensless Photography via Region Gaze and Amplification
- RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
- Revisiting a Design Choice in Gradient Temporal Difference Learning
- Revisiting Convolution Architecture in the Realm of DNA Foundation Models
- Revisiting Delta-Parameter Pruning For Fine-Tuned Models
- Revisiting DNN Training for Intermittently-Powered Energy-Harvesting Micro-Computers
- Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models
- Revisiting Feature Prediction for Learning Visual Representations from Video
- Revisiting In-context Learning Inference Circuit in Large Language Models
- Revisiting Large-Scale Non-convex Distributionally Robust Optimization
- Revisiting Mode Connectivity in Neural Networks with Bezier Surface
- REVISITING MULTI-PERMUTATION EQUIVARIANCE THROUGH THE LENS OF IRREDUCIBLE REPRESENTATIONS
- Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later
- Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
- Revisiting Random Walks for Learning on Graphs
- Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
- Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
- Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
- Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation
- Revisit the open nature of open vocabulary segmentation
- Revolutionizing EMCCD Denoising through a Novel Physics-Based Learning Framework for Noise Modeling
- REvolve: Reward Evolution with Large Language Models using Human Feedback
- Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
- Reward Guided Latent Consistency Distillation
- Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
- Reward Learning from Multiple Feedback Types
- RFMamba: Frequency-Aware State Space Model for RF-Based Human-Centric Perception
- RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
- RGB-Event ISP: The Dataset and Benchmark
- Risk-Controlling Model Selection via Guided Bayesian Optimization
- Risk Informed Policy Learning for Safer Exploration
- Risk-Sensitive Diffusion: Robustly Optimizing Diffusion Models with Noisy Samples
- Risk-Sensitive Variational Actor-Critic: A Model-Based Approach
- RMB: Comprehensively benchmarking reward models in LLM alignment
- RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
- RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
- RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
- RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation
- Robotouille: An Asynchronous Planning Benchmark for LLM Agents
- Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset
- RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection
- Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
- Robust Conformal Prediction with a Single Binary Certificate
- Robust Deep Equivariant Structure from Motion
- Robust Feature Learning for Multi-Index Models in High Dimensions
- Robust Function-Calling for On-Device Language Model via Function Masking
- Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning
- RobustKV: Defending Large Language Models against Jailbreak Attacks via KV Eviction
- Robust Learning in Bayesian Parallel Branching Graph Neural Networks: The Narrow Width Limit
- Robust LLM safeguarding via refusal feature adversarial training
- Robustness Auditing for Linear Regression: To Singularity and Beyond
- Robustness Inspired Graph Backdoor Defense
- Robustness of Quantum Algorithms for Nonconvex Optimization
- Robustness Reprogramming for Representation Learning
- Robust-PIFu: Robust Pixel-aligned Implicit Function for 3D Human Digitalization from a Single Image
- Robust Representation Consistency Model via Contrastive Denoising
- Robust Root Cause Diagnosis using In-Distribution Interventions
- Robust Simulation-Based Inference under Missing Data
- Robust System Identification: Finite-sample Guarantees and Connection to Regularization
- Robust Transfer of Safety-Constrained Reinforcement Learning Agents
- Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
- Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
- RocketEval: Efficient automated LLM evaluation via grading checklist
- Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
- Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery
- Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference
- Round and Round We Go! What makes Rotary Positional Encodings useful?
- RouteLLM: Learning to Route LLMs from Preference Data
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
- Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models
- RRM: Robust Reward Model Training Mitigates Reward Hacking
- R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
- RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning
- RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs
- RuAG: Learned-rule-augmented Generation for Large Language Models
- S4M: S4 for multivariate time series forecasting with Missing values
- SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
- Safety Alignment Should be Made More Than Just a Few Tokens Deep
- Safety Layers in Aligned Large Language Models: The Key to LLM Security
- Safety-Prioritizing Curricula for Constrained Reinforcement Learning
- SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
- SAGEPhos: Sage Bio-Coupled and Augmented Fusion for Phosphorylation Site Detection
- Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking
- SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
- Salvage: Shapley-distribution Approximation Learning Via Attribution Guided Exploration for Explainable Image Classification
- SAM 2: Segment Anything in Images and Videos
- Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
- Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking
- SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
- SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models
- Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models
- SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
- SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
- SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
- Satisficing Exploration in Bandit Optimization
- SAVA: Scalable Learning-Agnostic Data Valuation
- SBSC: Step-by-Step Coding for Improving Mathematical Olympiad Performance
- Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier
- Scalable Bayesian Learning with posteriors
- Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video
- Scalable Decentralized Learning with Teleportation
- Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction
- Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
- Scalable Extraction of Training Data from Aligned, Production Language Models
- Scalable Influence and Fact Tracing for Large Language Model Pretraining
- Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection
- Scalable Mechanistic Neural Networks
- Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires
- Scale-Aware Contrastive Reverse Distillation for Unsupervised Anomaly Detection
- Scale-aware Recognition in Satellite Images under Resource Constraints
- Scale-Free Graph-Language Models
- Scaling and evaluating sparse autoencoders
- Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models
- Scaling FP8 training to trillion-token LLMs
- Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
- Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
- Scaling Large Language Model-based Multi-Agent Collaboration
- Scaling Laws for Adversarial Attacks on Language Model Activations and Tokens
- Scaling Laws for Downstream Task Performance in Machine Translation
- Scaling Laws for Precision
- Scaling Long Context Training Data by Long-Distance Referrals
- Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
- Scaling Optimal LR Across Token Horizons
- Scaling Test-Time Compute Optimally Can be More Effective than Scaling LLM Parameters
- Scaling Transformers for Low-Bitrate High-Quality Speech Coding
- Scaling up Masked Diffusion Models on Text
- Scaling up the Banded Matrix Factorization Mechanism for Large Scale Differentially Private ML
- Scaling Wearable Foundation Models
- Scattered Forest Search: Smarter Code Space Exploration with LLMs
- Scene Flow as a Partial Differential Equation
- Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure
- SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
- ScImage: How good are multimodal large language models at scientific text-to-image generation?
- SC-OmniGS: Self-Calibrating Omnidirectional Gaussian Splatting
- SCOPE: A Self-supervised framework for Improving Faithfulness in Conditional Text Generation
- SCOPE: SCALABLE OPTIMIZATION FOR EFFICIENT AND ADAPTIVE FOUNDATION MODELS
- Score-based free-form architectures for high-dimensional Fokker-Planck equations
- Score-based Self-supervised MRI Denoising
- Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models
- Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
- SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
- Searching for Optimal Solutions with LLMs via Bayesian Optimization
- SEBRA : Debiasing through Self-Guided Bias Ranking
- SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents
- Second Order Bounds for Contextual Bandits with Function Approximation
- Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
- Second-Order Min-Max Optimization with Lazy Hessians
- Second Workshop on Representational Alignment (Re$^2$-Align)
- SecureGS: Boosting the Security and Fidelity of 3D Gaussian Splatting Steganography
- SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
- Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
- See It from My Perspective: How Language Affects Cultural Bias in Image Understanding
- See What You Are Told: Visual Attention Sink in Large Multimodal Models
- SegLLM: Multi-round Reasoning Segmentation with Large Language Model
- Segment Any 3D Object with Language
- Select before Act: Spatially Decoupled Action Repetition for Continuous Control
- SelectFormer: Private and Practical Data Selection for Transformers
- Selective Aggregation for Low-Rank Adaptation in Federated Learning
- Selective Attention Improves Transformer
- Selective induction Heads: How Transformers Select Causal Structures in Context
- Selective Label Enhancement Learning for Test-Time Adaptation
- Selective Task Group Updates for Multi-Task Optimization
- Selective Unlearning via Representation Erasure Using Adversarial Training
- Self-Attention-Based Contextual Modulation Improves Neural System Identification
- Self-Boosting Large Language Models with Synthetic Preference Data
- Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models
- SELF-EVOLVED REWARD LEARNING FOR LLMS
- Self-Evolving Multi-Agent Networks for Software Development
- Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
- Self-Improvement in Language Models: The Sharpening Mechanism
- Self-Improving Foundation Models Without Human Supervision
- Self-Improving Robust Preference Optimization
- Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
- Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
- Self-Normalized Resets for Plasticity in Continual Learning
- Self-Play Preference Optimization for Language Model Alignment
- Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
- Self-supervised contrastive learning performs non-linear system identification
- Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement
- Self-Supervised Diffusion Processes for Electron-Aware Molecular Representation Learning
- Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining
- Self-Updatable Large Language Models with Parameter Integration
- SelKD: Selective Knowledge Distillation via Optimal Transport Perspective
- Semantic Aware Representation Learning for Lifelong Learning
- Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
- Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs
- Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
- Semantic Skill Extraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
- Semantix: An Energy-guided Sampler for Semantic Style Transfer
- SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
- Semialgebraic Neural Networks: From roots to representations
- Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
- Semi-Supervised CLIP Training by Enforcing Semantic and Trapezoidal Consistency
- Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
- Sensitivity-Aware Amortized Bayesian Inference
- Sensitivity-Constrained Fourier Neural Operators for Forward and Inverse Problems in Parametric Differential Equations
- Sensitivity Verification for Decision Tree Ensembles
- Sensor-Invariant Tactile Representation
- SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
- Separation Power of Equivariant Neural Networks
- SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction
- Sequential Controlled Langevin Diffusions
- Sequential Stochastic Combinatorial Optimization Using Hierarchical Reinforcement Learning
- Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning
- SeRA: Self-Reviewing and Alignment of LLMs using Implicit Reward Margins
- Severing Spurious Correlations with Data Pruning
- SFESS: Score Function Estimators for $k$-Subset Sampling
- SGD with memory: fundamental properties and stochastic acceleration
- SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
- Shallow diffusion networks provably learn hidden low-dimensional structure
- Shape as Line Segments: Accurate and Flexible Implicit Surface Representation
- Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation
- Shared-AE: Unsupervised Identification of Shared Subspaces in High-dimensional Neural and Behavioral Activity
- SharedContextBench: How Lossy are Long-context Methods in KV Cache Reuse?
- Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
- Sharpness-Aware Black-Box Optimization
- Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training
- Sharpness Aware Minimization: General Analysis and Improved Rates
- Shedding Light on Time Series Classification using Interpretability Gated Networks
- ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design
- Shh, don't say that! Domain Certification in LLMs
- Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning
- ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
- Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
- Should VLMs be Pre-trained with Image Data?
- Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
- Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
- SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
- SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
- Simple and Controllable Uniform Discrete Diffusion Language Models
- Simple, Good, Fast: Self-Supervised World Models Free of Baggage
- Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
- Simple ReFlow: Improved Techniques for Fast Flow Models
- Simple yet Effective Incomplete Multi-view Clustering: Similarity-level Imputation and Intra-view Hybrid-group Prototype Construction
- Simplifying Deep Temporal Difference Learning
- Simplifying, Stabilizing and Scaling Continuous-time Consistency Models
- SIMPL: Scalable and hassle-free optimisation of neural representations from behaviour
- SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
- Simulating Human-like Daily Activities with Desire-driven Autonomy
- Simulating Training Dynamics to Reconstruct Training Data from Deep Neural Networks
- SimulPL: Aligning Human Preferences in Simultaneous Machine Translation
- SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystalline Symmetry Classification Benchmark
- SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
- SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs
- Single-agent Poisoning Attacks Suffice to Ruin Multi-Agent Learning
- Single Teacher, Multiple Perspectives: Teacher Knowledge Augmentation for Enhanced Knowledge Distillation
- Singular Subspace Perturbation Bounds via Rectangular Random Matrix Diffusions
- SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
- Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
- Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries
- Sketch2Diagram: Generating Vector Diagrams from Hand-Drawn Sketches
- Sketching for Convex and Nonconvex Regularized Least Squares with Sharp Guarantees
- Skill Expansion and Composition in Parameter Space
- SleepSMC: Ubiquitous Sleep Staging via Supervised Multimodal Coordination
- SLMRec: Empowering Small Language Models for Sequential Recommendation
- SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
- S-LoRA: Scalable Low-Rank Adaptation for Class Incremental Learning
- Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
- SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
- Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
- Small-to-Large Generalization: Training Data Influences Models Consistently Across Scale
- SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction
- SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback
- SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
- SMITE: Segment Me In TimE
- Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises
- SOAP: Improving and Stabilizing Shampoo using Adam
- SoftCVI: Contrastive variational inference with self-generated soft labels
- Soft Merging of Experts with Adaptive Routing
- Solving Differential Equations with Constrained Learning
- Solving hidden monotone variational inequalities with surrogate losses
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures
- Solving Multiplayer Partially Observable Stochastic Games by Divergence-Regularized Discounted Aggregation
- Solving New Tasks by Adapting Internet Video Knowledge
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
- Solving Video Inverse Problems Using Image Diffusion Models
- SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
- SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization
- SOREL: A Stochastic Algorithm for Spectral Risks Minimization
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
- Sort-free Gaussian Splatting via Weighted Sum Rendering
- SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
- SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
- SPA-BENCH: A COMPREHENSIVE BENCHMARK FOR SMARTPHONE AGENT EVALUATION
- SpaceGNN: Multi-Space Graph Neural Network for Node Anomaly Detection with Extremely Limited Labels
- SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
- Sparse Autoencoders Do Not Find Canonical Units of Analysis
- Sparse autoencoders reveal selective remapping of visual concepts during adaptation
- Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
- Sparse components distinguish visual pathways & their alignment to neural networks
- Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
- Sparse Learning for State Space Models on Mobile
- SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
- Sparse Matrix in Large Language Model Fine-tuning
- SparsyFed: Sparse Adaptive Federated Learning
- SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model
- Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion
- SPDIM: Source-Free Unsupervised Conditional and Label Shift Adaptation in EEG
- Specialized Foundation Models struggle to beat Supervised Baselines
- Spectral Compressive Imaging via Unmixing-driven Subspace Diffusion Refinement
- Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows
- Spectro-Riemannian Graph Neural Networks
- Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
- Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
- Speech Robust Bench: A Robustness Benchmark For Speech Recognition
- Spherical Tree-Sliced Wasserstein Distance
- Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
- SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
- SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
- Spiking Vision Transformer with Saccadic Attention
- SpinQuant: LLM Quantization with Learned Rotations
- SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
- SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
- Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports
- SportU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
- Spreading Out-of-Distribution Detection on Graphs
- Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
- Spurious Forgetting in Continual Learning of Language Models
- SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget
- SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks
- SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
- SSOLE: Rethinking Orthogonal Low-rank Embedding for Self-Supervised Learning
- Stabilized Neural Prediction of Potential Outcomes in Continuous Time
- Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
- Stable Segment Anything Model
- STAFF: Speculative Coreset Selection for Task-Specific Fine-tuning
- STAMP: Scalable Task- And Model-agnostic Collaborative Perception
- Standard Gaussian Process Can Be Excellent for High-Dimensional Bayesian Optimization
- Standardizing Structural Causal Models
- STAR: Stability-Inducing Weight Perturbation for Continual Learning
- STAR: Synthesis of Tailored Architectures
- Start Smart: Leveraging Gradients For Enhancing Mask-based XAI Methods
- State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
- State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
- Statistical Advantages of Perturbing Cosine Router in Mixture of Experts
- Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
- STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
- Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
- Steering Large Language Models between Code Execution and Textual Reasoning
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
- Steering Protein Family Design through Profile Bayesian Flow
- Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion
- Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
- ST-GCond: Self-supervised and Transferable Graph Dataset Condensation
- Stick-breaking Attention
- Stiefel Flow Matching for Moment-Constrained Structure Elucidation
- ST-Modulator: Modulating Space-Time Attention for Multi-Grained Video Editing
- Stochastic Bandits Robust to Adversarial Attacks
- Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
- Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
- Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold
- StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces
- STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes
- Storybooth: Training-Free Multi-Subject Consistency for Improved Visual Storytelling
- Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence
- Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
- STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
- Strategic Classification With Externalities
- Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search
- Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression
- Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
- Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
- Streamlining Bayesian Deep Learning
- Streamlining Redundant Layers to Compress Large Language Models
- Streamlining the Design Space of ML4TSP Suggests Principles for Learning and Search
- Strength Estimation and Human-Like Strength Adjustment in Games
- StringLLM: Understanding the String Processing Capability of Large Language Models
- Strong Preferences Affect the Robustness of Value Alignment
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
- Structural-Entropy-Based Sample Selection for Efficient and Effective Learning
- Structure Language Models for Protein Conformation Generation
- Student-Informed Teacher Training
- Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning
- Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
- Subgraph Federated Learning for Local Generalization
- Subtask-Aware Visual Reward Learning from Segmented Demonstrations
- Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
- SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
- Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
- Supervised and Semi-Supervised Diffusion Maps with Label-Driven Diffusion
- Support is All You Need for Certified VAE Training
- SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars
- Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
- Surprising Effectiveness of pretraining Ternary Language Model at Scale
- SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
- SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
- SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
- SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
- SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
- SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
- Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
- Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models
- SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
- Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
- Sylber: Syllabic Embedding Representation of Speech from Raw Audio
- SyllableLM: Learning Coarse Semantic Units for Speech Language Models
- Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
- SymDiff: Equivariant Diffusion via Stochastic Symmetrisation
- SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
- SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
- Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
- Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning
- Synergy Learning with Small Models promotes LLM Zero-Shot Tabular Prediction
- SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints
- SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
- Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
- Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
- Synthesizing Realistic fMRI: A Physiological Dynamics-Driven Hierarchical Diffusion Model for Efficient fMRI Acquisition
- Synthetic continued pretraining
- Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
- SysBench: Can LLMs Follow System Message?
- SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
- System 1.x: Learning to Balance Fast and Slow Planning with Language Models
- Systematic Outliers in Large Language Models
- Systematic Relational Reasoning With Epistemic Graph Neural Networks
- Systems with Switching Causal Relations: A Meta-Causal Perspective
- T2V2: A Unified Non-Autoregressive Model for Speech Recognition and Synthesis via Multitask Learning
- T2V-Turbo-v2: Enhancing Video Model Post-Training through Data, Reward, and Conditional Guidance Design
- TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation
- TabM: Advancing tabular deep learning with parameter-efficient ensembling
- TabWak: A Watermark for Tabular Diffusion Models
- Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling
- TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
- Tailoring Mixup to Data for Calibration
- Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics
- Taming Overconfidence in LLMs: Reward Calibration in RLHF
- Taming Transformer Without Using Learning Rate Warmup
- Targeted Attack Improves Protection against Unauthorized Diffusion Customization
- Targeted Manipulation and Deception Emerge in LLMs Trained on User* Feedback
- TASAR: Transfer-based Attack on Skeletal Action Recognition
- Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
- Task Descriptors Help Transformers Learn Linear Models In-Context
- TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice
- TDDBench: A Benchmark for Training data detection
- TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning
- Teaching Human Behavior Improves Content Understanding Abilities Of LLMs
- Teaching LLMs How To Learn with Contextual Fine-Tuning
- TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction
- TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
- Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster
- Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
- Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency
- Temporal Reasoning Transfer from Text to Video
- TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
- TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark
- Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
- Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos
- Test-time Adaptation for Cross-modal Retrieval with Query Shift
- Test-time Adaptation for Image Compression with Distribution Regularization
- Test-time Adaptation for Regression by Subspace Alignment
- Test-time Alignment of Diffusion Models without Reward Over-optimization
- Test-Time Ensemble via Linear Mode Connectivity: A Path to Better Adaptation
- TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
- Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
- TexTailor: Customized Text-aligned Texturing via Effective Resampling
- TextSeg: Reimagining Image Segmentation as Text Generation
- Text-to-Image Rectified Flow as Plug-and-Play Priors
- TFG-Flow: Training-free Guidance in Multimodal Generative Flow
- TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics
- The 2nd Workshop on Foundation Models in the Wild
- The 3D-PC: a benchmark for visual perspective taking in humans and machines
- The adaptive complexity of log-concave sampling
- The AdEMAMix Optimizer: Better, Faster, Older
- The Breakdown of Gaussian Universality in Classification of High-dimensional Mixtures
- The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG
- The Complexity of Two-Team Polymatrix Games with Independent Adversaries
- The Computational Complexity of Circuit Discovery for Inner Interpretability
- The Computational Complexity of Positive Non-Clashing Teaching in Graphs
- The Crucial Role of Samplers in Online Direct Preference Optimization
- The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
- The Directionality of Optimization Trajectories in Neural Networks
- The Effectiveness of Curvature-Based Rewiring and the Role of Hyperparameters in GNNs Revisited
- The Foundations of Tokenization: Statistical and Computational Concerns
- The Future of Machine Learning Data Practices and Repositories
- The Hidden Cost of Waiting for Accurate Predictions
- The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
- The Illustrated AlphaFold
- The impact of allocation strategies in subset learning on the expressive power of neural networks
- The KoLMogorov Test: Compression by Code Generation
- The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
- The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
- The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
- The Loss Landscape of Deep Linear Neural Networks: a Second-order Analysis
- The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
- The Low-Rank Bottleneck in Attention
- The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
- The Optimization Landscape of SGD Across the Feature Learning Strength
- Theory, Analysis, and Best Practices for Sigmoid Self-Attention
- Theory on Mixture-of-Experts in Continual Learning
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
- The Pitfalls of Memorization: When Memorization Hurts Generalization
- The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
- The Ramanujan Library - Automated Discovery on the Hypergraph of Integer Relations
- The Representation Geometry of Features and Hierarchy in Large Language Models
- The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model
- ThermalGaussian: Thermal 3D Gaussian Splatting
- The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
- The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
- The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
- The Superposition of Diffusion Models
- The Unreasonable Ineffectiveness of the Deeper Layers
- The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning
- The Value of Sensory Information to a Robot
- The Vital Role of Gradient Clipping in Byzantine-Resilient Distributed Learning
- ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
- Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation
- Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation
- ThinK: Thinner Key Cache by Query-Driven Pruning
- Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
- Think Twice Before Claiming Your Optimization Algorithm Outperformance - Review and Beyond
- Think while You Generate: Discrete Diffusion with Planned Denoising
- Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive Speech Recognition
- Three Mechanisms of Feature Learning in a Linear Network
- ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels
- TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
- TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
- TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models
- Tight Clusters Make Specialized Experts
- Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
- Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity
- Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics
- Time After Time: Scalable Effect Estimation for Interventions on When and What to do
- TimeInf: Time Series Data Contribution via Influence Functions
- TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting
- TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
- Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting
- Time, Space and Streaming Efficient Algorithm for Heavy Attentions
- TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
- TIPS: Text-Image Pretraining with Spatial awareness
- TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights
- T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
- TLDR: Token-Level Detective Reward Model for Large Vision Language Models
- To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
- To Code or Not To Code? Exploring Impact of Code in Pre-training
- To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
- ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge
- TODO: Enhancing LLM Alignment with Ternary Preferences
- TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
- Token Pruning Meets Audio: Investigating Unique Behaviors in Vision Transformer-Based Audio Classification
- Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
- Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
- ToolACE: Enhancing Function Calling with Accuracy, Complexity, and Diversity
- Tool Decoding: A Plug-and-Play Approach to Enhancing Language Models for Tool Usage
- ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
- ToolGen: Unified Tool Retrieval and Calling via Generation
- Tool-Planner: Task Planning with Clusters across Multiple Tools
- TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
- Top-m Data Values Identification
- TopoDiffusionNet: A Topology-aware Diffusion Model
- TopoGaussian: Inferring Internal Topology Structures from Visual Clues
- Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation
- TopoLM: brain-like spatio-functional organization in a topographic language model
- Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity
- Topological Schrödinger Bridge Matching
- Topological Zigzag Spaghetti for Diffusion-based Generation and Prediction on Graphs
- TopoNets: High performing vision and language models with brain-like topography
- To Tackle Adversarial Transferability: A Novel Ensemble Training Method with Fourier Transformation
- ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts
- Toward Efficient Multi-Agent Exploration With Trajectory Entropy Maximization
- Toward Exploratory Inverse Constraint Inference with Generative Diffusion Verifiers
- Toward Generalizing Visual Brain Decoding to Unseen Subjects
- Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
- Toward Robust Defenses Against LLM Weight Tampering Attacks
- Towards a Complete Logical Framework for GNN Expressiveness
- Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders
- Towards Agentic AI for Science: Hypothesis Generation, Comprehension, Quantification, and Validation
- Towards a learning theory of representation alignment
- Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
- Towards a Unified and Verified Understanding of Group-Operation Networks
- Towards Automated Knowledge Integration From Human-Interpretable Representations
- Towards Auto-Regressive Next-Token Prediction: In-context Learning Emerges from Generalization
- Towards Bridging Generalization and Expressivity of Graph Neural Networks
- Towards Calibrated Deep Clustering Network
- Towards Certification of Uncertainty Calibration under Adversarial Attacks
- Towards Continuous Reuse of Graph Models via Holistic Memory Diversification
- Towards counterfactual fairness thorough auxiliary variables
- Towards Domain Adaptive Neural Contextual Bandits
- Towards Effective Evaluations and Comparison for LLM Unlearning Methods
- Towards Empowerment Gain through Causal Structure Learning in Model-Based RL
- Towards Explaining the Power of Constant-depth Graph Neural Networks for Linear Programming
- Towards Faster Decentralized Stochastic Optimization with Communication Compression
- Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians
- Towards Federated RLHF with Aggregated Client Preference for LLMs
- Towards Foundation Models for Mixed Integer Linear Programming
- Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations
- Towards Generalization Bounds of GCNs for Adversarially Robust Node Classification
- Towards General-Purpose Model-Free Reinforcement Learning
- Towards Hierarchical Rectified Flow
- Towards hyperparameter-free optimization with differential privacy
- Towards Improving Exploration through Sibling Augmented GFlowNets
- Towards Interpreting Visual Information Processing in Vision-Language Models
- Towards Learning High-Precision Least Squares Algorithms with Sequence Models
- Towards Marginal Fairness Sliced Wasserstein Barycenter
- Towards more rigorous evaluations of language models
- Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling
- Towards Neural Scaling Laws for Time Series Foundation Models
- Towards Optimal Multi-draft Speculative Decoding
- Towards Out-of-Modal Generalization without Instance-level Modal Correspondence
- Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
- Towards Realistic Data Generation for Real-World Super-Resolution
- Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
- Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
- Towards Robust and Cost-Efficient Knowledge Unlearning for Large Language Models
- Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization
- Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
- Towards Scalable Topological Regularizers
- Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression
- Towards Semantic Equivalence of Tokenization in Multimodal LLM
- Towards Synergistic Path-based Explanations for Knowledge Graph Completion: Exploration and Evaluation
- Towards Unbiased Calibration using Meta-Regularization
- Towards Unbiased Learning in Semi-Supervised Semantic Segmentation
- Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
- Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
- Towards Understanding the Universality of Transformers for Next-Token Prediction
- Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
- Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
- Towards Unified Human Motion-Language Understanding via Sparse Interpretable Characterization
- Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
- Toward Understanding In-context vs. In-weight Learning
- TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees
- TRACE: Temporal Grounding Video LLM via Causal Event Modeling
- TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
- Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
- Tracking objects that change in appearance with phase synchrony
- Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Attacks
- Track-On: Transformer-based Online Point Tracking with Memory
- TrackTheMind: program-guided adversarial data generation for theory of mind reasoning
- Tractable Multi-Agent Reinforcement Learning through Behavioral Economics
- Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context
- Training-Free Activation Sparsity in Large Language Models
- Training-free Camera Control for Video Generation
- Training-Free Dataset Pruning for Instance Segmentation
- Training-Free Diffusion Model Alignment with Sampling Demons
- Training Free Exponential Context Extension via Cascading KV Cache
- Training Free Guided Flow-Matching with Optimal Control
- Training-free LLM-generated Text Detection by Mining Token Probability Sequences
- Training-Free Message Passing for Learning on Hypergraphs
- Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
- Training Language Models to Self-Correct via Reinforcement Learning
- Training Large Language Models for Retrieval-Augmented Question Answering through Backtracking Correction
- Training LLMs over Neurally Compressed Text
- Training Mice to Compete with Elephants: A Guide for Customizing Small-Sized LLMs on Knowledge and Skills Data
- Training Neural Networks as Recognizers of Formal Languages
- Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
- Training One-Dimensional Graph Neural Networks is NP-Hard
- Training on the Test Task Confounds Evaluation and Emergence
- Training Robust Ensembles Requires Rethinking Lipschitz Continuity
- Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
- Trajectory attention for fine-grained video motion control
- Trajectory-Class-Aware Multi-Agent Reinforcement Learning
- Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving
- Transformer Block Coupling and its Correlation with Generalization in LLMs
- Transformer Encoder Satisfiability: Complexity and Impact on Formal Reasoning
- Transformer Learns Optimal Variable Selection in Group-Sparse Classification
- Transformer Meets Twicing: Harnessing Unattended Residual Information
- Transformers are Universal In-context Learners
- Transformers Handle Endogeneity in In-Context Linear Regression
- Transformers Learn Low Sensitivity Functions: Investigations and Implications
- Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning
- Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
- Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow
- Transformers Provably Solve Parity Efficiently with Chain of Thought
- Transformer-Squared: Self-adaptive LLMs
- Transformers Struggle to Learn to Search Without In-context Exploration
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
- Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
- Tree of Attributes Prompt Learning for Vision-Language Models
- Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy
- TRENDy: Temporal Regression of Effective Nonlinear Dynamics
- Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
- Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups
- True Counterfactual Generation from Language Models
- Truncated Consistency Models
- Truncation Is All You Need: Improved Sampling Of Diffusion Models For Physics-Based Simulations
- Trusted Multi-View Classification via Evolutionary Multi-View Fusion
- Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
- TSC-Net: Predict Pedestrian Trajectory by Trajectory-Scene-Cell Classification
- TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting
- T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
- TTVD: Towards a Geometric Framework for Test-Time Adaptation Based on Voronoi Diagram
- TULIP: Token-length Upgraded CLIP
- Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis
- Tuning Frequency Bias of State Space Models
- Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
- Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
- TVNet: A Novel Time Series Analysis Method Based on Dynamic Convolution and 3D-Variation
- TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
- Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
- Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
- TypedThinker: Typed Thinking Improves Large Language Model Reasoning
- u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
- UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models
- UIFace: Unleashing Inherent Model Capabilities to Enhance Intra-Class Diversity in Synthetic Face Recognition
- Ultra-Sparse Memory Network
- Unbounded: A Generative Infinite Game of Character Life Simulation
- Uncertainty-Aware Decoding with Minimum Bayes' Risk
- Uncertainty Herding: One Active Learning Method for All Label Budgets
- Uncertainty modeling for fine-tuned implicit functions
- Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
- Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
- Uncovering Latent Memories in Large Language Models
- Uncovering Overfitting in Large Language Model Editing
- Underdamped Diffusion Bridges with Applications to Sampling
- Understanding and Enhancing the Transferability of Jailbreaking Attacks
- Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
- Understanding Constraint Inference in Safety-Critical Inverse Reinforcement Learning
- Understanding Factual Recall in Transformers via Associative Memories
- Understanding Fairness Surrogate Functions in Algorithmic Fairness
- Understanding Long Videos with Multimodal Language Models
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry
- Understanding Methods for Scalable MCTS
- Understanding Model Calibration - A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)
- Understanding Optimization in Deep Learning with Central Flows
- Understanding Reasoning with Looped Transformers
- Understanding the Impacts of GenAI Requires Understanding the Impact of Anthropomorphic AI
- Understanding the Stability-based Generalization of Personalized Federated Learning
- Understanding Virtual Nodes: Oversquashing and Node Heterogeneity
- Understanding Visual Concepts Across Models
- Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
- Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models
- U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
- Unhackable Temporal Reward for Scalable Video MLLMs
- Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection
- UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization
- UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
- UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP
- UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation
- UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation
- UniDrive: Towards Universal Driving Perception Across Camera Configurations
- Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers
- Unified Parameter-Efficient Unlearning for LLMs
- Unifying Causal Representation Learning with the Invariance Principle
- Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
- UniGEM: A Unified Approach to Generation and Property Prediction for Molecules
- UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
- uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
- UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
- Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
- Union-over-Intersections: Object Detection beyond Winner-Takes-All
- UniRestore3D: A Scalable Framework For General Shape Restoration
- Uni-Sign: Toward Unified Sign Language Understanding at Scale
- Universal generalization guarantees for Wasserstein distributionally robust models
- Universal Image Restoration Pre-training via Degradation Classification
- Universal Multimodal Retrieval with Multimodal LLMs
- Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
- UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
- Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
- Unlearning-based Neural Interpretations
- Unleashing the Potential of ConvNets for Query-based Detection and Segmentation
- Unleashing the Potential of Diffusion Models for Incomplete Data Imputation
- Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
- Unlocking Global Optimality in Bilevel Optimization: A Pilot Study
- Unlocking Guidance for Discrete State-Space Diffusion and Flow Models
- Unlocking Point Processes through Point Set Diffusion
- Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
- Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
- Unlocking the Potential of Model Calibration in Federated Learning
- Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
- Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
- Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints
- Unsupervised Meta-Learning via In-Context Learning
- Unsupervised Model Tree Heritage Recovery
- Unsupervised Multiple Kernel Learning for Graphs via Ordinality Preservation
- Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation
- UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate
- Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
- URLOST: Unsupervised Representation Learning without Stationarity or Topology
- U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
- Utilitarian Algorithm Configuration for Infinite Parameter Spaces
- Utilizing Explainable Reinforcement Learning to Improve Reinforcement Learning: A Theoretical and Systematic Framework
- UV-Attack: Physical-World Adversarial Attacks for Person Detection via Dynamic-NeRF-based UV Mapping
- VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
- Valid Conformal Prediction for Dynamic GNNs
- Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
- Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
- Variance-Reducing Couplings for Random Features
- Variational Bayesian Pseudo-Coreset
- Variational Best-of-N Alignment
- Variational Diffusion Posterior Sampling with Midpoint Guidance
- Variational Search Distributions
- Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
- VCR: Visual Caption Restoration
- Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
- Vector-ICL: In-context Learning with Continuous Vector Representations
- VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
- VerifAI: AI Verification in the Wild
- Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
- Vertical Federated Learning with Missing Features During Training and Inference
- Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
- VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models
- ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
- VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation
- Video Action Differencing
- VideoGLUE: Video General Understanding Evaluation of Foundation Models
- VideoPhy: Evaluating Physical Commonsense for Video Generation
- VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
- Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
- VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
- ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
- ViSAGe: Video-to-Spatial Audio Generation
- Vision and Language Synergy for Rehearsal Free Continual Learning
- Vision Language Models are In-Context Value Learners
- Vision-LSTM: xLSTM as Generic Vision Backbone
- Vision models trained to estimate spatial latents learned similar ventral-stream-aligned representations
- Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
- VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
- VisualAgentBench: Towards Large Multimodal Models as Visual Agents
- Visual Agents as Fast and Slow Thinkers
- Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
- Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
- Visually Consistent Hierarchical Image Classification
- Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
- Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
- VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
- VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration
- VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
- VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
- VLMaterial: Procedural Material Generation with Large Vision-Language Models
- Voila: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
- VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
- VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
- VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
- VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems
- Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
- Ward: Provable RAG Dataset Inference via LLM Watermarks
- WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning
- Warm Diffusion: Recipe for Blur-Noise Mixture Diffusion Models
- Wasserstein Distances, Neuronal Entanglement, and Sparsity
- Wasserstein-Regularized Conformal Prediction under General Distribution Shift
- Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy
- Watermark Anything With Localized Messages
- Wavelet-based Positional Representation for Long Context
- Wavelet Diffusion Neural Operator
- WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
- Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
- Weakly Supervised Video Scene Graph Generation via Natural Language Supervision
- Weak to Strong Generalization for Large Language Models with Multi-capabilities
- Weak-to-Strong Generalization Through the Data-Centric Lens
- Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
- WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning
- Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
- WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
- Weighted Multi-Prompt Learning with Description-free Large Language Model Distillation
- Weighted Point Cloud Embedding for Multimodal Contrastive Learning Toward Optimal Similarity Metric
- Weighted-Reward Preference Optimization for Implicit Model Fusion
- What Are Good Positional Encodings for Directed Graphs?
- What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
- What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits
- What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context
- What is Wrong with Perplexity for Long-context Language Modeling?
- What Kind of Pretraining Data Do Large Language Models Rely on When Doing Reasoning?
- What Makes a Good Diffusion Planner for Decision Making?
- What Makes a Maze Look Like a Maze?
- What Makes Large Language Models Reason in (Multi-Turn) Code Generation?
- What Matters in Learning from Large-Scale Datasets for Robot Manipulation
- What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
- What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
- What should a neuron aim for? Designing local objective functions based on information theory
- What's New in My Data? Novelty Exploration via Contrastive Generation
- What's the Move? Hybrid Imitation Learning via Salient Points
- What to align in multimodal contrastive learning?
- When Attention Sink Emerges in Language Models: An Empirical View
- When does compositional structure yield compositional generalization? A kernel theory.
- When do GFlowNets learn the right distribution?
- When GNNs meet symmetry in ILPs: an orbit-based feature augmentation approach
- When Graph Neural Networks Meet Dynamic Mode Decomposition
- When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
- When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions
- When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction
- When Selection meets Intervention: Additional Complexities in Causal Discovery
- Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
- Which Tasks Should Be Compressed Together? A Causal Discovery Approach for Efficient Multi-Task Representation Compression
- White-Box Text Detectors Using Proprietary LLMs: A Probability Distribution Estimation Approach
- Why Does the Effective Context Length of LLMs Fall Short?
- Why In-Context Learning Models are Good Few-Shot Learners?
- Why RoPE Struggles to Maintain Long-Term Decay in Long Sequences?
- Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks
- Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse
- WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
- Will Synthetic Data Finally Solve the Data Access Problem?
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
- Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
- WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models
- Workshop on AI for Children: Healthcare, Psychology, Education
- Workshop on Embodied Intelligence with Large Language Models In Open City Environment
- Workshop on Reasoning and Planning for Large Language Models
- Workshop on Sparsity in LLMs (SLLM): Deep Dive into Mixture of Experts, Quantization, Hardware, and Inference
- Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions
- World Model on Million-Length Video And Language With Blockwise RingAttention
- World Models: Understanding, Modelling and Scaling
- W-PCA Based Gradient-Free Proxy for Efficient Search of Lightweight Language Models
- XAI4Science: From Understanding Model Behavior to Discovering New Scientific Knowledge
- XAIguiFormer: explainable artificial intelligence guided transformer for brain disorder identification
- X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
- X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios
- X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
- xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
- X-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
- X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
- YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
- Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks
- You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
- You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
- Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
- Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
- Your Weak LLM is Secretly a Strong Teacher for Alignment
- YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus
- ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
- Zero-cost Proxy for Adversarial Robustness Evaluation
- ZeroDiff: Solidified Visual-semantic Correlation in Zero-Shot Learning
- Zero-shot forecasting of chaotic systems
- Zero-shot Imputation with Foundation Inference Models for Dynamical Systems
- Zero-shot Model-based Reinforcement Learning using Large Language Models
- Zero-Shot Natural Language Explanations
- Zero-shot Novel View Synthesis via Adaptive Modulating Video Diffusion Process
- Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models
- Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
- Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
- ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
- Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
- ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
- ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs