NeurIPS 2025 — Best Paper Awards

A snapshot of the four Best Papers and three Runners-Up awarded at NeurIPS 2025, spanning LLMs, RL, diffusion models, theory, and benchmarks.

Announced · November 26, 2025
7 papers • 4 Best · 3 Runner-up
Tracks • Main + Datasets & Benchmarks
Domains • Generative models, RL, LLMs, Theory
Research Axes • Method + theory (architectures, dynamics, limits)

Best Papers

Four Best Papers (incl. one in Datasets & Benchmarks)

Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)

Best Paper · D&B

Infinity-Chat benchmark for open-ended prompts + large-scale analysis of diversity and "artificial hivemind" effects in LLM generations.

LLM Behaviour • Datasets & Benchmarks • Diversity & Society

Introduces Infinity-Chat (26k open-ended queries + dense human annotations) and shows strong intra-model repetition and inter-model homogeneity, raising concerns about long-term creativity and value pluralism.

Liwei Jiang et al.
View paper ↗

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Best Paper

A simple head-specific sigmoid gate applied after scaled dot-product attention (SDPA) that stabilizes training, reduces attention sink, and improves long-context performance in large-scale LLMs.

LLM Architectures • Scaling & Stability • Gating Mechanisms

Compares dozens of gated-attention variants on 15B MoE and 1.7B dense models and finds a consistently strong win for a single, easy-to-implement gating design now used in Qwen3-Next models.

Zihan Qiu et al.
View paper ↗
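The winning design, as summarized above, is a head-specific sigmoid gate applied to the SDPA output. A minimal NumPy sketch under that reading (the gate projection `w_gate` and the choice of the layer input `x` as the gate's input are assumptions; the paper compares many variants):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sdpa(q, k, v):
    # Scaled dot-product attention over the last two axes.
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def gated_sdpa(x, q, k, v, w_gate):
    # x:       (batch, seq, d_model)           layer input, used to compute the gate
    # q, k, v: (batch, heads, seq, head_dim)   projected queries / keys / values
    # w_gate:  (d_model, heads * head_dim)     hypothetical gate projection weights
    out = sdpa(q, k, v)                              # (batch, heads, seq, head_dim)
    b, h, s, d = out.shape
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))       # elementwise sigmoid in (0, 1)
    gate = gate.reshape(b, s, h, d).transpose(0, 2, 1, 3)
    return out * gate                                # head-specific multiplicative gate
```

Because the gate is bounded in (0, 1), each head's output can only be attenuated, which is one intuition for why such gating can suppress attention-sink behaviour.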

1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities

Best Paper

Demonstrates that very deep (up to 1024-layer) self-supervised RL agents can achieve strong goal-reaching performance without explicit rewards.

Reinforcement Learning • Depth Scaling • Locomotion & Control

Challenges the assumption that RL is incompatible with very deep nets. Using contrastive, goal-conditioned self-supervision, depth scaling yields higher success rates and richer emergent behaviours in simulated tasks.

Kevin Wang et al.
View paper ↗
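Contrastive, goal-conditioned self-supervision of the kind referenced above is commonly implemented as an InfoNCE-style objective: each state-action embedding should score its own reached goal above the other goals in the batch. A sketch under that assumption (the embeddings here stand in for the outputs of the very deep encoders being scaled):

```python
import numpy as np

def contrastive_goal_loss(phi_sa, psi_g):
    """InfoNCE loss: phi_sa[i] (state-action embedding) should match psi_g[i]
    (embedding of the goal actually reached) against the rest of the batch."""
    logits = phi_sa @ psi_g.T                      # (batch, batch) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()              # positives sit on the diagonal
```

No reward signal appears anywhere: the supervision comes entirely from which goals were actually reached, which is what makes the objective self-supervised.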

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training

Best Paper

Explains how training dynamics create a window where diffusion models generalize well before memorization sets in, even in over-parameterized regimes.

Diffusion Models • Implicit Regularization • Generalization vs. Memorization

Identifies two characteristic timescales for diffusion training and shows that the generalization window grows with dataset size, tying practical success to provable dynamical regularization.

Tony Bonnaire et al.
View paper ↗

Runner-Up Papers

Three additional papers recognized for outstanding contributions

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Runner-up

Systematic probe of RL with verifiable rewards (RLVR) shows improved sampling efficiency but no fundamentally new reasoning patterns beyond the base model.

LLM Reasoning • RLVR • Evaluation & Limits

Across models, algorithms, and benchmarks, RLVR narrows exploration and amplifies already-rewarded trajectories without expanding the underlying reasoning frontier; distillation, in contrast, is found to add genuinely new reasoning patterns.

Yang Yue et al.
View paper ↗

Optimal Mistake Bounds for Transductive Online Learning

Runner-up

Resolves a 30-year-old open problem on the value of unlabeled data in online learning with tight Ω(√d) / O(√d) bounds.

Learning Theory • Mistake Bounds • Transductive Setting

Shows a quadratic gap between transductive and standard online learning and clarifies when access to the unlabeled instance sequence in advance yields exponential improvements over prior lower bounds.

Zachary Chase et al.
View paper ↗
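Spelled out, the quadratic gap claimed above can be written as follows (a sketch assuming the realizable setting, with d the Littlestone dimension of the hypothesis class):

```latex
% Optimal mistake bounds; d = Littlestone dimension of \mathcal{H}
\mathrm{M}_{\text{online}}(\mathcal{H}) = d
\qquad \text{vs.} \qquad
\mathrm{M}_{\text{transductive}}(\mathcal{H}) = \Theta\!\left(\sqrt{d}\right)
```

The transductive learner sees the entire unlabeled instance sequence up front, and the result pins down exactly how much that advance knowledge is worth.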

Superposition Yields Robust Neural Scaling

Runner-up

Argues that representation superposition is a key mechanism behind neural scaling laws in large models.

Neural Scaling Laws • Feature Superposition • Chinchilla Regime

Uses a controlled toy model plus empirical analysis of open LLMs to show that strong superposition naturally produces inverse-dimension scaling of loss, explaining when scaling laws hold or break.

Yizhou Liu et al.
View paper ↗