NeurIPS 2025 — Best Paper Awards
Snapshot of the four Best Papers and three Runners-Up awarded at NeurIPS 2025, spanning LLMs, RL, diffusion models, theory and benchmarks.
Best Papers
Four Best Papers (incl. one in Datasets & Benchmarks)
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
Infinity-Chat benchmark for open-ended prompts + large-scale analysis of diversity and "artificial hivemind" effects in LLM generations.
Introduces Infinity-Chat (26k open-ended queries + dense human annotations) and shows strong intra-model repetition and inter-model homogeneity, raising concerns about long-term creativity and value pluralism.
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Simple head-specific sigmoid gating applied after scaled dot-product attention (SDPA) that stabilizes training, reduces attention sink, and improves long-context performance in large-scale LLMs.
Compares dozens of gated-attention variants on 15B MoE and 1.7B dense models and finds a consistently strong win for a single, easy-to-implement gating design now used in Qwen3-Next models.
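The core mechanism is easy to sketch. Below is a minimal, single-head numpy illustration of the idea: an elementwise sigmoid gate, computed from the layer input via a per-head weight matrix, multiplies the SDPA output. The function names, shapes, and the choice to condition the gate on the hidden state `x` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sdpa(q, k, v):
    # Standard scaled dot-product attention for a single head.
    # q, k, v: (seq_len, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def gated_attention(q, k, v, w_gate, x):
    # Hypothetical head-specific gating after SDPA:
    # gate = sigmoid(x @ w_gate) is in (0, 1) elementwise, so it can
    # suppress the attention output per position and per channel.
    # x: (seq_len, d_model), w_gate: (d_model, d_head)
    out = sdpa(q, k, v)
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))
    return gate * out
```

Because the gate lies strictly in (0, 1), positions that would otherwise act as attention sinks can have their outputs damped to near zero, which is one intuition for why gating mitigates the sink phenomenon.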
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
Demonstrates that very deep (up to 1024-layer) self-supervised RL agents can achieve strong goal-reaching performance without explicit rewards.
Challenges the assumption that RL is incompatible with very deep nets. Using contrastive, goal-conditioned self-supervision, depth scaling yields higher success rates and richer emergent behaviours in simulated tasks.
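The contrastive, goal-conditioned objective behind this line of work can be sketched compactly. The InfoNCE-style loss below scores each state-action embedding against every goal embedding in a batch and rewards matching its own goal; the encoder that produces `phi` and `psi` (here just raw arrays) is a hypothetical stand-in, and this is a sketch of the general objective rather than the paper's exact recipe.

```python
import numpy as np

def infonce_loss(phi, psi):
    # Contrastive goal-conditioned objective (InfoNCE-style):
    # phi: (B, d) state-action embeddings, psi: (B, d) goal embeddings.
    # Each phi[i] should score highest against its own goal psi[i],
    # treating the other B-1 goals in the batch as negatives.
    logits = phi @ psi.T                              # (B, B) similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # cross-entropy on diagonal
</```

No explicit reward appears anywhere: the learning signal comes entirely from whether trajectories reach their own goals, which is what lets the depth-scaling study run without reward engineering.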
Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training
Explains how training dynamics create a window where diffusion models generalize well before memorization sets in, even in over-parameterized regimes.
Identifies two characteristic timescales for diffusion training and shows that the generalization window grows with dataset size, tying practical success to provable dynamical regularization.
Runner-Up Papers
Three additional papers recognized for outstanding contributions
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Systematic probe of RL with verifiable rewards (RLVR) shows improved sampling efficiency but no fundamentally new reasoning patterns beyond the base model.
Across models, algorithms and benchmarks, RLVR narrows exploration and amplifies rewarded trajectories without expanding the underlying reasoning frontier; distillation is found to add truly new reasoning patterns.
Optimal Mistake Bounds for Transductive Online Learning
Resolves a 30-year-old open problem on the value of unlabeled data in online learning with tight Ω(√d) / O(√d) bounds.
Shows a quadratic gap between transductive and standard online learning and clarifies when advanced access to unlabeled instances yields exponential improvements over prior lower bounds.
Superposition Yields Robust Neural Scaling
Argues that representation superposition is a key mechanism behind neural scaling laws in large models.
Uses a controlled toy model plus empirical analysis of open LLMs to show that strong superposition naturally produces inverse-dimension scaling of loss, explaining when scaling laws hold or break.
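The inverse-dimension intuition can be demonstrated with a toy computation in the spirit of such controlled models (this specific setup is an illustrative assumption, not the paper's experiment): embed many features as random unit directions in a lower-dimensional space and measure the interference between them, which shrinks roughly like 1/d as the embedding dimension grows.

```python
import numpy as np

def superposition_interference(n_features, d_model, seed=0):
    # Toy superposition model: n_features random unit directions packed
    # into a d_model-dimensional space (n_features > d_model). The
    # off-diagonal overlaps of the Gram matrix measure crosstalk between
    # features; for random directions their mean square is roughly 1/d_model,
    # giving an inverse-dimension scaling of the interference "loss".
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_features, d_model))
    W /= np.linalg.norm(W, axis=1, keepdims=True)   # unit feature directions
    gram = W @ W.T
    off_diag = gram - np.eye(n_features)
    return np.mean(off_diag ** 2)                    # mean squared interference
```

Doubling the embedding dimension roughly halves the interference, mirroring the smooth power-law behaviour that the paper attributes to strong superposition.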