Training verifiers to solve math word problems

By K. Cobbe · V. Kosaraju · M. Bavarian · M. Chen · H. Jun · L. Kaiser +4 more

arXiv 2110.14168

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.

Teaching for large-scale Reproducibility Verification
Lars Vilhuber and Hyuk Harry Son and Meredith Welch and David N. Wasser and Michael Darisse
2026
≈ 71%
"Calibeating": Beating Forecasters at Their Own Game
Dean P. Foster and Sergiu Hart
2026
≈ 69%
E-valuator: Reliable Agent Verifiers with Sequential Hypothesis Testing
Drew Prinster, Clara Fannjiang, Gabriele Scalia, Aviv Regev, Hanchen Wang, Shuvom Sadhuka
2025
≈ 68%
DataDignity: Training Data Attribution for Large Language Models
Andrzej Banburski-Fahey, Jaron Lanier, Xiaomin Li
2026
≈ 68%
A Pattern Language for Machine Learning Tasks
Ian Fan, Tuomas Laakkonen, Neil John Ortega, Thomas Hoffmann, Vincent Wang-Mascianica, Benjamin Rodatz
2025
≈ 67%
Designing Perceptual Puzzles by Differentiating Probabilistic Programs
Tzu-Mao Li, Joshua Tenenbaum, Jonathan Ragan-Kelley, Kartik Chandra
2022
≈ 67%
AI co-mathematician: Accelerating mathematicians with agentic AI
Ingrid von Glehn, Yori Zwols, Iuliya Beloshapka, Lars Buesing, Daniel M. Roy, Martin Wattenberg, Bogdan Georgiev, Tatiana Schmidt, Andrew Cowie, Fernanda Viegas, Dimitri Kanevsky, Vineet Kahlon, Hartmut Maennel, Sophia Alj, George Holland, Alex Davies, Pushmeet Kohli, Daniel Zheng
2026
≈ 67%
Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models
Anthony GX-Chen, Ilia Sucholutsky, Eunsol Choi, Ayush Rajesh Jhaveri
2026
≈ 67%
Optimal Verification of (Mis)Information in Networks
Luca Paolo Merlino and Nicole Tabasso
2026
≈ 66%
Measuring What Matters: Benchmarking Generative, Multimodal, and Agentic AI in Healthcare
Harshit Rajgarhia, Shivali Dalmia, Ananya Mantravadi, Prasanna Desikan
2026
≈ 66%
When Should Users Check? Modeling Confirmation Frequency in Multi-Step Agentic AI Tasks
Aryan Roy, Sneh Gupta, Daniel Weitekamp, Christopher J. MacLellan, Jieyu Zhou
2026
≈ 66%
When Respondents Don't Care Anymore: Identifying the Onset of Careless Responding
Max Welz and Andreas Alfons
2026
≈ 66%
Discriminative Dictionary Learning based on Statistical Methods
Atul Negi, G. Madhuri
2026
≈ 66%
Learning to Model the World with Language
Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan, Jessy Lin
2024
≈ 66%
Hint Tuning: Less Data Makes Better Reasoners
Minghao Li, Xiaoqian Ma, Xiusheng Huang, Zhuo Chen, Bowen Qin, Liujie Zhang, Shuo Shang, Weihang Chen, Siqi Fan
2026
≈ 66%
Testing the Limits of Truth Directions in LLMs
in corpus
2026
≈ 63%
Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
in corpus
≈ 62%
Learning without neurons in physical systems
in corpus
2022
≈ 62%
Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training
in corpus
2026
≈ 62%
Interpreting Language Model Parameters
in corpus
2026
≈ 62%
2024 03 07 Stefan Lesser Kay 1984 Opening the Hood of a Word Processor.pdf 414587
in corpus
≈ 61%
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
in corpus
2023
≈ 61%
Paper Summary: Interpreting Language Model Parameters
in corpus
≈ 61%
Verbalized Eval Awareness Inflates Measured Safety
in corpus
2026
≈ 61%
Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
in corpus
2026
≈ 61%
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
in corpus
2026
≈ 60%
A Mathematical Framework for Transformer Circuits
in corpus
2021
≈ 60%
Towards Safe and Honest AI Agents with Neural Self-Other Overlap
in corpus
2024
≈ 60%
Opening the Hood of a Word Processor
in corpus
1984
≈ 60%

Similar preprints — Semantic Scholar

Cited by (2)

The Platonic Representation Hypothesis
Neural networks trained on different data modalities, architectures, and objectives are converging toward a shared statistical model of reality — what the paper terms the "platonic representation" — f
Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training
Probe-based data attribution, introduced here as a method for surfacing and mitigating undesirable post-training behaviors, reduces harmful compliance in OLMo 2 7B by 63% through datapoint filtering a