DeepSeekMath: Pushing the limits of mathematical reasoning in open language models

By Zhihong Shao · Peiyi Wang · Qihao Zhu · Runxin Xu · Junxiao Song · Xiao Bi +4 more

DOI 10.48550/arxiv.2402.03300 · arXiv 2402.03300

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.

A Pattern Language for Machine Learning Tasks
Ian Fan, Tuomas Laakkonen, Neil John Ortega, Thomas Hoffmann, Vincent Wang-Mascianica, Benjamin Rodatz
2025
≈ 78%
An Automated Survey of Generative Artificial Intelligence: Large Language Models, Architectures, Protocols, and Applications
Álvaro López López, Eduardo C. Garrido-Merchán
2026
≈ 77%
StreetMath: Study of LLMs' Approximation Behaviors
Somshubhra Roy, Maisha Thasin, Danyang Zhang, Blessing Effiong, Chiung-Yi Tseng
2025
≈ 77%
Logic Augmented Generation
Aldo Gangemi and Andrea Giovanni Nuzzolese
2025
≈ 76%
Artificial Intelligence and the Structure of Mathematics
Michael R. Douglas, Michael H. Freedman, Maissam Barkeshli
2026
≈ 76%
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov and He He
2023
≈ 76%
Yanasse: Finding New Proofs from Deep Vision's Analogies, Part 1
Alexandre Linhares
2026
≈ 76%
A geometric relation of the error introduced by sampling a language model's output distribution to its internal state
Albert F. Modenbach
2026
≈ 75%
A Survey of Theory of Mind in Large Language Models: Evaluations, Representations, and Safety Risks
Hieu Minh "Jord" Nguyen
2025
≈ 75%
PersuasiveToM: A Benchmark for Evaluating Machine Theory of Mind in Persuasive Dialogues
Lai Jiang, Shenyi Huang, Zhen Wu, Xinyu Dai, Fangxu Yu
2025
≈ 75%
Reasoning with Language Model is Planning with World Model
Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe Wang, Zhiting Hu, Shibo Hao
2023
≈ 75%
Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning
Zheng Zhang
2025
≈ 75%
Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding
Sameer Narendran, Nikunj Harlalka, Alexander Cheung, Sizhe Gao, Siddharth Suresh, Junjie Hu, Timothy T. Rogers, Yun-Shiuan Chuang
2025
≈ 75%
How FaR Are Large Language Models From Agents with Theory-of-Mind?
Aman Madaan, Srividya Pranavi Potharaju, Aditya Gupta, Kevin R. McKee, Ari Holtzman, Jay Pujara, Xiang Ren, Swaroop Mishra, Aida Nematzadeh, Shyam Upadhyay, Manaal Faruqui, Pei Zhou
2023
≈ 75%
Do Language Models Follow Occam's Razor? An Evaluation of Parsimony in Inductive and Abductive Reasoning
Abulhair Saparov, Yunxin Sun
2026
≈ 75%
Technological Approach to Mind Everywhere: An Experimentally-Grounded Framework for Understanding Diverse Bodies and Minds
in corpus
2022
≈ 73%
Opening the Hood of a Word Processor
in corpus
1984
≈ 73%
AI as a Buddhist Self-Overcoming Technique in Another Medium
in corpus
2025
≈ 73%
Active inference on discrete state-spaces: a synthesis
in corpus
2020
≈ 72%
Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought
in corpus
2026
≈ 72%
The Guanyin Protocol: A Framework for Immediately Establishing an Understanding of Both Causality and Compassion in LLM Systems Using Semantic Anchoring
in corpus
2025
≈ 72%
Elephant 2000: A Programming Language Based on Speech Acts
in corpus
≈ 72%
2024 03 07 Stefan Lesser Kay 1984 Opening the Hood of a Word Processor.pdf 414587
in corpus
≈ 72%
Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
in corpus
2026
≈ 72%
Contemplative Agent
in corpus
2025
≈ 71%
Cybernetic Diagrams: Design Strategies for an Open Game
in corpus
2014
≈ 71%
Active Inference, Curiosity and Insight
in corpus
2017
≈ 71%
The computational boundary of a 'self': developmental bioelectricity drives multicellularity and scale-free cognition
in corpus
2019
≈ 71%
Addressing divergent representations from causal interventions on neural networks
in corpus
2025
≈ 71%
The Platonic Representation Hypothesis
in corpus
2024
≈ 71%

Cited by (2)

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Continual reinforcement learning applied directly to reasoning-optimized base models, rather than starting from instruction-tuned checkpoints, yields a 20-billion-parameter autonomous single agent, SFR-
ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both
ATLAS resolves a core trade-off in visual reasoning by introducing functional tokens: single discrete 'words' that simultaneously serve as agentic operations and latent visual reasoning units, elimin