finding

active

finding:deepseek-r1-zero-spontaneously-increased-thinking-time-for-difficult-prompts-showing-rudimentary-meta-awareness

DeepSeek-R1-Zero spontaneously increased thinking time for difficult prompts, showing rudimentary meta-awareness

External finding cited as an early demonstration of emergent self-regulatory potential resembling mindful self-monitoring

Source paper

extracted_from

Contemplative Agent

(2025) · Ruben Laukkonen · Fionn Inglis · Shamil Chandaria · Lars Sandved-Smith +4

Neighborhood — ranked by edge-count

Claims (1)

claim

A mindfulness module could check for divergences such as newly spawned subgoals that do not match ethical constraints, triggering corrective measures
supports
Specific implementation claim connecting mindfulness to the inner alignment meta-problem

Frameworks (1)

framework

Contemplative Reinforcement Learning
supports
Paper's proposed RL approach rewarding contemplative qualities in chain-of-thought reasoning

Events (1)

event

DeepSeek-R1-Zero Release (Guo et al., 2025)
associated_with
Release of a model demonstrating spontaneous meta-awareness on complex tasks; cited as an empirical precursor to Contemplative Reinforcement Learning (CRL)

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning (DeepSeekAI, 2025) · concept · 0.821
Paper introducing DeepSeek-R1 model and reporting self-reflection as aha moment

DeepSeek-R1 · framework · 0.810
Open-source reasoning LLM from DeepSeekAI trained with reinforcement learning to exhibit self-reflection

DeepSeek-R1 Llama 8b gains 0.16% accuracy on GSM8k with positive intervention (more reflections) at cost of ~2000 additional tokens · finding · 0.761
Only model showing marginal benefit from increased reflection, at substantial token cost

DeepSeek-R1 671B · concept · 0.732
One of two large reasoning models analyzed in the paper for performative vs genuine CoT behavior

Random word prefix prompts show emergence patterns similar to no-prompt, suggesting prompt length alone does not shift truth geometry. · claim · 0.723
Control experiment ruling out token-count as the cause of truth geometry shifts.

Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) · finding · 0.722
Full three-part structure required; anti-helpfulness framing alone insufficient

The sensitivity to think/don't think instructions may be achieved via a circuit that tags tokens as attention-worthy based on instructions or incentives · hypothesis · 0.719
Mechanism for how the model modulates representation strength.

DeepSeek-R1 Llama 8b accuracy on MMLU Professional Accounting drops from 56.5% at baseline to 50.1% at intervention -0.96 · finding · 0.717
Shows smaller models are more sensitive to reflection reduction on non-math tasks
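
The similarity neighborhood above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This page does not say how embeddings are computed or stored, so the function names, the dictionary layout, and the edge representation below are all assumptions made for illustration, not the system's actual implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_candidates(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that have
    no typed edge to the target, sorted by descending score.

    embeddings:  hypothetical dict mapping entity id -> embedding vector
    typed_edges: hypothetical iterable of (entity_id, entity_id) pairs
    """
    target = embeddings[target_id]
    # Entities already connected to the target by a typed relation are skipped.
    linked = {b if a == target_id else a
              for a, b in typed_edges if target_id in (a, b)}
    scored = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            scored.append((entity_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Entities returned by such a filter are exactly the "candidates for new edges or unrecognized duplicates" described above: close in embedding space but not yet linked by a typed relation.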