framework
active
framework:deepseek-r1 · DeepSeek-R1
Open-source reasoning LLM from DeepSeekAI trained with reinforcement learning to exhibit self-reflection
Neighborhood — ranked by edge-count
Thinkers (1)
- DeepSeekAI · studies · Organization that introduced DeepSeek-R1 and reported the aha moment of self-reflection
Frameworks (2)
- ReflCtrl · studies · The proposed framework for probing and steering self-reflection behavior in reasoning LLMs via representation engineering
- Cost-efficient training algorithm used by DeepSeek-R1 for RL-based reasoning
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- One of two large reasoning models analyzed in the paper for performative vs genuine CoT behavior
- External large language model used as adversarial discriminator to evaluate liar scores in Experiment 2
- External finding cited as early demonstration of emergent self-regulatory potential resembling mindful self-monitoring
- DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning (DeepSeekAI, 2025) · concept · 0.782 · Paper introducing DeepSeek-R1 model and reporting self-reflection as aha moment
- One DS-v3.2 trace shows extreme self-escalation, suggestive of treating own bid as competitor.
- DS-v3.2 has a high proportion of self-bidding rounds.
- Only model showing marginal benefit from increased reflection, at substantial token cost
- LLM judge (deepseek-v3) agrees with human evaluator on 91.6% of 200 sampled jailbreak responses · finding · 0.707 · Validates the LLM-based harm evaluation rubric