finding

active

finding:llama-3-2-1b-impulsivity-introspection-0-21-p-10-4-significant-but-weaker-than-3b-0-52

LLaMA-3.2-1B impulsivity introspection: ρ=0.21, p<10⁻⁴ (significant but weaker than 3B ρ=0.52)

Impulsivity introspection is significant in 1B, stronger in 3B, and declines in 8B, so scaling with model size is non-monotonic

Source paper

extracted_from

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

(2026) · Nicolas Martorell · Bruno Bianchi

Neighborhood — ranked by edge-count

Claims (2)

claim

Introspective capacity is present from the first conversation turn, not requiring multi-turn context to emerge
contradicts · supports
Three of four concepts show significant introspection at turn 1; rules out joint temporal drift as sole explanation
Introspective capacity scales with model size for some concepts, approaching near-perfect coupling in LLaMA-3.1-8B
contradicts
Validated for wellbeing and interest; focus and impulsivity do not show consistent scaling

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.875
Scatter plot visualization showing strengthened probe-report relationship across alpha range
Impulsivity concept: Spearman ρ=0.51, isotonic R²=0.31 in LLaMA-3.2-3B (n=400, p<10⁻¹²) · finding · 0.864
Third-strongest pooled introspective coupling in primary model
LLaMA-3.1-8B-Instruct wellbeing introspection: ρ=0.93, isotonic R²=0.90 (LMM probe slope p<10⁻¹⁰) · finding · 0.862
Near-ceiling introspective performance for wellbeing concept in 8B model; nearly deterministic probe-report relationship
Impulsivity probe: peak Cohen's d=3.60 (layer 13), p=3.58×10⁻¹³ in LLaMA-3.2-3B · finding · 0.848
Strongest probe validation result; highest Cohen's d among the four concepts
Impulsivity introspective fidelity decreases from turn 1 to turn 10: ∆R²=-0.28 in LLaMA-3.2-3B · finding · 0.834
Opposite temporal trend to wellbeing/interest/focus; introspective fidelity weakens over conversation for impulsivity
Cross-concept steering: impulsivity→interest R² increases from 0.55 (α=-4) to 0.72 (α=+4), ∆R²=0.10, p=0.012 in LLaMA-3.2-3B · finding · 0.829
Second significant cross-concept introspection improvement; marginal after BH correction (q≈0.066)
Wellbeing introspective strength at turn 1: ρ=0.52, p=5.46×10⁻⁴ in LLaMA-3.2-3B · finding · 0.823
Demonstrates introspection is present from the first conversation turn without needing multi-turn context
Interest introspection improves from 1B to 3B: ρ from 0.19 to 0.80, R² from 0.05 to 0.66 · finding · 0.814
Largest single-step scaling improvement; demonstrates dramatic introspection gain between 1B and 3B models for interest
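
Illustrative sketch

The findings above report probe-report coupling as Spearman ρ and isotonic R². The following is a minimal, standard-library-only sketch of how those two statistics could be computed for paired probe scores and self-reports. It is not the paper's code: the function names and the toy data are assumptions made for illustration, and ties in the probe scores are handled more simply than a library implementation would.

```python
from statistics import mean

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def isotonic_fit(x, y):
    """Pool-adjacent-violators: non-decreasing least-squares fit of y
    ordered by x. Simplification: ties in x are kept in input order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    blocks = []  # each block is [sum of y, count]
    for i in order:
        blocks.append([y[i], 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted_sorted = []
    for s, c in blocks:
        fitted_sorted.extend([s / c] * c)
    fitted = [0.0] * len(x)
    for pos, i in enumerate(order):
        fitted[i] = fitted_sorted[pos]
    return fitted

def isotonic_r2(x, y):
    """R^2 of the isotonic fit: 1 - SS_res / SS_tot."""
    fitted = isotonic_fit(x, y)
    my = mean(y)
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Toy data (hypothetical, not from the paper): probe scores vs. self-reports.
probe = [1.0, 2.0, 3.0, 4.0]
report = [1.0, 3.0, 2.0, 4.0]
rho = spearman_rho(probe, report)
r2 = isotonic_r2(probe, report)
```

On this toy input the two statistics differ (ρ is lower than the isotonic R²), which shows that they measure related but distinct aspects of a monotone relationship. That is one reason an entry can pair a moderate ρ with a different R².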