finding

active

finding:focus-wellbeing-increases-from-0-42-4-to-0-85-4-r2-from-0-34-to-0-75-in-llama-3-2-3b

Focus→wellbeing: ρ increases from 0.42 (α=-4) to 0.85 (α=+4); R² from 0.34 to 0.75 in LLaMA-3.2-3B

Scatter plot showing how sharply the probe-report relationship tightens as steering moves from α=-4 to α=+4
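For readers unfamiliar with the two statistics quoted in this finding, the following is a minimal sketch of how a Spearman ρ and an isotonic R² between probe scores and self-reports could be computed. It is not the source paper's code: the function names (`spearman_rho`, `isotonic_r2`) and the toy inputs are assumptions made for illustration, and the isotonic fit uses a plain pool-adjacent-violators pass that does not pool tied x values.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def isotonic_fit(x, y):
    """Non-decreasing least-squares fit of y on x (pool-adjacent-violators)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    blocks = []  # each block holds [sum of y, count]
    for i in order:
        blocks.append([y[i], 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted_sorted = [s / n for s, n in blocks for _ in range(n)]
    fitted = [0.0] * len(x)
    for pos, i in enumerate(order):
        fitted[i] = fitted_sorted[pos]
    return fitted


def isotonic_r2(x, y):
    """R^2 of the isotonic fit: 1 - SS_res / SS_tot."""
    fitted = isotonic_fit(x, y)
    my = mean(y)
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Toy data (hypothetical, not from the paper): probe scores and self-reports.
probe = [1.0, 2.0, 3.0]
report = [1.0, 3.0, 2.0]
print(spearman_rho(probe, report), isotonic_r2(probe, report))
```

In this sketch, ρ captures how well the ordering of self-reports follows the ordering of probe scores, while the isotonic R² measures how much report variance a monotone function of the probe score can explain, so the two can move somewhat independently.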

Source paper

extracted_from

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

(2026) · Nicolas Martorell · Bruno Bianchi

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Cross-concept steering: focus→wellbeing R² increases from 0.30 (α=-4) to 0.76 (α=+4), ∆R²=0.30, p<0.001 in LLaMA-3.2-3B · finding · 0.870
Strongest cross-concept introspection improvement; survives BH correction (q≈0.011)
Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.832
Scatter plot visualization showing strengthened probe-report relationship across alpha range
Focus concept: Spearman ρ=0.40, isotonic R²=0.12 in LLaMA-3.2-3B (n=400, p<10⁻⁵) · finding · 0.828
Weakest but still significant pooled introspective coupling in primary model
LLaMA-3.1-8B-Instruct wellbeing introspection: ρ=0.93, isotonic R²=0.90 (LMM probe slope p<10⁻¹⁰) · finding · 0.822
Near-ceiling introspective performance for wellbeing concept in 8B model; nearly deterministic probe-report relationship
Wellbeing same-concept steering: LMM alpha slope=0.19, focus=0.40, interest=0.25, impulsivity=0.067 in LLaMA-3.2-3B · finding · 0.800
Quantifies per-concept effect size of same-concept steering on self-report
Wellbeing concept: Spearman ρ=0.68, isotonic R²=0.48 in LLaMA-3.2-3B (n=400, p<10⁻²⁶) · finding · 0.796
Second-strongest pooled introspective coupling in primary model
Focus→wellbeing steering: both probe entropy (1.09→1.67 bits) and report entropy (0.88→1.69 bits) increase monotonically with α · finding · 0.795
Evidence that improved introspection in focus→wellbeing arises from enriched internal state and report channels simultaneously
LLaMA-3.2-1B impulsivity introspection: ρ=0.21, p<10⁻⁴ (significant but weaker than 3B ρ=0.52) · finding · 0.795
Impulsivity shows significant introspection in 1B but declines in 8B; non-monotonic scaling