finding

active

finding:impulsivity-interest-steering-probe-entropy-increases-lmm-slope-0-024-p-2-30-10-4-but-report-entropy-does-not-p-0-11

Impulsivity→interest steering: probe entropy increases (LMM slope=0.024, p=2.30×10⁻⁴) but report entropy does not (p=0.11)

Evidence of a bottleneck between richer internal variation and final report distribution in impulsivity→interest condition
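As a rough illustration of the comparison behind this finding, the sketch below computes Shannon entropy in bits for a binned probe-score distribution and for a self-report distribution. This page does not say how the source paper estimates entropy, so the histogram estimator, the bin count, and all data values here are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def shannon_entropy_bits(values):
    """Shannon entropy (bits) of the empirical distribution of discrete values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one value")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def bin_scores(scores, lo, hi, n_bins):
    """Map continuous scores in [lo, hi] to integer bin indices 0..n_bins-1."""
    width = (hi - lo) / n_bins
    return [max(0, min(int((s - lo) / width), n_bins - 1)) for s in scores]


# Hypothetical data: probe scores spread widely, reports cluster on two values.
probe_scores = [0.05, 0.30, 0.55, 0.80, 0.95, 0.20, 0.65, 0.45]
reports = [5, 5, 6, 5, 5, 5, 6, 5]

probe_h = shannon_entropy_bits(bin_scores(probe_scores, 0.0, 1.0, 4))
report_h = shannon_entropy_bits(reports)

print(f"probe entropy:  {probe_h:.3f} bits")
print(f"report entropy: {report_h:.3f} bits")
```

In this toy case the probe distribution carries more entropy than the report distribution, which is the kind of gap the bottleneck interpretation refers to. The finding itself concerns how each entropy changes with α, which would require repeating the computation per steering strength and fitting a slope.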

Source paper

extracted_from

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

(2026) · Nicolas Martorell · Bruno Bianchi

Neighborhood — ranked by edge-count

Claims (1)

claim

Introspective ability can be decomposed into: (i) information available about internal state and (ii) capacity to transform that signal into precise output reports
supports
Conceptual distinction motivated by entropy analyses showing probe and report entropy can diverge under steering

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Focus→wellbeing steering: both probe entropy (1.09→1.67 bits) and report entropy (0.88→1.69 bits) increase monotonically with α · finding · 0.852
Evidence that improved introspection in focus→wellbeing arises from enriched internal state and report channels simultaneously
Cross-concept steering: impulsivity→interest R² increases from 0.55 (α=-4) to 0.72 (α=+4), ∆R²=0.10, p=0.012 in LLaMA-3.2-3B · finding · 0.843
Second significant cross-concept introspection improvement; marginal after BH correction (q≈0.066)
Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.829
Scatter plot visualization showing strengthened probe-report relationship across alpha range
Same-concept steering, LMM alpha slopes: wellbeing=0.19, focus=0.40, interest=0.25, impulsivity=0.067 in LLaMA-3.2-3B · finding · 0.813
Quantifies per-concept effect size of same-concept steering on self-report
Same-concept steering shifts self-report monotonically for all four concepts: LMM alpha slopes 0.067–0.40, all p<10⁻¹² · finding · 0.797
Causal confirmation that coupling between self-report and internal state is genuine; steering toward positive pole increases report
Impulsivity probe: peak Cohen's d=3.60 (layer 13), p=3.58×10⁻¹³ in LLaMA-3.2-3B · finding · 0.794
Strongest probe validation result; highest Cohen's d among the four concepts
Impulsivity concept: Spearman ρ=0.51, isotonic R²=0.31 in LLaMA-3.2-3B (n=400, p<10⁻¹²) · finding · 0.790
Third-strongest pooled introspective coupling in primary model
Interest probe score drifts positively across turns: LMM slope=0.005, p=4.12×10⁻¹⁴ in LLaMA-3.2-3B · finding · 0.790
Demonstrates genuine internal-state dynamics in LLMs during multi-turn conversation
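
One of the related findings above is described as marginal after BH correction (q≈0.066). For readers unfamiliar with the procedure, here is a minimal sketch of Benjamini-Hochberg adjusted p-values (q-values). The input p-values are hypothetical; the size of the test family used in the source is not given on this page, so this sketch does not reproduce the reported q.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values), in the same order as the input."""
    m = len(p_values)
    # Indices of p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of q.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical family of four tests.
p_vals = [0.01, 0.04, 0.03, 0.20]
q_vals = benjamini_hochberg(p_vals)
print([round(q, 4) for q in q_vals])
```

A test counts as significant at false discovery rate 0.05 when its q-value is at most 0.05, so a raw p-value below 0.05 can end up only marginal after adjustment.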