finding
active
Impulsivity→interest steering: probe entropy increases (LMM slope=0.024, p=2.30×10⁻⁴) but report entropy does not (p=0.11)
Evidence of a bottleneck between richer internal variation and the final report distribution in the impulsivity→interest condition
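The divergence described above can be illustrated with a minimal sketch: compute the Shannon entropy of binned probe scores and binned self-reports at each steering strength α, then compare the two entropy-versus-α slopes. Everything here is an assumption for illustration: the function names `shannon_entropy` and `ols_slope` are hypothetical, the scores are synthetic toy values, the binning scheme is arbitrary, and an ordinary least-squares slope stands in for the linear mixed model (LMM) reported in the finding.

```python
import math
from collections import Counter

def shannon_entropy(values, bins=5, lo=0.0, hi=1.0):
    """Shannon entropy (in nats) of values histogrammed into equal-width bins."""
    width = (hi - lo) / bins
    # Clamp bin indices so values at or beyond the edges fall into the end bins.
    idx = (max(0, min(int((v - lo) / width), bins - 1)) for v in values)
    counts = Counter(idx)
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (a simplification, not an LMM)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic toy data (NOT from the source): probe scores occupy more bins as
# alpha grows, while self-reports keep the same distribution at every alpha.
alphas = [-4, -2, 0, 2, 4]
probe_scores = {
    -4: [0.5, 0.5, 0.5, 0.5, 0.5],
    -2: [0.3, 0.5, 0.5, 0.5, 0.5],
    0: [0.3, 0.5, 0.5, 0.5, 0.7],
    2: [0.1, 0.3, 0.5, 0.5, 0.7],
    4: [0.1, 0.3, 0.5, 0.7, 0.9],
}
report_scores = {a: [0.3, 0.5, 0.5, 0.5, 0.7] for a in alphas}

probe_slope = ols_slope(alphas, [shannon_entropy(probe_scores[a]) for a in alphas])
report_slope = ols_slope(alphas, [shannon_entropy(report_scores[a]) for a in alphas])
print(f"probe entropy slope:  {probe_slope:.3f}")
print(f"report entropy slope: {report_slope:.3f}")
```

A positive probe slope alongside a flat report slope is the pattern the finding interprets as a bottleneck; a faithful reanalysis would need the original per-conversation data and a mixed model with appropriate random effects.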
Source paper
extracted_from (2026) · Nicolas Martorell · Bruno Bianchi
Neighborhood — ranked by edge-count
Claims (1)
claim
- Conceptual distinction motivated by entropy analyses showing probe and report entropy can diverge under steering
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Evidence that improved introspection in focus→wellbeing arises from enriched internal state and report channels simultaneously
- Second significant cross-concept introspection improvement; marginal after BH correction (q≈0.066)
- Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.829 · Scatter plot visualization showing strengthened probe-report relationship across alpha range
- Quantifies per-concept effect size of same-concept steering on self-report
- Causal confirmation that coupling between self-report and internal state is genuine; steering toward positive pole increases report
- Strongest probe validation result; highest Cohen's d among the four concepts
- Third-strongest pooled introspective coupling in primary model
- Interest probe score drifts positively across turns: LMM slope=0.005, p=4.12×10⁻¹⁴ in LLaMA-3.2-3B · finding · 0.790 · Demonstrates genuine internal-state dynamics in LLMs during multi-turn conversation