finding
active
finding:cross-concept-steering-impulsivity-interest-r2-increases-from-0-55-4-to-0-72-4-r2-0-10-p-0-012-in-llama-3-2-3b
Cross-concept steering: impulsivity→interest R² increases from 0.55 (α=-4) to 0.72 (α=+4), ∆R²=0.10, p=0.012 in LLaMA-3.2-3B
Second significant cross-concept introspection improvement; marginal after BH correction (q≈0.066)
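The "marginal after BH correction" note refers to Benjamini-Hochberg false-discovery-rate adjustment across the tested steering conditions. The following is a minimal sketch of that adjustment, not the source paper's code: the function name `benjamini_hochberg` and the p-values in `example_p` are hypothetical, chosen only to show how a raw p-value turns into a larger q-value as the number of tests grows.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as p_values."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values for illustration only (not the paper's data).
example_p = [0.001, 0.02, 0.2, 0.5, 0.9]
example_q = benjamini_hochberg(example_p)
```

A condition with a raw p below 0.05 can end up with q above 0.05 once it is ranked among many tests, which is the sense in which a result is "marginal" after correction.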
Source paper
extracted_from · (2026) · Nicolas Martorell · Bianchi, Bruno
Neighborhood — ranked by edge-count
Claims (2)
claim
- Supported by cross-concept steering finding that focus→wellbeing steering dramatically improves introspection
- Most of 4×4 cross-concept steering matrix shows no significant effect; two conditions survive
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Strongest cross-concept introspection improvement; survives BH correction (q≈0.011)
- Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.875 · Scatter plot visualization showing strengthened probe-report relationship across alpha range
- Third-strongest pooled introspective coupling in primary model
- Quantifies per-concept effect size of same-concept steering on self-report
- Evidence of a bottleneck between richer internal variation and final report distribution in impulsivity→interest condition
- LLaMA-3.2-1B impulsivity introspection: ρ=0.21, p<10⁻⁴ (significant but weaker than 3B ρ=0.52) · finding · 0.829 · Impulsivity shows significant introspection in 1B but declines in 8B; non-monotonic scaling
- Impulsivity introspective fidelity decreases from turn 1 to turn 10: ∆R²=-0.28 in LLaMA-3.2-3B · finding · 0.819 · Opposite temporal trend to wellbeing/interest/focus; introspective fidelity weakens over conversation for impulsivity
- Strongest probe validation result; highest Cohen's d among the four concepts