finding
active
Cross-concept steering: focus→wellbeing R² increases from 0.30 (α=-4) to 0.76 (α=+4), ∆R²=0.30, p<0.001 in LLaMA-3.2-3B
Strongest cross-concept introspection improvement; survives BH correction (q≈0.011)
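The BH (Benjamini-Hochberg) correction mentioned above converts raw p-values into q-values that control the false discovery rate across a family of tests. The following is a minimal sketch of the standard step-up procedure, not the paper's own code. The function name `benjamini_hochberg` and the p-values in `p_hypothetical` are illustrative assumptions, not values taken from the paper; the family size of 12 is also an assumption, based on the 12 non-diagonal cells mentioned in the claims.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value
    # is the minimum of p * m / rank over all equal-or-larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# HYPOTHETICAL p-values for a family of 12 tests (not from the paper).
p_hypothetical = [0.0009, 0.011, 0.20, 0.35, 0.41, 0.48,
                  0.55, 0.62, 0.70, 0.78, 0.85, 0.93]

q = benjamini_hochberg(p_hypothetical)
# Indices of tests that remain significant at a 5% false discovery rate.
significant = [i for i, qv in enumerate(q) if qv < 0.05]
```

Under these assumed inputs, a test can be significant before correction yet only marginal afterwards, which is the distinction the card draws between a finding that "survives" BH correction and one that is "marginal" after it.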
Source paper
extracted_from · (2026) · Nicolas Martorell · Bruno Bianchi
Neighborhood — ranked by edge-count
Claims (2)
claim
- Supported by cross-concept steering finding that focus→wellbeing steering dramatically improves introspection
- Cross-concept steering results; only 2 of 12 non-diagonal cells show significant introspection improvement
Hypotheses (1)
hypothesis
- There may exist a global introspective faculty or steering direction that improves introspection uniformly across all concepts · associated_with · Framed as an open problem; current evidence only points to local pair-specific improvement
Questions (1)
question
- Secondary research question addressed through cross-concept steering experiments
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Second significant cross-concept introspection improvement; marginal after BH correction (q≈0.066)
- Focus→wellbeing: ρ increases from 0.42 (α=-4) to 0.85 (α=+4); R² from 0.34 to 0.75 in LLaMA-3.2-3B · finding · 0.870 · Scatter plot visualization of the dramatic tightening of probe-report relationship at extreme steering settings
- Quantifies per-concept effect size of same-concept steering on self-report
- Evidence that improved introspection in focus→wellbeing arises from enriched internal state and report channels simultaneously
- Weakest but still significant pooled introspective coupling in primary model
- Illustrative finding that ESR mitigates but does not fully eliminate steering influence
- Second-strongest pooled introspective coupling in primary model
- Characterizes the trait content of the Assistant Axis in pre-trained models