claim
active
claim:cross-concept-introspection-improvement-is-pair-specific-rather-than-revealing-a-single-globally-tunable-introspection-faculty
Cross-concept introspection improvement is pair-specific rather than revealing a single globally tunable introspection faculty
Most of the 4×4 cross-concept steering matrix shows no significant effect; only two conditions survive.
Source paper
extracted_from · (2026) · Nicolas Martorell · Bruno Bianchi
Neighborhood — ranked by edge-count
Findings (1)
finding
- Second cross-concept introspection improvement that is significant before correction; marginal after Benjamini-Hochberg (BH) correction (q ≈ 0.066)
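The BH correction mentioned in the finding can be sketched as follows. This is a minimal illustration of the standard Benjamini-Hochberg step-up adjustment, not the source paper's analysis code; the function name `benjamini_hochberg` and the p-values in the demo are hypothetical placeholders, not values from the paper.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is capped by the
    # q-values of all larger p-values (this enforces monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical p-values for illustration only (NOT from the source paper).
    hypothetical_p = [0.002, 0.009, 0.04, 0.2, 0.5, 0.8]
    for p, q in zip(hypothetical_p, benjamini_hochberg(hypothetical_p)):
        print(f"p={p:.3f}  q={q:.3f}")
```

Under this procedure, a test whose raw p-value is below 0.05 can end up with a q-value above 0.05 once the number of comparisons is taken into account, which is one way a result becomes "marginal after correction".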
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- There may exist a global introspective faculty or steering direction that improves introspection uniformly across all concepts · hypothesis · 0.802 · Framed as an open problem; current evidence only points to local pair-specific improvement
- Cross-concept steering results; only 2 of 12 non-diagonal cells show significant introspection improvement
- Interpretive claim about the mechanistic substrate of introspection in LLMs
- Alternative interpretations offered for why binary detection fails in Llama 3.1 8B but frontier models claim success
- Interpretation of the observation that the most capable models performed best.
- Critical methodological claim directed at Lindsey 2026 and similar work using binary detection
- Scaling hypothesis for language-based contemplative alignment approaches
- Opus 4.1 is most effective at recognizing injected abstract concepts (e.g., justice, peace) but detects other categories too.
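The "related by similarity" selection described above (cosine ≥ 0.65, no typed edge) could be computed roughly as below. This is a sketch under stated assumptions: the embedding vectors, the `typed_edges` set of unordered pairs, and the function names are all hypothetical, and the page's actual ranking pipeline is not documented here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """List (entity, score) pairs similar to `target` that lack a typed edge to it.

    embeddings:  dict mapping entity id -> vector (hypothetical data layout)
    typed_edges: set of frozenset({a, b}) pairs that already have a typed relation
    """
    results = []
    for name, vector in embeddings.items():
        if name == target:
            continue
        if frozenset((target, name)) in typed_edges:
            continue  # already linked by a typed relation, so not a candidate
        score = cosine(embeddings[target], vector)
        if score >= threshold:
            results.append((name, score))
    # Highest similarity first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Entities returned by such a filter are only candidates: a high score suggests a possible new edge or a duplicate, and each one still needs review.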