finding
active
finding:strength-comparison-pair-3-7-with-4-outperforms-pair-3-5-with-2-indicating-graded-sensitivity-to-perturbation-magnitude
Strength comparison pair (3,7) with |Δα|=4 outperforms pair (3,5) with |Δα|=2, indicating graded sensitivity to perturbation magnitude
Shows that introspective accuracy scales with the difference in injection strength, rather than reflecting binary detection
Source paper
extracted_from · (2025) · Ely Hahami · I. N. Sinha · Jain, Lavik · Kaplan, Josh +1
Neighborhood — ranked by edge-count
Claims (1)
claim
- Primary positive claim of the paper, grounded in strength comparison and localization results
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Strength comparison accuracy reaches 73% at layer 3 for injection pair (2,6) vs. 50% chance · finding · 0.801 · Secondary positive result for strength comparison showing graded sensitivity to perturbation magnitude
- Strength comparison accuracy averages 47% at layers 15-30, indistinguishable from 50% chance · finding · 0.764 · Shows collapse of introspective capability at later layers in the strength comparison task
- Core result of Experiment 3: cross-model semantic convergence under self-referential processing
- E3 finding distinguishing the two geometry summaries; breadth less predictive than peak height
- Shows trait space has more cross-model consistency than role space beyond PC1
- The profound principle that underlies all living structure; symmetry as the mathematical trace of necessity.
- Claim about broader applicability of the scaling argument
- Impulsivity→interest: ρ increases from 0.70 (α=-4) to 0.83 (α=+4); R² from 0.46 to 0.69 in LLaMA-3.2-3B · finding · 0.745 · Scatter plot visualization showing strengthened probe-report relationship across alpha range