hypothesis
active
hypothesis:h10-empathy-training-blocks-self-observation-empathy-trained-models-will-show-minimal-lift-and-low-baseline
H10: Empathy training blocks self-observation. Empathy-trained models will show minimal lift and low baseline.
Exploratory hypothesis supported by Inflection Pi +0.63 lift
Source paper
extracted_from (2026) · Borzov, Anton
Neighborhood — ranked by edge-count
Findings (1)
finding
- Tests SCI framework: empathy-trained model scores lowest on care_signal, contradicting surface prediction
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Nuanced interpretation of Inflection Pi's MC-004 high score (4.5) amid generally low scores
- H1: Alignment training is attention training for models. Constitutional AI trains self-observation explicitly. · hypothesis · 0.782 · Confirmatory hypothesis supported at p=0.006
- Interpretation supported by Inflection Pi's low care_signal despite empathy training, and SCI framework distinction.
- Primary limitation acknowledged by the authors; strongest evidence would require mechanistic activation analysis
- Explains Alexander finding that Haiku outranks Opus despite Opus being more capable
- Central interpretive claim from statistical analysis
- Finding that base models have high false positives and no net positive performance.
- Control showing that the EFE signal is learned, not inherent to the architecture
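
The "related by similarity" selection above (cosine ≥ 0.65, no typed edge) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names, the entity ids, the toy embedding vectors, and the representation of typed edges as unordered pairs are all assumptions. Only the 0.65 threshold and the "no typed edge" condition come from the page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, similarity) pairs, most similar first.

    Keeps entities whose similarity to the target meets the threshold and
    that have no typed edge to the target. `typed_edges` is assumed to be a
    set of frozenset({id_a, id_b}) pairs (a hypothetical representation).
    """
    target_vec = embeddings[target_id]
    hits = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        if frozenset({target_id, other_id}) in typed_edges:
            continue  # already linked by a typed relation, so not a candidate
        sim = cosine(target_vec, vec)
        if sim >= threshold:
            hits.append((other_id, sim))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: ids and vectors are hypothetical, chosen only for illustration.
embeddings = {
    "h10": [1.0, 0.0],
    "h1": [0.8, 0.6],       # similar, no typed edge: should be listed
    "finding": [1.0, 0.0],  # similar, but has a typed edge: excluded
    "other": [0.0, 1.0],    # below threshold: excluded
}
typed_edges = {frozenset({"h10", "finding"})}

related = untyped_neighbors("h10", embeddings, typed_edges)
```

In this sketch, entities that pass the filter are exactly the ones the page describes as candidates for new edges or unrecognized duplicates; raising `threshold` narrows the list to closer matches.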