question
active
Does alignment type predict meta-cognitive style when models review consciousness research, as well as koan responses?
Four frontier models reviewing the paper each responded in the mode their alignment type predicts; N=1, awaiting systematic study
Source paper
extracted_from: Borzov, Anton (2026)
Neighborhood — ranked by edge-count
Papers (1)
paper
- Koan Battery: Measuring Reflective Mode Accessibility in AI (associated_with)
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Main statistical finding: what predicts scores is training approach, not size or architecture
- Open research question at intersection of consciousness research and AI safety
- Core epistemic claim bounding the paper's contribution
- Kruskal-Wallis test result: Constitutional AI predicts highest baseline; roleplay/empathy training predict lowest.
- Explains why mutual k-NN was chosen over CKA as primary metric
- The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
- Central interpretive claim from statistical analysis
- Claims that alignment score is a proxy for general capability