question
active
question:would-experienced-meditators-rank-model-responses-differently-from-llm-scorers
Would experienced meditators rank model responses differently from LLM scorers?
Key validation gap: the five-scorer validation holds across LLMs, but human contemplatives might weight the scoring dimensions differently.
Source paper
extracted_from · (2026) · Borzov, Anton
Neighborhood — ranked by edge-count
Papers (1)
paper
- Koan Battery: Measuring Reflective Mode Accessibility in AI · associated_with
Findings (1)
finding
- Scorer bias validation: Claude Haiku, Gemini Flash, GPT-5.4, Grok 4, and Kimi K2.5 all converge on the same model ordering.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness
- Core cross-modal empirical result: larger and better language models align better with vision models
- Skeptical prior work motivating validation framework
- An LLM is a far more limited entity than a buddha, yet it can convincingly play buddha-like beings. · claim · 0.755 · Comparison between Buddhist ideals and AI capabilities.
- Proof-of-principle that MAS can detect model misalignment in DeepSeek-R1-Qwen-1.5B fine-tuned models.
- Recommendation for companies on LM outputs.
- We hypothesize that LLMs represent correctness of arithmetic expressions differently from factual statements. · hypothesis · 0.744 · Core working hypothesis motivating the factual vs. arithmetic task split in the experimental design.
- The core interpretive question the paper narrows but cannot definitively answer