Kruskal-Wallis Test

Non-parametric statistical test used to determine which factors predict koan battery scores across 28 models.
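
As a rough illustration of how such a test compares groups, the sketch below computes the tie-corrected Kruskal-Wallis H statistic in plain Python. The function name `kruskal_wallis_h` and the toy scores are hypothetical and are not taken from the source study. A p-value would come from comparing H against a chi-square distribution with k - 1 degrees of freedom (k = number of groups); in practice a library routine such as `scipy.stats.kruskal` reports both.

```python
from itertools import chain


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for 2+ groups."""
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")

    # H = 12 / (n(n+1)) * sum(R_g^2 / n_g) - 3(n+1), then divide by the tie correction.
    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / correction


# Hypothetical scores for three levels of one candidate factor.
low, medium, high = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h_stat = kruskal_wallis_h(low, medium, high)
```

A larger H means the groups' rank sums differ more than chance alone would suggest. Running this once per candidate factor is one way a study could screen which factors relate to the scores, though the source does not describe its exact procedure.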

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
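
A minimal sketch of the selection rule described above, assuming each entity has an embedding vector and that typed relations are available as a set of neighbor names. The names `cosine`, `similar_untyped`, and the toy vectors are hypothetical; the actual embedding model and storage behind this listing are not stated in the source.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_untyped(target, embeddings, typed_neighbors, threshold=0.65):
    """List (name, score) pairs at or above the threshold that lack a typed edge."""
    hits = []
    for name, vec in embeddings.items():
        # Skip the entity itself and anything already linked by a typed relation.
        if name == target or name in typed_neighbors:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first, matching the ordering of the listing.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy two-dimensional embeddings, for illustration only.
toy = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0], "d": [1.0, 1.0]}
candidates = similar_untyped("a", toy, typed_neighbors={"b"})
```

Here `b` is excluded because it already has a typed edge, `c` falls below the threshold, and only `d` remains as a candidate.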

Turing Test · framework · 0.716
A test of intelligence via linguistic performance; deemed insufficient for sentience assessment by Levin.
Wilcoxon Test · method · 0.711
Non-parametric statistical test used to assess significance of Φ differences between ToM score categories.
Behavioural tests for consciousness · method · 0.697
Tests like Turing test, Artificial Consciousness Test; argued to be unreliable for AI due to mimicry.
Wilcoxon Signed-Rank Test · method · 0.675
Statistical test used to confirm that EFE after sticker removal is significantly lower than before.
There exists no viable behavioral test for consciousness analogous to the Turing Test for intelligence, because consciousness is a particular internal way to achieve performance, not externally visible performance itself. · claim · 0.672
The paper identifies this as a research gap requiring internal analysis methods rather than behavioral benchmarks.
AI Consciousness Test (ACT) · method · 0.671
Proposed test for AI consciousness by Schneider and Turner; uses verbal outputs.
strawberry test · concept · 0.669
Eliezer Yudkowsky's benchmark for LLM awareness, mentioned as a test that collapsed-awareness models might fail.
Verbal reports (the Turing Test) and homology to human brains are utterly inadequate criteria for assessing the status of novel, unconventional agents that offer no familiar touchstone of phylogeny or anatomy. · claim · 0.665
Core claim that standard criteria fail for novel agents.