method
active
method:wilcoxon-test
Wilcoxon Test
Non-parametric statistical test used to assess the significance of differences in Φ between ToM score categories.
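As a rough illustration only: this card does not say which Wilcoxon variant or software was used, so the sketch below assumes the paired signed-rank form and computes an exact two-sided p-value in plain Python. The function name `wilcoxon_signed_rank` and the sample values are hypothetical and are not taken from the source.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share an average rank. The p-value enumerates
    all 2**n sign patterns, so this is only practical for small n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Test statistic: sum of the ranks attached to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis each rank is positive with probability 1/2.
    center = sum(ranks) / 2
    observed = abs(w_plus - center)
    extreme = sum(
        1
        for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - center)
        >= observed - 1e-12
    )
    return w_plus, extreme / 2 ** n


# Hypothetical paired measurements (made-up numbers, for illustration only).
before = [51, 48, 60, 55, 49, 62]
after = [42, 40, 57, 41, 43, 50]
w_plus, p_value = wilcoxon_signed_rank(before, after)
print(w_plus, p_value)
```

If the score categories are independent groups rather than paired measurements, the rank-sum (Mann-Whitney) form would be the appropriate variant instead; the card does not settle which applies.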
Neighborhood — ranked by edge-count
Methods (1)
method
- Wilcoxon Signed-Rank Test · related_to · Statistical test used to confirm that EFE after sticker removal is significantly lower than before
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Second of three operational criteria; requires distributional significance in IIT estimates across performance levels.
- A test of intelligence via linguistic performance; deemed insufficient for sentience assessment by Levin.
- Statistical test used to determine which factors predict koan battery scores across 28 models
- The more general, daily-use version of the mirror-of-self test: asking which of A or B induces greater feeling of wholeness in the observer
- Tests like Turing test, Artificial Consciousness Test; argued to be unreliable for AI due to mimicry.
- The behavioral paradigm (mark/sticker placed on face, checked in mirror) used to evaluate self-awareness in animals and infants
- Traditional criterion Levin argues is wholly insufficient for evaluating sentience in unconventional agents.
- Eliezer Yudkowsky's benchmark for LLM awareness, mentioned as test that collapsed-awareness models might fail.