concept

active

Criterion 1: Φ estimates must yield >80% 'good' cases (higher score → higher Φ) per ToM task to indicate potential consciousness.

First of three operational criteria for identifying consciousness phenomena in LLM representations.
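
A minimal sketch of how the Criterion 1 check could be computed for one ToM task. The data layout is an assumption, not taken from the source: each case is represented as a hypothetical pair `(phi_lower_score, phi_higher_score)`, and a case counts as 'good' when the higher-scoring response has the higher Φ estimate. The function names and the example numbers are illustrative only.

```python
from typing import Sequence, Tuple

# Hypothetical layout: (Φ estimate for the lower-scoring response,
#                       Φ estimate for the higher-scoring response)
Case = Tuple[float, float]


def good_case_fraction(cases: Sequence[Case]) -> float:
    """Fraction of cases in which higher score goes with higher Φ."""
    if not cases:
        raise ValueError("need at least one case")
    good = sum(1 for phi_low, phi_high in cases if phi_high > phi_low)
    return good / len(cases)


def meets_criterion_1(cases: Sequence[Case], threshold: float = 0.80) -> bool:
    """True only if the share of 'good' cases is strictly above the threshold."""
    return good_case_fraction(cases) > threshold


# Made-up numbers for illustration: 4 of 5 cases are 'good'.
example = [(0.10, 0.30), (0.20, 0.25), (0.40, 0.35), (0.05, 0.50), (0.15, 0.45)]
fraction = good_case_fraction(example)
passed = meets_criterion_1(example)
```

With these made-up values exactly 80% of the cases are 'good', so the check fails: the criterion as stated requires more than 80%, not 80% or more.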

Neighborhood — ranked by edge-count

Concepts (1)

concept

Criterion 2: Statistically significant Φ value differences (p<0.05) across ToM score categories via Wilcoxon test.
associated_with
Second of three operational criteria; requires distributional significance in IIT estimates across performance levels.
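
A rough sketch of the kind of test Criterion 2 calls for, comparing Φ estimates from two ToM score categories. The source says only "Wilcoxon test", so the choice of the two-sample rank-sum variant (rather than the paired signed-rank variant) is an assumption, as are the function name, the normal approximation without tie or continuity correction, and the example values. In practice a library routine such as `scipy.stats.ranksums` or `scipy.stats.wilcoxon` would normally be used instead of hand-rolled code.

```python
from statistics import NormalDist
from typing import Sequence


def rank_sum_p_value(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign 1-based ranks, giving tied values their average rank.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    # Rank sum of the first sample and its mean and spread under the null.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    expected = n1 * (n1 + n2 + 1) / 2
    spread = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - expected) / spread
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up Φ estimates for a low-score and a high-score category.
phi_low_scores = [0.10, 0.12, 0.15, 0.11, 0.13, 0.14, 0.16, 0.09]
phi_high_scores = [0.30, 0.28, 0.35, 0.31, 0.33, 0.29, 0.36, 0.32]
significant = rank_sum_p_value(phi_low_scores, phi_high_scores) < 0.05
```

With more than two score categories, one option is to run the comparison pairwise, although the source does not say how the categories are combined.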

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

If a 'consciousness' phenomenon can be observed from ToM-related RN, higher ToM test scores should yield higher values of μΦmax (IIT 3.0) and/or μΦ (IIT 4.0). · hypothesis · 0.864
Specific hypothesis linking IIT's prediction of high Φ for good performance to the experimental design's scoring structure.
Even if a case successfully meets all three criteria, this does not necessarily indicate that the corresponding sequence of representations is conscious. Rather, it suggests the observation of a potential 'consciousness' phenomenon within these representations — nothing more. · quote · 0.793
Load-bearing epistemic caution the author places on the entire analytical framework.
None of the cases identified under temporal permutation satisfy the Criterion 1 threshold of >80% 'good' cases for any ToM task. · finding · 0.779
Even the rare cases where good > bad do not reach the >80% proportion of 'good' cases required by Criterion 1.
Consciousness in AI is best assessed by drawing on neuroscientific theories of consciousness. · claim · 0.775
Central methodological claim of the paper.
It is basically impossible to determine if a computer program generates conscious experience by merely observing its performance; a test for consciousness must take internal structure into account. · claim · 0.772
Paper's argument against behavioral tests for consciousness, establishing why MCH requires internal analysis.
Can estimates of Φ, the primary metric of IIT, robustly differentiate responses across distinct ToM performance levels? · question · 0.772
Criterion 1 operationalization: requires >80% 'good' cases (higher score → higher Φ) per ToM task.
Criterion 3: IIT estimates must achieve higher mean AUC than Span Representation for ToM score classification. · concept · 0.772
Third of three operational criteria; distinguishes consciousness from inherent LLM representational separations.
Do distinctions in Φ estimates remain robust across diverse ToM stimuli in repeated large-scale trials? · question · 0.770
Criterion 2 operationalization: requires p<0.05 in Wilcoxon tests across score categories.
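
Criterion 3 in the list above compares mean AUC between IIT estimates and Span Representation. The sketch below shows one way such a comparison could look, under assumptions not stated in the source: AUC is computed for a single scalar feature as the probability that a high-score item outranks a low-score item, and the per-task AUC lists are hypothetical inputs. How the source actually derives AUC from Span Representation is not specified here.

```python
from statistics import mean
from typing import Sequence


def auc(high_class: Sequence[float], low_class: Sequence[float]) -> float:
    """AUC as the share of (high, low) pairs ranked correctly; ties count half."""
    if not high_class or not low_class:
        raise ValueError("both classes must be non-empty")
    wins = sum((h > l) + 0.5 * (h == l) for h in high_class for l in low_class)
    return wins / (len(high_class) * len(low_class))


def meets_criterion_3(iit_aucs: Sequence[float], span_aucs: Sequence[float]) -> bool:
    """True if the mean AUC of the IIT estimates exceeds the Span Representation mean."""
    return mean(iit_aucs) > mean(span_aucs)


# Made-up feature values for one task: 3 of 4 pairs are ordered correctly.
single_task_auc = auc([0.80, 0.90], [0.10, 0.85])

# Made-up per-task AUCs; the IIT mean (0.65) is below the baseline mean (0.675).
passed = meets_criterion_3([0.70, 0.60], [0.65, 0.70])
```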