finding

active

finding:several-mixtral-8x7b-samples-could-not-be-initialized-as-valid-networks-using-pyphi-under-iit-4-0-and-were-excluded

Several Mixtral-8x7B samples could not be initialized as valid networks using PyPhi under IIT 4.0 and were excluded.

Methodological limitation that disproportionately affects the largest MoE model and constrains the generalizability of the IIT 4.0 results.
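The exclusion step described above can be sketched as a filter that tries to build a network for each sample and tallies failures per model. This is a minimal illustration, not the paper's pipeline: the sample layout (`model`, `tpm` keys), the function `filter_valid_samples`, and the stand-in `toy_builder` are all hypothetical. With PyPhi installed one would presumably pass `pyphi.Network` as the builder; which exceptions it raises for an invalid network under IIT 4.0 is an assumption here.

```python
from collections import Counter


def filter_valid_samples(samples, build_network):
    """Keep samples whose network builds; count exclusions per model.

    `samples` is an iterable of dicts with hypothetical keys "model" and "tpm".
    `build_network` is any callable that raises on an invalid TPM
    (assumed to raise ValueError or TypeError).
    """
    kept = []
    excluded = Counter()
    for sample in samples:
        try:
            network = build_network(sample["tpm"])
        except (ValueError, TypeError):
            # The sample cannot be initialized as a valid network: exclude it.
            excluded[sample["model"]] += 1
            continue
        kept.append((sample, network))
    return kept, excluded


def toy_builder(tpm):
    """Hypothetical stand-in for a real network constructor.

    It only checks that every entry is a probability in [0, 1].
    """
    for row in tpm:
        for p in row:
            if not 0.0 <= p <= 1.0:
                raise ValueError("TPM entries must be probabilities in [0, 1]")
    return {"tpm": tpm}
```

Reporting the per-model counter alongside the results makes it visible when exclusions concentrate in one model, which is the concern this finding raises for Mixtral-8x7B.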

Source paper

extracted_from

Can "consciousness" be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis

(2025) · Li, Jingkai

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Under spatio permutation controls, two cases (Layer 32 of Mixtral-8x7B on Strange Stories, IIT 4.0, Linguistic Spans: Entire and Complement) satisfy all three criteria. · finding · 0.762
Contrasts with temporal permutation results; constitutes the most suggestive evidence of potential consciousness phenomena in LLM representations.
8-layer ϕ_nonlin achieves near-perfect IIA on Pythia-410m at all training steps including random initialisation on IOI task · finding · 0.750
Training progression result showing non-linear maps are uncorrelated with genuine task learning
Layer 24 (indexed at 8) of LLaMA3.1-8B on Hinting satisfies Criteria 1 and 2 under both IIT 3.0 and IIT 4.0 (temporal permutation). · finding · 0.735
One of the most promising cases; approximately corresponds to the 2/3 layer of LLaMA3.1-8B.
DAS on randomly initialized small networks (|N|=16) achieves only 0.50 IIA (chance), cannot construct new behaviors · finding · 0.734
Demonstrates DAS cannot manufacture behaviors from random structure in appropriately sized networks.
Layer 29 (indexed at 10) of LLaMA3.1-8B on Strange Stories (2 scores) satisfies Criteria 1 and 2 under IIT 4.0 (temporal permutation). · finding · 0.722
Third promising case from temporal permutation analysis.
DAS on oversized randomly initialized network (|N|=4096 for 16-dim input) achieves 0.64 IIA by searching random structure · finding · 0.721
Shows that overly large hidden dimensions allow DAS to find random causal structures; calibration check.
Mixtral-8x7B · concept · 0.720
One of four LLMs selected; Mixture-of-Experts model; had substantial sample loss under IIT 4.0 due to PyPhi network initialization issues.
F0-trained probes in layers 4-10 show inverted separation on F1 (AUROC ≈ 0), systematically misclassifying true statements as false. · finding · 0.718
Demonstrates that early-layer probes capture sentence polarity rather than truth.
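
The selection rule behind this list (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is an illustrative reading of the criterion, not the system's actual implementation: the function names, the embedding dictionary, and the representation of typed edges as unordered pairs are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """List entities similar to `target_id` that share no typed edge with it.

    `embeddings` maps entity id -> vector (hypothetical layout).
    `typed_edges` is a set of frozenset({id_a, id_b}) pairs (hypothetical layout).
    Returns (entity_id, score) pairs sorted by descending similarity.
    """
    target = embeddings[target_id]
    hits = []
    for other_id, vector in embeddings.items():
        if other_id == target_id:
            continue
        if frozenset((target_id, other_id)) in typed_edges:
            continue  # already linked by a typed relation
        score = cosine(target, vector)
        if score >= threshold:
            hits.append((other_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Entries returned by such a filter are only candidates: a high score suggests a possible missing edge or duplicate, and each one still needs manual review.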