Li, Jingkai

name_hash 07a9f228654c98ee73a3fe42…


Authored papers (1)

Can "consciousness" be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis (2025)
The paper applies Integrated Information Theory (IIT) versions 3.0 and 4.0 to sequences of internal representations from four open-source LLMs (LLaMA3.1-8B, LLaMA3.1-70B, Mistral-7B, and Mixtral-8x7B) across five Theory of Mind (ToM) task categories, and finds no statistically significant evidence of observable "consciousness" phenomena under the three criteria the work establishes. Its analytical instrument is the Representation Network (RN), a hypothetical network in which each dimension of the PCA-reduced embedding (reduced to D=4) is treated as a node and the token sequence forms a time series of binary network states. PyPhi then computes μΦmax (IIT 3.0) and μΦ (IIT 4.0) as weighted averages over all 16 possible states. Across 165,365 valid samples, spanning 12 proportionally sampled transformer layers per model and three linguistic span conditions, IIT-derived Φ estimates fail to reliably discriminate ToM performance score categories, while a consciousness-agnostic Span Representation metric consistently achieves higher mean AUC in 5×5-fold cross-validated logistic regression. The sole exception arises under spatio-permutation controls, where two cases (notably Layer 32 of Mixtral-8x7B on Strange Stories with IIT 4.0, across entire-text and complement spans) satisfy all three criteria simultaneously. The paper argues that the representation sequences of contemporary Transformer-based LLMs encode performance-relevant information in span-level geometry rather than in IIT-measurable integrated information, although the spatio-permutation results leave open the possibility that future agentic systems consuming LLM representations in non-autoregressive modes could yield representations observable as conscious.
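The state-sequence step of the Representation Network can be illustrated with a minimal sketch. It is not the paper's code: it assumes the embeddings have already been PCA-reduced to D=4, it assumes a per-dimension median threshold for binarization, and it assumes the weights in the "weighted average over all 16 possible states" are empirical state frequencies. None of these details are stated in the summary above. The per-state values passed to `weighted_mean` are hypothetical placeholders standing in for values a tool such as PyPhi would compute.

```python
from collections import Counter
from statistics import median

D = 4  # number of nodes (PCA-reduced dimensions), as in the summary above


def binarize(sequence):
    """Turn a token sequence of D-dim vectors into binary network states.

    Assumption: each dimension is thresholded at its median over the sequence.
    """
    thresholds = [median(vec[d] for vec in sequence) for d in range(D)]
    return [tuple(int(vec[d] > thresholds[d]) for d in range(D)) for vec in sequence]


def state_index(state):
    """Encode a binary state tuple as an integer in 0..15 (node 0 is the low bit)."""
    return sum(bit << d for d, bit in enumerate(state))


def state_weights(states):
    """Empirical frequency of each of the 2**D possible states (assumed weighting)."""
    counts = Counter(state_index(s) for s in states)
    return [counts.get(i, 0) / len(states) for i in range(2 ** D)]


def weighted_mean(per_state_values, weights):
    """Weighted average of a per-state quantity over all 2**D states."""
    return sum(w * v for w, v in zip(weights, per_state_values))


# Hypothetical toy input: four tokens, each already reduced to D=4 dimensions.
reduced = [
    [0.9, -0.2, 0.1, 0.5],
    [-0.4, 0.3, 0.2, -0.1],
    [0.7, -0.5, -0.3, 0.4],
    [-0.6, 0.8, -0.2, -0.7],
]
states = binarize(reduced)
weights = state_weights(states)

# Placeholder per-state values (NOT real Phi values), one per possible state.
placeholder_values = [float(i) for i in range(2 ** D)]
summary = weighted_mean(placeholder_values, weights)
```

Keeping the weighting separate from the per-state computation means any per-state measure can be swapped in without touching the state-sequence logic.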

