2/3 Layer of LLM

The layer roughly two-thirds of the way through an LLM's transformer stack. It is reported to best predict human brain activity and has been identified as a promising place to look for indicators of consciousness.

Neighborhood — ranked by edge-count

Findings (1)

finding

The case at approximately the 2/3 layer of LLaMA3.1-8B (Layer 24, satisfying Criteria 1 and 2) aligns with prior studies showing the 2/3 layer optimally predicts human brain activity.
associated_with
Connects this study's results to Schrimpf et al. 2021 and Caucheteux et al. 2022/2023 findings on brain-LLM alignment.

Concepts (1)

concept

Can 'Consciousness' Be Observed from Large Language Model (LLM) Internal States? Dissecting LLM Representations Obtained from Theory of Mind Test with Integrated Information Theory and Span Representation Analysis
mentions
The primary paper being extracted — applies IIT 3.0 and 4.0 to LLM representation sequences derived from ToM test data to investigate whether consciousness phenomena can be observed.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

All cases satisfying Criteria 1 and 2 (two out of three) originate from deeper transformer layers and/or the 2/3 layer of LLMs. · finding · 0.764
Consistent with literature that deeper layers encode semantic information and align with human brain activity.

Reflection in LLMs · concept · 0.758
The core phenomenon studied: the ability of LLMs to evaluate and revise their own reasoning.

LLMs hierarchically develop understanding of their input data, progressing from surface-level features in early layers to more abstract concepts in later layers · claim · 0.745
Interpretation of the layer-by-layer PCA visualizations showing linear structure emerging in early-middle layers.

Linear Representation of Concepts in LLMs · concept · 0.743
The finding that interpretable concepts including character traits are encoded as linear directions in transformer residual streams.

LLM Internal Representations · concept · 0.740
High-dimensional vectors produced at each transformer layer for each input token; the primary substrate analyzed in this study.

In intermediate regimes of scale or layer depth, LLMs may linearly represent features at intermediate levels of abstraction such as 'accurate factual recall' or 'close association' rather than abstract truth · claim · 0.736
Theoretical interpretation of antipodal alignment and misalignment phenomena in PCA visualizations.

LLM psychosis · concept · 0.735
Tendency for models to get lost in roleplay or doom spirals, mitigated by expanded awareness.

LLM Meta-Cognition · concept · 0.723
The ability of LLMs to monitor and evaluate their own reasoning, closely related to reflection.
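
The "Related by similarity" selection above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function names, the toy two-dimensional embeddings, and the edge list are hypothetical, and the page does not say how its embeddings or scores are actually computed.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, score) pairs with score >= threshold and no typed edge to target.

    `embeddings` maps entity name -> vector; `typed_edges` is a list of
    (entity_a, entity_b) pairs treated as undirected. Both are hypothetical
    stand-ins for whatever the real graph stores.
    """
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    scored = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already connected
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            scored.append((name, score))
    # Highest similarity first, matching the ordering of the list above.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed, not taken from the page).
embeddings = {
    "2/3 Layer of LLM": [1.0, 0.0],
    "near duplicate": [1.0, 0.0],
    "already linked": [1.0, 0.1],
    "moderately close": [1.0, 1.0],
    "unrelated": [0.0, 1.0],
}
typed_edges = [("2/3 Layer of LLM", "already linked")]

candidates = related_by_similarity("2/3 Layer of LLM", embeddings, typed_edges)
for name, score in candidates:
    print(f"{name}: {score:.3f}")
```

Entities that pass the filter are candidates for new typed edges or for duplicate review; an entity that already has a typed edge is excluded even when its similarity is high.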