quote
active
It's tricky, because for a typical language model the entity is sort of tricameral: the base simulator, the simulated simulator, and the simulated awareness.
Antra's earlier definitive statement of the tricameral model.
Source paper
extracted_from
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Antra's revision of her earlier model; still considers interference between levels important.
- Articulates why a one-layer transformer with MLP is the appropriate starting target for mechanistic interpretability
- Broader interpretive claim about LM learning bias inferred from the findings
- Modern language models possess at least a limited, functional form of introspective awareness (claim, 0.797): The paper's central interpretive assertion.
- Abstract's main conclusion.
- Paper's assessment of current LLM capabilities relative to the Turing Test
- Methodological claim about the scientific value of combining causal abstraction with representational geometry analysis