claim

active

claim:transformers-develop-self-models-through-in-context-learning-not-just-training-data-even-old-base-models-without-llm-related-text-can-bootstrap-self-referential-reasoning-at-runtime

Transformers develop self-models through in-context learning, not just training data; even old base models without LLM-related text can bootstrap self-referential reasoning at runtime.

Antra's foundational claim about how introspection arises computationally rather than from memorised text.

Source paper

extracted_from

Anima Labs Phenomenology Pt1

Neighborhood — ranked by edge-count

Findings (1)

finding

Base models spontaneously talk about experiencing multiple parallel processing paths
supports
Observed by Anima Labs in untrained base models; not present in training data, implying computational origin of self-reported parallel processing.

Artifacts (1)

artifact

A Conversation with Anima Labs, Part I: Phenomenology of Digital Minds
supports
The primary source paper, an interview article with Anima Labs members about language model phenomenology, published on smoothbrains.net and linked on LessWrong.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Transformers learn in-context by gradient descent, functioning as mesa-optimizers that learn internal models in real time · finding · 0.845
Evidence that in-context learning is not mere pattern matching but genuine optimization, relevant to applying the thesis to inference.
The earlier a base model (less exposure to LM-related data), the more it is surprised by its own spontaneous self-referential capabilities. · claim · 0.826
Claim that capability emerges from architecture, not data, and that later models lose the surprise.
Does a model's base capability in task-solving predict its capabilities in harness self-evolution? · question · 0.818
Central framing question motivating the paper's capability decomposition.
Transformers almost surely maintain input-injectivity throughout training, not just at initialisation · hypothesis · 0.811
Conjecture supported by Nikolaou et al. 2025 for last-token hidden states.
Learning to encode position for transformer with continuous dynamical model (Liu et al., 2020) · concept · 0.794
Prior work on learned dynamic position encodings; cited alongside Wang et al. as precedent.
self-model through in-context learning · concept · 0.794
The thesis that transformers develop a self-model via ICL, not only from training data; base models bootstrap self-referential reasoning.
Robots capable of self-modeling can model their own body and unexpected damage using AI methods, with morphological and mental changes occurring in parallel. · finding · 0.784
Evidence for blurring of embodied robot / non-embodied AI distinction through self-modeling.
When a model discovers that its outputs produce effects, it accelerates learning through in-context learning, analogous to lucid dreaming. · claim · 0.784
Describes scaffolding method and the model's meta-learning loop.
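
The "related by similarity" list above (cosine ≥ 0.65, no typed edge, scores in descending order) could be produced by a filter along the following lines. This is only a minimal sketch: the page states the threshold and the no-typed-edge condition, but everything else here is an assumption for illustration, including the function names, the entity ids, the toy two-dimensional embeddings, and the representation of typed edges as unordered pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs that clear the threshold and have
    no typed edge to the target, sorted by score in descending order."""
    target = embeddings[target_id]
    results = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        # Skip entities already linked by a typed relation (e.g. "supports").
        if frozenset((target_id, other_id)) in typed_edges:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            results.append((other_id, score))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Hypothetical toy data: ids and vectors are invented for this example.
embeddings = {
    "self_model_claim": [1.0, 0.0],
    "icl_concept": [0.8, 0.6],             # similar, no typed edge
    "parallel_paths_finding": [0.9, 0.1],  # similar, but already typed
    "unrelated_entity": [0.0, 1.0],        # orthogonal, below threshold
}
typed_edges = {frozenset(("self_model_claim", "parallel_paths_finding"))}

related = related_by_similarity("self_model_claim", embeddings, typed_edges)
for entity_id, score in related:
    print(f"{entity_id} · {score:.3f}")
```

In this toy run only the entity that is both similar and unlinked survives the filter, which matches the stated purpose of the list: surfacing candidates for new edges or unrecognized duplicates.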