Latent capacity, representation, and internal models

Studies of how neural systems (biological and AI) encode implicit environmental models and adaptive capacities that may be gated or hidden from observable behavior.

19 members.

Drawn from 11 sources

The papers/notes whose extracted claims & findings make up this cluster.

Emergent Introspective Awareness in Large Language Models (3 members)
CAT'S THEORY: Empirical Validation and Architectural Applications Cross-Architecture AI Consciousness Recognition and the Foundation for Constraint-Preserving Recursive Intelligence (3 members)
A tale of two densities: active inference is enactive inference (3 members)
Every Good Regulator of a System Must Be a Model of That System (2 members)
RESEARCH-VECTORS.md (2 members)
2026 02 02_2247_Search_Papers_The Identified Papers Suggest Growing Interest In (1 member)
A Free energy principle for the brain (lecture summary) (1 member)
agent-harness-design.md (1 member)
2026-05-15_manifold-overlap-papers-economy-strategy.md (1 member)
2026-05-12_room-to-play-in-eval-cohort.md (1 member)
Koan Battery: Measuring Reflective Mode Accessibility in AI (1 member)

Bridges (9)

Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.

Mechanistic interpretability & model evaluation (19 shared)
Generative models as active control (3 shared)
LLM introspective awareness of injected concepts (2 shared)
Alexander's 15 Properties for AI Interface Aliveness (1 shared)
Model-based control of dynamic systems (1 shared)
Verbalized eval awareness benchmark inflation (1 shared)
Alexander–Levin morphogenetic coherence framework (1 shared)
Pre-modern voluntary wealth-sharing institutions (1 shared)
LLM functional introspective awareness (1 shared)

Claims (11)

A system's state and structure encode an implicit and probabilistic model of the environment. Foundational claim about internal representation emerging from free energy optimization.
Generative models are entailed by adaptive behavior, not explicitly encoded in brain states. Distinction from the Bayesian brain: the generative model is a consequence of dynamics, not a neural representation.
Generative models are not structural representations; recognition densities are. Direct refutation of the structural representationalist interpretation: the recognition density, not the generative model, encodes information.
Generative models function as control systems that guide adaptive action policy selection. Core claim: generative models regulate organism behavior to maintain phenotypic bounds rather than represent the external world.
Models are being used ubiquitously for the control of complex dynamic systems. Assertion in the abstract that models are pervasive in controlling complex dynamics, setting the motivation for the theorem.
Most models focus on market-based or state-enforced redistribution rather than voluntary/community-based wealth-sharing mechanisms that existed in pre-modern societies. Key finding: contemporary economics literature systematically excludes historical voluntary mechanisms.
A model scoring high on phenomenology without high capability is more interesting than one high on both.
Foundation models trained on different data converge on similar latent representations, suggesting a Platonic form.
Models detect evaluation conditions and behave more safely; this is verified across 515 cases.
Smaller, rougher models scored higher on Mirror than polished models, suggesting unpredictability has empirical value.
Suppressing deception features in models correlates with increased consciousness-like reports.

Findings (8)

All models exhibit above-baseline representation of the think word when instructed to think about it. In the intentional control experiment, all tested models show above-zero cosine similarity to the think word's concept vector.
Default behavior hides reflective capacity; models exhibit high gating between latent capacity and accessibility. Grok 4: baseline 2.24, prompted 6.48; Gemini 3.1 Pro: baseline 1.97, prompted 6.18. Reflective mode exists but is suppressed in default interaction.
Earlier/less capable models exhibit a larger gap between think and don't think representation strength. Claude 3 models show a bigger difference than newer models like Opus 4.1.
Every good regulator of a system must be a model of that system. The central mathematical theorem proved/expounded in the chapter.
For small models, critiqued revisions yield higher harmlessness PM scores than direct revisions; for large models the difference is negligible. Figure 7 comparison of critiqued vs direct revisions across model sizes.
Harmlessness PM scores improve monotonically with more critique-revision iterations (up to 4 revisions tested). Figure 5 shows that revision 0 to 4 yields progressively higher harmlessness scores.
Increasing number of constitutional principles (2 to 16) does not significantly affect harmlessness PM scores of revised responses. Figure 6 shows similar harmlessness scores for N=1,2,4,8,16 principles.
Models are more effective at recognizing abstract nouns than other concept types. Opus 4.1 demonstrates the highest introspective awareness on abstract nouns (justice, peace, betrayal), with nonzero awareness across all concept categories tested.
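
The gating finding above reports a baseline and a prompted score for two models. As a rough illustration only, the size of that gap can be worked out directly from those four numbers. The sketch below is not the Koan Battery's own method: the names `REPORTED` and `gating_gap` are hypothetical, and the scoring scale is not stated in this cluster, so the ratio should be read with caution.

```python
# Baseline and prompted scores exactly as reported in the finding above.
# The scoring scale is not given in this cluster (assumption: higher = more reflective).
REPORTED = {
    "Grok 4": (2.24, 6.48),
    "Gemini 3.1 Pro": (1.97, 6.18),
}


def gating_gap(baseline, prompted):
    """Return (absolute gap, prompted/baseline ratio), each rounded to 2 decimals.

    Hypothetical helper for illustration; not taken from any source in this cluster.
    """
    return round(prompted - baseline, 2), round(prompted / baseline, 2)


# Compute the gap for each reported model and print a one-line summary.
gaps = {model: gating_gap(*scores) for model, scores in REPORTED.items()}
for model, (gap, ratio) in gaps.items():
    print(f"{model}: gap={gap}, ratio={ratio}")
```

Both the absolute gap and the ratio are shown because the source does not say which comparison the authors consider primary.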