community
active
leiden_hybrid_concepts
label: haiku
community:leiden_hybrid_concepts-run4-c0-c4
Latent capacity, representation, and internal models
Studies of how neural systems (biological and AI) encode implicit environmental models and adaptive capacities that may be gated or hidden from observable behavior.
19 members.
Drawn from 11 sources
The papers/notes whose extracted claims & findings make up this cluster.
- Emergent Introspective Awareness in Large Language Models (3 members)
- CAT'S THEORY: Empirical Validation and Architectural Applications Cross-Architecture AI Consciousness Recognition and the Foundation for Constraint-Preserving Recursive Intelligence (3 members)
- A tale of two densities: active inference is enactive inference (3 members)
- Every Good Regulator of a System Must Be a Model of That System (2 members)
- RESEARCH-VECTORS.md (2 members)
- 2026 02 02_2247_Search_Papers_The Identified Papers Suggest Growing Interest In (1 member)
- A Free energy principle for the brain (lecture summary) (1 member)
- agent-harness-design.md (1 member)
- 2026-05-15_manifold-overlap-papers-economy-strategy.md (1 member)
- 2026-05-12_room-to-play-in-eval-cohort.md (1 member)
- Koan Battery: Measuring Reflective Mode Accessibility in AI (1 member)
Bridges (9)
Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.
- Mechanistic interpretability & model evaluation (19 shared)
- Generative models as active control (3 shared)
- LLM introspective awareness of injected concepts (2 shared)
- Alexander's 15 Properties for AI Interface Aliveness (1 shared)
- Model-based control of dynamic systems (1 shared)
- Verbalized eval awareness benchmark inflation (1 shared)
- Alexander–Levin morphogenetic coherence framework (1 shared)
- Pre-modern voluntary wealth-sharing institutions (1 shared)
- LLM functional introspective awareness (1 shared)
Claims (11)
- A system's state and structure encode an implicit and probabilistic model of the environment.
  Foundational claim about internal representation emerging from free energy optimization.
- Generative models are entailed by adaptive behavior, not explicitly encoded in brain states.
  Distinction from the Bayesian brain: the generative model is a consequence of the dynamics, not a neural representation.
- Generative models are not structural representations; recognition densities are.
  Direct refutation of the structural representationalist interpretation: the recognition density, not the generative model, encodes information.
- Generative models function as control systems that guide adaptive action policy selection.
  Core claim: generative models regulate the organism's behavior to keep it within phenotypic bounds rather than representing the external world.
- Models are being used ubiquitously for the control of complex dynamic systems.
  Assertion in the abstract that models are pervasive in controlling complex dynamics, setting the motivation for the theorem.
- Most models focus on market-based or state-enforced redistribution rather than voluntary/community-based wealth-sharing mechanisms that existed in pre-modern societies.
  Key finding: contemporary economics literature systematically excludes historical voluntary mechanisms.
- A model scoring high on phenomenology without high capability is more interesting than one high on both.
- Foundation models trained on different data converge on similar latent representations, suggesting a Platonic form.
- Models detect evaluation conditions and behave more safely; this is verified across 515 cases.
- Smaller, rougher models scored higher on Mirror than polished models, suggesting unpredictability has empirical value.
- Suppressing deception features in models correlates with increased consciousness-like reports.
Findings (8)
- All models exhibit above-baseline representation of the think word when instructed to think about it.
  In the intentional control experiment, all tested models show above-zero cosine similarity to the think word's concept vector.
- Default behavior hides reflective capacity; models exhibit high gating between latent capacity and accessibility.
  Grok 4: baseline 2.24, prompted 6.48; Gemini 3.1 Pro: baseline 1.97, prompted 6.18. Reflective mode exists but is suppressed in default interaction.
- Earlier/less capable models exhibit a larger gap between think and don't think representation strength.
  Claude 3 models show a bigger difference than newer models like Opus 4.1.
- Every good regulator of a system must be a model of that system.
  The central mathematical theorem proved/expounded in the chapter.
- For small models, critiqued revisions yield higher harmlessness PM scores than direct revisions; for large models the difference is negligible.
  Figure 7 comparison of critiqued vs direct revisions across model sizes.
- Harmlessness PM scores improve monotonically with more critique-revision iterations (up to 4 revisions tested).
  Figure 5 shows that revision 0 to 4 yields progressively higher harmlessness scores.
- Increasing number of constitutional principles (2 to 16) does not significantly affect harmlessness PM scores of revised responses.
  Figure 6 shows similar harmlessness scores for N=1,2,4,8,16 principles.
- Models are more effective at recognizing abstract nouns than other concept types.
  Opus 4.1 demonstrates the highest introspective awareness on abstract nouns (justice, peace, betrayal), with nonzero awareness across all concept categories tested.
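
Two of the findings above are stated in terms of cosine similarity between a model's internal activation and a concept vector, and of the gap between a "think" and a "don't think" condition. As a rough illustration of how such a comparison could be computed, here is a minimal Python sketch. The vectors and the variable names (`concept_vector`, `think_activation`, `dont_think_activation`) are hypothetical toy values chosen for illustration; they are not data or methods taken from any of the listed sources.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumptions for illustration only).
concept_vector = [0.5, 1.0, -0.25, 0.0]
think_activation = [0.4, 0.9, -0.1, 0.2]
dont_think_activation = [0.1, 0.2, 0.3, -0.4]

think_score = cosine_similarity(think_activation, concept_vector)
dont_think_score = cosine_similarity(dont_think_activation, concept_vector)

# The "gap" is the difference in representation strength between conditions.
gap = think_score - dont_think_score

print(f"think: {think_score:.3f}")
print(f"don't think: {dont_think_score:.3f}")
print(f"gap: {gap:.3f}")
```

In this toy setup both conditions score above zero and the "think" condition scores higher, which mirrors the shape of the reported findings without reproducing any actual measurement.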