community

active

leiden_hybrid_concepts

label: sonnet

community:leiden_hybrid_concepts-run2-c103

LLM internal representation & self-knowledge

Examines whether the transformer architecture permits models to introspect on their internal computations, and whether they form deeply integrated representations of high-order concepts.

2 members.

Drawn from 2 sources

The papers/notes whose extracted claims & findings make up this cluster.

Bridges (5)

Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.

Mechanistic interpretability & model evaluation (1 shared)
Mechanistic structure of transformer attention computations (1 shared)
Empirical gaps in performance-communication alignment (1 shared)
Relational self, care & aliveness (1 shared)
Semantic depth versus performative mimicry in LLMs (1 shared)

Claims (2)

LLM introspection on internal computations is architecturally permitted; whether models leverage this is an empirical question. Core claim directly challenged by prior work denying introspection; forms foundation for Koan Battery introspection studies.
LLMs internalize deeply integrated representations of high-order concepts. The authors' interpretive assertion based on their steering results.