claim
active
claim:reflection-is-not-merely-a-behavioral-artifact-of-prompting-but-a-phenomenon-encoded-in-the-model-s-activation-space

Reflection is not merely a behavioral artifact of prompting but a phenomenon encoded in the model's activation space.

This is the paper's central interpretive claim, supported by its steering-vector experiments.
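
To make "encoded in the activation space" concrete, here is a minimal sketch of activation steering. It assumes a difference-of-means construction, one common way to obtain a steering vector. This page does not say how the paper builds its vectors, so the construction, the function names, and the toy three-dimensional activations are all illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: toy vectors stand in for real hidden states.
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(with_reflection: List[Vector],
                       without_reflection: List[Vector]) -> Vector:
    """Difference of mean activations between two prompt groups.

    Assumed construction (difference of means), not confirmed by the source.
    """
    mu_pos = mean_vector(with_reflection)
    mu_neg = mean_vector(without_reflection)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the direction.

    alpha > 0 pushes toward the first group, alpha < 0 away from it.
    """
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Hypothetical activations for prompts with and without reflective behavior.
reflective = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
plain = [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]

direction = steering_direction(reflective, plain)
steered = steer([0.5, 0.5, 0.5], direction, alpha=-1.0)
```

In a real experiment the vectors would be hidden states read from a chosen layer, and the steered state would be written back before the forward pass continues. If shifting activations along such a direction reliably changes how often the model reflects, that is evidence for the claim that the behavior is represented internally rather than produced only by the prompt.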

Source paper

extracted_from
Unveiling the Latent Directions of Reflection in Large Language Models
(2025) · Chang, Fu-Chieh · Lee, Yu-Ting · Wu, Pei-Yuan

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.