claim
active
claim:accuracy-does-not-vary-linearly-with-latent-reflection-directions-instead-it-follows-a-more-non-linear-mapping-that-requires-deeper-theoretical-treatment

Accuracy does not vary linearly with latent reflection directions; instead, it follows a non-linear mapping that requires deeper theoretical treatment.

A theoretical limitation identified by the authors, distinguishing reflection from stylistic tasks.
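
One way to make the claim concrete is to compare how well a straight line fits accuracy as a function of steering strength along a direction. The following is a minimal sketch, not the authors' analysis: the names `alphas`, `linear_acc`, and `saturating_acc` are hypothetical, and the accuracy values are synthetic stand-ins generated from known functions, not measurements from the paper.

```python
import math


def linear_r2(xs, ys):
    """Return R^2 of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical steering strengths applied along a latent direction.
alphas = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

# Synthetic curves (assumptions for illustration only):
# one exactly linear in alpha, one that saturates at both ends.
linear_acc = [0.5 + 0.1 * a for a in alphas]
saturating_acc = [1.0 / (1.0 + math.exp(-3.0 * a)) for a in alphas]

r2_linear = linear_r2(alphas, linear_acc)
r2_saturating = linear_r2(alphas, saturating_acc)

print(f"R^2, linear curve:     {r2_linear:.3f}")
print(f"R^2, saturating curve: {r2_saturating:.3f}")
```

A noticeably lower R^2 for the second curve indicates that a single linear coefficient does not summarize the relationship. With real measurements, the same comparison would need several seeds and confidence intervals before supporting any conclusion.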

Source paper

extracted_from
Unveiling the Latent Directions of Reflection in Large Language Models
(2025) · Chang, Fu-Chieh · Lee, Yu-Ting · Wu, Pei-Yuan

Neighborhood — ranked by edge-count

Hypotheses (1)

hypothesis

Frameworks (1)

framework
  • The hypothesis that models internalize concepts as approximately linear directions in representation space; used to interpret MDS injection behavior

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.