claim
active
claim:in-intermediate-regimes-of-scale-or-layer-depth-llms-may-linearly-represent-features-at-intermediate-levels-of-abstraction-such-as-accurate-factual-recall-or-close-association-rather-than-abstract-truth

In intermediate regimes of scale or layer depth, LLMs may linearly represent features at intermediate levels of abstraction, such as 'accurate factual recall' or 'close association', rather than abstract truth.

Theoretical interpretation of antipodal alignment and misalignment phenomena in PCA visualizations

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • A hypothesized intermediate-level linearly-represented feature (e.g., Beijing and China are closely associated) that may correlate with truth in unnegated datasets but anti-correlate in negated ones
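The sign flip described in this concept can be illustrated with a toy simulation. The sketch below is not taken from the source: it assumes synthetic activations that encode only a hypothetical 'close association' feature along one direction, and it uses a simple difference-of-means probe as a stand-in for whatever probing method the underlying work uses. All names (`make_dataset`, `probe`, `assoc`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 16, 200  # hypothetical activation width and dataset size

# Hypothetical unit direction along which "close association" is encoded.
v = rng.normal(size=d)
v /= np.linalg.norm(v)

def make_dataset(negated):
    # assoc = +1: the two entities are closely associated (Beijing, China);
    # assoc = -1: they are not (Beijing, Japan).
    assoc = rng.choice([-1.0, 1.0], size=n)
    # Assumption: activations carry the association feature plus noise,
    # and nothing about abstract truth.
    acts = assoc[:, None] * v + 0.1 * rng.normal(size=(n, d))
    # "X is in Y" is true when associated; "X is not in Y" flips the label.
    truth = (assoc > 0) != negated
    return acts, truth

acts_u, truth_u = make_dataset(negated=False)
acts_n, truth_n = make_dataset(negated=True)

# Difference-of-means probe fitted on the unnegated statements only.
probe = acts_u[truth_u].mean(axis=0) - acts_u[~truth_u].mean(axis=0)

def corr(acts, truth):
    # Pearson correlation between probe projection and the truth label.
    return np.corrcoef(acts @ probe, truth.astype(float))[0, 1]

corr_unnegated = corr(acts_u, truth_u)
corr_negated = corr(acts_n, truth_n)
print(f"unnegated: {corr_unnegated:+.2f}, negated: {corr_negated:+.2f}")
```

Under these assumptions the probe projection correlates positively with truth on the unnegated set and negatively on the negated set, which is the pattern the concept attributes to an intermediate-level feature rather than to abstract truth.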

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.