claim
active
In intermediate regimes of scale or layer depth, LLMs may linearly represent features at intermediate levels of abstraction such as 'accurate factual recall' or 'close association' rather than abstract truth
Theoretical interpretation of antipodal alignment and misalignment phenomena in PCA visualizations
Source paper
extracted_from · (2023) · Samuel Marks · Max Tegmark
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Close Association Feature · supports · A hypothesized intermediate-level linearly-represented feature (e.g., Beijing and China are closely associated) that may correlate with truth in unnegated datasets but anti-correlate in negated ones
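The sign flip described for the Close Association Feature can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the activations are synthetic, `direction` is a hypothetical unit vector standing in for a linearly represented "close association" feature, and `make_dataset` and `truth_correlation` are made-up helper names. It is not the source paper's data or method.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 16, 200

# Hypothetical unit direction that encodes "close association".
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

def make_dataset(negated):
    # assoc = +1 for closely associated pairs (Beijing, China), -1 otherwise.
    assoc = rng.choice([-1.0, 1.0], size=n)
    # Synthetic activations: the association signal plus small isotropic noise.
    acts = assoc[:, None] * direction + 0.1 * rng.normal(size=(n, dim))
    # "X is in Y" is true when associated; "X is not in Y" is true when not.
    truth = (assoc < 0) if negated else (assoc > 0)
    return acts, truth.astype(float)

def truth_correlation(acts, truth):
    # Pearson correlation between the projection onto the feature and truth.
    proj = acts @ direction
    return np.corrcoef(proj, truth)[0, 1]

corr_plain = truth_correlation(*make_dataset(negated=False))
corr_negated = truth_correlation(*make_dataset(negated=True))
print(f"unnegated: {corr_plain:+.2f}  negated: {corr_negated:+.2f}")
```

In this toy setup the same direction is strongly positively correlated with truth on the unnegated set and strongly negatively correlated on the negated set, which is the pattern the concept attributes to a feature that tracks association rather than abstract truth.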
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Interpretive claim connecting scale to abstraction level in LLM representations
- Interpretation of the layer-by-layer PCA visualizations showing linear structure emerging in early-middle layers
- Establishes that the observed linear structure is not merely a representation of text probability
- Central empirical conclusion of the paper about the fundamental limits of truth directions.
- Do LLMs have a unified representation of truth that spans structurally and topically diverse data? · question · 0.817 · Central research question driving dataset design and experimental approach
- Stated explicitly in App. C to explain why linear structure emerges later for conjunctive statements
- Qualified positive claim from spatio permutation analysis where two cases satisfy all three criteria.
- Interpretation of weaker PCA separation and lower ASR in smaller models