finding
active
In LLaMA-2-7B, PCA of larger_than+smaller_than shows statements clustering by surface-level characteristics (e.g., presence of token 'eighty') rather than truth value
Shows the absence of abstract truth representations in the smallest model, supporting the scale-dependent emergence claim
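The kind of check behind this finding can be sketched on synthetic data. The snippet below is only an illustration, not the paper's code or data: the activations are random vectors, and `has_token`, `surface_dir`, `truth_dir`, and the helper names are hypothetical. It assumes a strong surface-feature direction and a weak truth direction, then asks which label the first principal component separates.

```python
import numpy as np

def top_pcs(acts, k=2):
    """Project activations onto their top-k principal components."""
    centered = acts - acts.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def separation(proj, labels):
    """Gap between the two class means along PC1, in pooled-std units."""
    a = proj[labels == 1, 0]
    b = proj[labels == 0, 0]
    pooled = np.sqrt((a.var() + b.var()) / 2)
    return abs(a.mean() - b.mean()) / pooled

# Synthetic stand-in for residual-stream activations (assumption, not real data).
rng = np.random.default_rng(0)
n, d = 400, 64
is_true = rng.integers(0, 2, n)      # truth label of each statement
has_token = rng.integers(0, 2, n)    # hypothetical surface feature, e.g. contains 'eighty'
surface_dir = rng.normal(size=d)
truth_dir = rng.normal(size=d)

# Assumed regime: the surface feature dominates the variance, truth barely registers.
acts = (rng.normal(size=(n, d))
        + 6.0 * np.outer(has_token, surface_dir)
        + 0.2 * np.outer(is_true, truth_dir))

proj = top_pcs(acts)
sep_surface = separation(proj, has_token)
sep_truth = separation(proj, is_true)
print(f"PC1 separation by surface feature: {sep_surface:.2f}")
print(f"PC1 separation by truth value:     {sep_truth:.2f}")
```

In this assumed regime PC1 splits the statements by the surface feature far more cleanly than by truth value, which is the qualitative pattern the finding reports for the smallest model.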
Source paper
extracted_from (2023) · Samuel Marks · Max Tegmark
Neighborhood — ranked by edge-count
Claims (1)
claim
- Interpretive claim connecting scale to abstraction level in LLM representations
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates that small models represent surface features rather than abstract truth
- Primary visual evidence for linear truth representations in large LLMs
- Scale-dependent alignment result demonstrating how more abstract truth representations emerge with scale
- Contrasts with 7B and 13B which show consistent summarization behavior; may complicate localization at 70B scale
- Supporting evidence for the claim that most residual stream dimensions are free for other layers to use
- Layer-wise emergence pattern supporting hierarchical development hypothesis
- Llama-3.3-70B exhibits internal consistency-checking mechanisms that operate during inference · claim · 0.780 · Central interpretive claim of the paper supported by causal ablation and activation evidence
- Localizes truth representations to specific hidden states, motivating the rest of the analysis