finding

active

finding:llama-2-7b-representations-of-larger-than-smaller-than-cluster-by-surface-level-characteristics-such-as-presence-of-token-eighty

LLaMA-2-7B representations of larger_than+smaller_than cluster by surface-level characteristics such as presence of token 'eighty'

Suggests that the smallest model studied (LLaMA-2-7B) represents surface features rather than abstract truth

Source paper

extracted_from

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets

(2023) · Samuel Marks · Max Tegmark

Neighborhood — ranked by edge-count

Claims (1)

claim

As LLMs scale, they develop increasingly general abstractions, with large models linearly representing abstract concepts like truth that capture shared properties of diverse inputs
associated_with · supports
Interpretive claim connecting scale to abstraction level in LLM representations

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
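
The selection rule above (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is a minimal illustration only: the function names, the `embeddings` dictionary, and the `typed_edges` set are hypothetical and are not taken from the system that produced this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that have
    no typed edge to the target, sorted by descending similarity.

    embeddings:  dict mapping entity id -> embedding vector (assumed layout)
    typed_edges: set of (id_a, id_b) pairs that already carry a typed relation
    """
    target = embeddings[target_id]
    candidates = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        # Skip entities already linked by a typed edge, in either direction.
        if (target_id, entity_id) in typed_edges or (entity_id, target_id) in typed_edges:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            candidates.append((entity_id, score))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates
```

Called with the id of this finding, such a function would return a ranked list comparable to the one that follows, with each score being the cosine similarity between the two entities' embeddings.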

In LLaMA-2-7B, PCA of larger_than+smaller_than shows statements clustering by surface-level characteristics (e.g., presence of token 'eighty') rather than truth value · finding · 0.887
Shows absence of abstract truth representations in smallest model, supporting scale-dependent emergence claim
Llama-3.1 8B internal representations for the seven days of the week form seven clusters in a circle in activation space. · finding · 0.827
Empirical observation establishing that Llama's internal representations for days-of-week have circular geometric structure.
Llama-3.1 8B output token distributions for seven days of the week form seven clusters in a rough circle in behavior space (Hellinger distance geometry). · finding · 0.821
Empirical observation establishing that Llama's behavior for days-of-week tasks has circular structure.
In LLaMA-2-13B, larger_than and smaller_than separate along antipodal directions in PCA; in LLaMA-2-70B they align along a common direction · finding · 0.804
Scale-dependent alignment result demonstrating how more abstract truth representations emerge with scale
LLaMA-2-70B displays summarization behavior over punctuation tokens in a context-dependent way: present for cities but not for sp_en_trans · finding · 0.803
Contrasts with 7B and 13B which show consistent summarization behavior; may complicate localization at 70B scale
LLaMA-2-70B and 13B probes generalize better across datasets than 7B probes across all training sets and probe types · finding · 0.801
Larger models linearly represent more general concepts including truth
LLaMA-3.1-8B: Sbmax = -1.896 ± 0.211, AUSN = -2.119 ± 0.198, peak layer ℓ* = 10 (median) · finding · 0.791
Seed-pooled geometry-only statistics (per-dev z units).
PCA visualizations of LLaMA-2-13B and 70B representations of curated datasets show clear linear structure, with true statements separating from false ones in the top two principal components · finding · 0.787
Primary visual evidence for linear truth representations in large LLMs