claim

active

claim:antipodal-alignment-between-related-datasets-e-g-larger-than-and-smaller-than-in-smaller-models-resolves-to-common-direction-alignment-in-larger-models

Antipodal alignment between related datasets (e.g., larger_than and smaller_than) in smaller models resolves to common-direction alignment in larger models

Scale-dependent structural finding from PCA visualizations in §4

Source paper

extracted_from

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets

(2023) · Samuel Marks · Max Tegmark

Neighborhood — ranked by edge-count

Findings (2)

finding

In LLaMA-2-13B, cities and neg_cities show approximately orthogonal axes of separation in PCA visualizations at intermediate layers
supports
Case of misalignment showing that the truth direction is not always shared between a dataset and its negation in smaller models
In LLaMA-2-13B, larger_than and smaller_than separate along antipodal directions in PCA; in LLaMA-2-70B they align along a common direction
supports
Scale-dependent alignment result demonstrating how more abstract truth representations emerge with scale
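
The three alignment cases named in these findings (antipodal, orthogonal, common direction) can be told apart numerically by the cosine between each dataset's axis of separation. The sketch below is illustrative only: it estimates each axis as the difference between the mean activation of true and false statements, which is a stand-in for the PCA visualizations the findings describe, and it runs on synthetic activations rather than LLaMA-2 activations. The function names, the 0.3 tolerance, and the synthetic generator are assumptions made for this example.

```python
import numpy as np

def truth_direction(acts, labels):
    """Unit vector pointing from the mean false activation to the mean true one."""
    acts = np.asarray(acts, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = acts[labels].mean(axis=0) - acts[~labels].mean(axis=0)
    return diff / np.linalg.norm(diff)

def classify_alignment(dir_a, dir_b, tol=0.3):
    """Label a pair of unit directions by their cosine (tol is a hypothetical cutoff)."""
    cos = float(np.dot(dir_a, dir_b))
    if cos >= tol:
        return "common"
    if cos <= -tol:
        return "antipodal"
    return "orthogonal"

def synthetic_dataset(rng, axis, sign, dim=16, n=200, noise=0.1):
    """Toy activations: true statements sit at sign * e_axis, false ones at -sign * e_axis."""
    labels = np.arange(2 * n) % 2 == 0
    centers = np.zeros((2 * n, dim))
    centers[:, axis] = np.where(labels, sign, -sign)
    return centers + noise * rng.standard_normal((2 * n, dim)), labels

rng = np.random.default_rng(0)

# Reference dataset: truth separates along +e_0.
ref_acts, ref_labels = synthetic_dataset(rng, axis=0, sign=1.0)
# Same axis with the sign flipped (the antipodal case).
flip_acts, flip_labels = synthetic_dataset(rng, axis=0, sign=-1.0)
# A different axis entirely (the orthogonal case).
orth_acts, orth_labels = synthetic_dataset(rng, axis=1, sign=1.0)
# Same axis and same sign (the common-direction case).
same_acts, same_labels = synthetic_dataset(rng, axis=0, sign=1.0)

ref_dir = truth_direction(ref_acts, ref_labels)
result_flip = classify_alignment(ref_dir, truth_direction(flip_acts, flip_labels))
result_orth = classify_alignment(ref_dir, truth_direction(orth_acts, orth_labels))
result_same = classify_alignment(ref_dir, truth_direction(same_acts, same_labels))

print(result_flip, result_orth, result_same)
```

With real data, the activations for each dataset would come from a fixed layer and token position of the model under study. The cosine is only meaningful when both directions are taken from the same layer of the same model.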

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

In LLaMA-2-13B, cities and neg_cities show antipodal alignment in early layers, rotate to orthogonal in middle layers, then eventually align in later layers · finding · 0.786
Layer-by-layer evolution of truth direction alignment, supporting hierarchical abstraction hypothesis
On CIFAR-10, larger models exhibit greater alignment with each other compared to smaller ones · finding · 0.781
Kornblith et al. / Krizhevsky finding replicated in paper discussion
Alignment Between High-Level and Low-Level Models · concept · 0.778
A mapping assigning to each high-level variable a set of low-level variables and a function from low-level to high-level values.
Antipodal Alignment of Truth Directions · concept · 0.771
The case where two datasets (e.g., larger_than and smaller_than) separate along opposite directions in PCA, indicating a shared feature with opposite sign
Bigger models are more likely to converge to a shared representation than smaller models · hypothesis · 0.761
Selective pressure toward convergence via model capacity
Smaller models produce more alive responses than larger ones in the same alignment family; roughness signals living process over manufactured polish. · claim · 0.759
Larger S_max correlates with smaller θ50 across backbones in E3 (negative association consistent across pooling and metric choices) · finding · 0.758
Key geometry-to-behavior bridge finding in E3; robust to pooling choice, cosine vs. L2, and frozen external encoder
Alignment with vision models corresponds to improved performance on downstream language tasks including commonsense reasoning and math · claim · 0.754
Claims that alignment score is a proxy for general capability