finding

active

finding:auditory-models-are-roughly-aligned-with-llms-up-to-a-linear-transformation

Auditory models are roughly aligned with LLMs up to a linear transformation

Ngo & Kim result extending cross-modal convergence to the auditory domain
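As a rough illustration of what "aligned up to a linear transformation" can mean in practice, the sketch below fits a least-squares linear map from one feature space to another and reports held-out R². This is not the procedure used by Ngo & Kim, which this record does not describe. The function name `linear_alignment_r2`, the synthetic features, and the 80/20 split are all assumptions made for illustration.

```python
import numpy as np

def linear_alignment_r2(X, Y, train_frac=0.8):
    """Fit Y ~ X @ W + b by least squares on a train split; return held-out R^2.

    Hypothetical helper: a high value suggests the two feature spaces are
    related by a (near-)linear map, a low value suggests they are not.
    """
    n = X.shape[0]
    n_train = int(n * train_frac)
    # Append a constant column so the fitted map includes a bias term.
    Xb = np.hstack([X, np.ones((n, 1))])
    W, *_ = np.linalg.lstsq(Xb[:n_train], Y[:n_train], rcond=None)
    pred = Xb[n_train:] @ W
    held_out = Y[n_train:]
    resid = ((held_out - pred) ** 2).sum()
    total = ((held_out - held_out.mean(axis=0)) ** 2).sum()
    return 1.0 - resid / total

# Synthetic stand-ins (assumed data, not real model activations).
rng = np.random.default_rng(0)
audio = rng.normal(size=(200, 8))                       # "auditory" features
true_map = rng.normal(size=(8, 5))
text_aligned = audio @ true_map + 0.01 * rng.normal(size=(200, 5))
text_unrelated = rng.normal(size=(200, 5))              # no shared structure

r2_aligned = linear_alignment_r2(audio, text_aligned)
r2_unrelated = linear_alignment_r2(audio, text_unrelated)
```

Scoring on held-out rows rather than the training rows keeps a high-dimensional map from looking aligned merely by overfitting.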

Source paper

extracted_from

The Platonic Representation Hypothesis

(2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola

Neighborhood — ranked by edge-count

Claims (1)

claim

There is a growing similarity in how datapoints are represented in different neural network models, spanning different architectures, training objectives, and data modalities
supports
Primary empirical claim of the paper

Hypotheses (1)

hypothesis

Different neural network models trained on different objectives and modalities are converging to a shared statistical model of reality in their representation spaces
supports
The central hypothesis of the paper; the platonic representation hypothesis itself

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

The better an LLM is at language modeling, the more it aligns with vision models, and vice versa: a linear relationship between language modeling score and vision-language alignment · finding · 0.826
Core cross-modal empirical result: larger and better language models align better with vision models
LLMs linearly represent truth-relevant information beyond the plausibility of text, as evidenced by probes trained on the likely dataset performing poorly on anti-correlated datasets · claim · 0.775
Establishes that the observed linear structure is not merely a representation of text probability
As LLMs scale, they develop increasingly general abstractions, with large models linearly representing abstract concepts like truth that capture shared properties of diverse inputs · claim · 0.775
Interpretive claim connecting scale to abstraction level in LLM representations
In intermediate regimes of scale or layer depth, LLMs may linearly represent features at intermediate levels of abstraction such as 'accurate factual recall' or 'close association' rather than abstract truth · claim · 0.774
Theoretical interpretation of antipodal alignment and misalignment phenomena in PCA visualizations
Linear World Models in LLMs · framework · 0.772
Prior work framework studying whether LLMs encode world models as linear structures in their representations
Better LLMs (measured by 1 − bits-per-byte on OpenWebText) show a linear relationship with alignment to vision models measured via mutual nearest-neighbor on WIT · finding · 0.767
Key cross-modal alignment result
Linear mixed-effects models (LMMs) · method · 0.761
Primary statistical model with random intercept by conversation, REML estimation, for pooled conversation-turn observations
We hypothesize that LLMs represent correctness of arithmetic expressions differently from factual statements. · hypothesis · 0.760
Core working hypothesis motivating the factual vs. arithmetic task split in the experimental design.
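One of the related findings above measures alignment via mutual nearest neighbors. A minimal sketch of that kind of metric follows: for each datapoint, find its k nearest neighbors in each representation space, then average the fraction of neighbors the two spaces share. The function names, the Euclidean distance, and the toy data are assumptions for illustration. The source paper's exact implementation may differ.

```python
import math

def knn_indices(feats, k):
    """For each row, return the index set of its k nearest other rows (Euclidean)."""
    neighbors = []
    for i, a in enumerate(feats):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(feats) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors

def mutual_knn_alignment(feats_a, feats_b, k):
    """Mean fraction of shared k-nearest neighbors across two feature spaces.

    Row i of feats_a and row i of feats_b must describe the same datapoint.
    """
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    overlaps = [len(sa & sb) / k for sa, sb in zip(nn_a, nn_b)]
    return sum(overlaps) / len(overlaps)

# Toy data (assumed): space B is a rescaled, higher-dimensional copy of space A,
# so neighborhoods match exactly even though the raw coordinates differ.
space_a = [[0.0], [1.0], [2.5], [10.0], [11.0], [12.5]]
space_b = [[2 * x[0], 0.0] for x in space_a]
# Breaking the row pairing destroys most of the shared neighborhood structure.
space_b_mispaired = [space_b[i] for i in (0, 3, 1, 4, 2, 5)]

score_matched = mutual_knn_alignment(space_a, space_b, k=2)
score_mispaired = mutual_knn_alignment(space_a, space_b_mispaired, k=2)
```

Because the metric compares only neighbor identities, it is insensitive to differences in dimensionality or scale between the two spaces, which is what makes it usable across modalities.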