hypothesis

active

hypothesis:scaling-model-size-as-well-as-data-and-task-diversity-drives-representational-convergence-toward-the-platonic-representation

Scaling model size, as well as data and task diversity, drives representational convergence toward the platonic representation

Core mechanism hypothesis connecting PRH to the empirical trend of scaling in AI

Source paper

extracted_from

The Platonic Representation Hypothesis

(2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola

Neighborhood — ranked by edge-count

Findings (1)

finding

The better an LLM is at language modeling, the more it aligns with vision models, and vice versa — linear relationship between language modeling score and vision-language alignment
supports
Core cross-modal empirical result: larger and better language models align better with vision models

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Bigger models are more likely to converge to a shared representation than smaller models · hypothesis · 0.834
Selective pressure toward convergence via model capacity
Different models cannot converge to the same representation if they have access to fundamentally different information; convergence is capped by mutual information between input signals · claim · 0.818
Key limitation of the PRH for non-bijective observations
Foundation models trained on different data converge on similar latent representations, suggesting a Platonic form. · claim · 0.796
Scaling may reduce hallucination and certain kinds of bias as models converge toward an accurate model of reality · claim · 0.793
Implication of PRH: larger models should amplify bias less and hallucinate less if they better model reality
Representational abstraction of truth may emerge more clearly with model scale · claim · 0.792
Interpretation of weaker PCA separation and lower ASR in smaller models
If there is a modality-agnostic platonic representation, training on both image and language data should improve the best model in either modality · claim · 0.787
Implication of PRH for training practice: both modalities point at the same underlying reality
As LLMs scale, they develop increasingly general abstractions, with large models linearly representing abstract concepts like truth that capture shared properties of diverse inputs · claim · 0.786
Interpretive claim connecting scale to abstraction level in LLM representations
How do representations differ or converge between architectures, tasks, and modalities? · question · 0.786
Broader research question MAS is positioned to address, citing multiple recent works.
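
The "Related by similarity" list above is described as entities with cosine ≥ 0.65 and no typed edge to this one. A minimal sketch of how such a filter might work is shown below. The function names, entity ids, two-dimensional toy embeddings, and the representation of typed edges as unordered pairs are all assumptions made for illustration; the page does not say how its scores or edges are actually computed or stored.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs with score >= threshold and no typed edge
    to the target, sorted from most to least similar."""
    target = embeddings[target_id]
    results = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        # Skip entities that already have a typed relation to the target.
        if frozenset((target_id, entity_id)) in typed_edges:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            results.append((entity_id, round(score, 3)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: ids and vectors are invented for this example.
embeddings = {
    "scaling-hypothesis": [1.0, 0.0],
    "finding-llm-vision": [1.0, 0.1],  # similar, but already linked by a typed edge
    "bigger-models": [0.8, 0.6],       # similar, no typed edge
    "unrelated": [0.0, 1.0],           # orthogonal, falls below the threshold
}
typed_edges = {frozenset(("scaling-hypothesis", "finding-llm-vision"))}

candidates = related_by_similarity("scaling-hypothesis", embeddings, typed_edges)
print(candidates)
```

In this toy setup only "bigger-models" survives both filters: the finding is excluded because it already has a typed edge, and the orthogonal entity is excluded by the threshold.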