Platonic Representation

The hypothesized converged representation that all sufficiently large AI models are approaching — a statistical model of underlying reality

Neighborhood — ranked by edge-count

paper

thinker

claim

concept

Pointwise Mutual Information Kernel
analogous_to · associated_with
The kernel that contrastive learners converge to; similarity equals PMI between observations
Plato's Allegory of the Cave
associated_with · cites
Philosophical analogy: training data are shadows on the cave wall, but models recover representations of the actual world outside
World Model (statistical)
associated_with
The joint distribution over events in the world that generate observed data; the proposed endpoint of representational convergence
Convergent Realism
analogous_to
Philosophy of science position that science converges on truth; cited as precursor to the platonic representation hypothesis
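
The Pointwise Mutual Information Kernel entry above can be made concrete with a minimal sketch. It assumes observations are discrete and that a symmetric table of co-occurrence counts is available; the function name pmi_kernel and the example counts are hypothetical, not taken from the source.

```python
import numpy as np

def pmi_kernel(cooccurrence: np.ndarray) -> np.ndarray:
    """PMI matrix for a symmetric table of co-occurrence counts (illustrative helper)."""
    joint = cooccurrence / cooccurrence.sum()   # P(a, b)
    marginal = joint.sum(axis=1)                # P(a)
    # Zero counts yield -inf (never co-occur); suppress the NumPy warnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log(joint / np.outer(marginal, marginal))

# Hypothetical counts: each observation co-occurs mostly with itself.
counts = np.array([[4.0, 1.0],
                   [1.0, 4.0]])
K = pmi_kernel(counts)
```

Under this reading, a positive entry of K marks a pair that co-occurs more often than chance and a negative entry marks one that co-occurs less often.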

hypothesis

Multitask Scaling Hypothesis
supports
Argues that fewer representations are competent for N tasks than for M < N tasks, so more general models have a smaller solution space
Capacity Hypothesis
supports
Bigger models are more likely to converge to a shared representation than smaller models because they can better approximate the global optimum
Simplicity Bias Hypothesis
supports
Deep networks are biased toward finding simple fits to data, and this bias increases with model size, driving convergence

quote

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Platonic Space / Convergent Representations · concept · 0.820
Theoretical thread suggesting discoverable geometric priors shared across systems; circular number representations support this hypothesis.
If there is a modality-agnostic platonic representation, training on both image and language data should improve the best model in either modality · claim · 0.789
Implication of PRH for training practice: both modalities point at the same underlying reality
Platonic Space Of Forms · concept · 0.779
concept representation · concept · 0.765
How a neural network encodes a semantic concept internally, argued to be better captured by manifolds than by atomic features.
Platonism Of Patterns · concept · 0.759
Linear representation · concept · 0.744
The idea that features are encoded as directions in activation space.
Deep Representations · concept · 0.738
Hierarchical representations in neural networks that allow compression and coordinated behaviour while retaining sensitivity to input changes.
Representational dynamics · concept · 0.731
The evolution of an agent's latent representations over the course of training, shown to align with reward improvement when causal emergence is high.
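
The candidate list above (cosine ≥ 0.65, no typed edge) could be produced by a filter like the following sketch. It assumes each entity has an embedding vector and that the names of already-linked entities are known; the function untyped_neighbors, its parameters, and the toy vectors are all hypothetical.

```python
import numpy as np

def untyped_neighbors(query, embeddings, names, typed, threshold=0.65):
    """Entities with cosine >= threshold to `query` and no typed edge, best first."""
    query = np.asarray(query, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    # Cosine similarity between the query and every row of `embeddings`.
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query)
    sims = embeddings @ query / norms
    hits = [(name, float(sim)) for name, sim in zip(names, sims)
            if sim >= threshold and name not in typed]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data: "a" already has a typed edge, "b" is orthogonal to the query.
candidates = untyped_neighbors(
    query=[1.0, 0.0],
    embeddings=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    names=["a", "b", "c"],
    typed={"a"},
)
```

In this toy case only "c" survives: "a" is excluded because it is already linked, and "b" falls below the threshold.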