quote

active

quote:neural-networks-trained-with-different-objectives-on-different-data-and-modalities-are-converging-to-a-shared-statistical-model-of-reality-in-their-representation-spaces

Neural networks, trained with different objectives on different data and modalities, are converging to a shared statistical model of reality in their representation spaces.

The paper's central thesis statement, presented prominently after the abstract

Source paper

extracted_from

The Platonic Representation Hypothesis

(2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola

Neighborhood — ranked by edge-count

Papers (1)

paper

The Platonic Representation Hypothesis
introduces

Concepts (1)

concept

Platonic Representation
associated_with
The hypothesized converged representation that all sufficiently large AI models are approaching — a statistical model of underlying reality

Hypotheses (5)

hypothesis

Multitask Scaling Hypothesis
supports
Argues that fewer representations are competent for N tasks than for M < N tasks, so more general models have a smaller solution space
Capacity Hypothesis
supports
Bigger models are more likely to converge to a shared representation than smaller models because they can better approximate the global optimum
Simplicity Bias Hypothesis
supports
Deep networks are biased toward finding simple fits to data, and this bias increases with model size, driving convergence
Training on image data should improve LLM performance, and training on language data should improve vision model performance
extends
Implication of PRH for cross-modal training efficiency
As models scale and converge toward an accurate model of reality, hallucinations should decrease with scale
extends
Implication of PRH for LLM hallucination
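
The counting argument behind the Multitask Scaling Hypothesis listed above can be sketched as a set inclusion. The notation is an assumption introduced here for illustration and does not come from this page: let $\mathcal{R}_i$ denote the set of representations competent for task $i$.

```latex
\bigcap_{i=1}^{N} \mathcal{R}_i \;\subseteq\; \bigcap_{i=1}^{M} \mathcal{R}_i
\qquad \text{for } M < N
```

Under this reading, each added task can only remove candidates from the intersection, never add them, which is one way to see why a more general model would have a smaller solution space.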

Claims (3)

claim

Different models cannot converge to the same representation if they have access to fundamentally different information; convergence is capped by mutual information between input signals
contradicts
Key limitation of the PRH for non-bijective observations
Conditional generation is easier than unconditional because the conditioning data shares the same platonic structure as the generated data
extends
Implication of PRH for generative models
Special-purpose intelligences optimized for narrow tasks may not converge; the PRH only holds for intelligences performing well on many tasks
contradicts
Key limitation of PRH

Findings (2)

finding

Among 78 vision models on Places-365, models that solve more VTAB tasks tend to be more aligned with each other, with high-performance models forming a tightly clustered set
associated_with
Empirical result showing alignment increases with model competence
LLM performance (measured as 1 - bits-per-byte on OpenWebText) shows a linear relationship with alignment to vision models, where alignment is measured via mutual nearest-neighbor on WIT
associated_with
Key cross-modal alignment result
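
The mutual nearest-neighbor alignment mentioned in the cross-modal finding above can be illustrated with a minimal sketch. This is not the paper's code: the function names, the use of cosine similarity, and the default `k=10` are assumptions made for illustration. It assumes two feature matrices whose rows refer to the same samples (for example, an image embedding and the embedding of its paired caption).

```python
import numpy as np


def knn_indices(feats, k):
    """Return the indices of the k nearest neighbors of each row (cosine similarity)."""
    # Assumption: cosine similarity is the neighbor metric; normalize rows first.
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = feats @ feats.T
    # A sample must not count as its own neighbor.
    np.fill_diagonal(sim, -np.inf)
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn_alignment(feats_a, feats_b, k=10):
    """Average fraction of shared k-nearest neighbors across two representations.

    feats_a and feats_b are (n_samples, dim_a) and (n_samples, dim_b) arrays
    whose rows describe the same samples in the same order.
    """
    if feats_a.shape[0] != feats_b.shape[0]:
        raise ValueError("both representations must cover the same samples")
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    # For each sample, count neighbors that appear in both neighbor sets.
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlap)) / k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 32))
    y = rng.normal(size=(200, 48))  # unrelated features
    print("same representation:", mutual_knn_alignment(x, x))
    print("unrelated representations:", mutual_knn_alignment(x, y))
```

The score lies between 0 and 1: identical neighbor structure gives 1, and unrelated features give a value close to k divided by the number of other samples. Because only neighbor sets are compared, the two representations may have different dimensionality.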

Questions (1)

question

What has led to representational convergence, will it continue, and ultimately where does it end?
gates
Central motivating questions of the paper

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Different neural network models trained on different objectives and modalities are converging to a shared statistical model of reality in their representation spaces · hypothesis · 0.945
The central hypothesis of the paper; the platonic representation hypothesis itself
Neural networks show substantial alignment with biological representations in the brain, driven by shared task and data constraints · claim · 0.863
Extends convergence argument to brain-machine alignment
There is a growing similarity in how datapoints are represented in different neural network models, spanning different architectures, training objectives, and data modalities · claim · 0.850
Primary empirical claim of the paper
Ultimately, we would like to understand neural networks well enough to be able to intentionally design them. · quote · 0.825
Vision statement in the conclusion.
Linear representation hypothesis: neural networks represent meaningful concepts as directions in their activation spaces. · hypothesis · 0.821
Foundation for interpreting features as linear directions.
Diverse computer vision models trained on visual recognition tasks converge to remarkably similar internal feature representations regardless of architecture, training procedure, or implementation details, closely matching the organization of animal visual cortex · finding · 0.816
Empirical evidence for the universality hypothesis cited as supporting the possibility of convergent consciousness-like solutions
Superposition hypothesis: neural networks represent more features than dimensions using almost-orthogonal directions. · hypothesis · 0.815
Explanation for why dictionary learning can recover many more features than dimensions.
Geometry unifies diverse neural architectures in machine learning systems. · claim · 0.814