hypothesis

active

hypothesis:different-neural-network-models-trained-on-different-objectives-and-modalities-are-converging-to-a-shared-statistical-model-of-reality-in-their-representation-spaces

Different neural network models trained on different objectives and modalities are converging to a shared statistical model of reality in their representation spaces

The central hypothesis of the paper; the platonic representation hypothesis itself

Source paper

extracted_from

The Platonic Representation Hypothesis

(2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola

Neighborhood — ranked by edge-count

Papers (1)

paper

The Platonic Representation Hypothesis
introduces

Findings (8)

finding

Among 78 vision models, those solving more VTAB tasks (higher transfer performance) show higher mutual nearest-neighbor alignment with each other
associated_with · supports
Key empirical finding establishing that representational alignment correlates with model competence
The better an LLM is at language modeling, the more it aligns with vision models, and vice versa — linear relationship between language modeling score and vision-language alignment
associated_with · supports
Core cross-modal empirical result: larger and better language models align better with vision models
A single linear projection is sufficient to stitch a vision model to an LLM and achieve good performance on visual question answering and image captioning
supports
Merullo et al. result on cross-modal representational compatibility
Auditory models are roughly aligned with LLMs up to a linear transformation
supports
Ngo & Kim result extending cross-modal convergence to the auditory domain
Rosetta Neurons — individual neurons activated by the same patterns across a range of diverse vision models form a common dictionary independently discovered by all models
supports
Cited evidence that convergence extends to the neuron level, not just representational geometry
A vision model trained on ImageNet can be aligned with a model trained on Places-365 while maintaining good performance, and early layers are more interchangeable than later layers
supports
Lenc & Vedaldi result illustrating data independence in representations and layer-wise alignment
Early layers of convolutional neural networks converge to oriented Gabor-like filters, shared with biological visual systems
supports
Evidence that convergence to similar representations occurs in early layers across artificial and biological systems
Zero-shot model stitching without learning a stitching layer is feasible across different text models trained on different modalities
supports
Moschella et al. result cited as evidence of representational convergence across models
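Several of the findings above are stated in terms of mutual nearest-neighbor alignment. The following is a minimal sketch of one way such a score could be computed, not the paper's reference implementation. It assumes two models have embedded the same N inputs, that neighbors are ranked by cosine similarity, and that k = 10. The function names `knn_indices` and `mutual_knn_alignment` are hypothetical.

```python
import numpy as np


def knn_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each row, by cosine similarity."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a point is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn_alignment(feats_a: np.ndarray, feats_b: np.ndarray, k: int = 10) -> float:
    """Mean fraction of shared k-nearest neighbors across two representations.

    feats_a and feats_b hold embeddings of the SAME N inputs (row i of each
    corresponds to input i); their feature dimensions may differ.
    """
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 16))
    # An orthogonal rotation preserves cosine similarity, so neighbors match.
    rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
    print("rotated copy:", mutual_knn_alignment(x, x @ rotation))
    # Unrelated random features share few neighbors.
    print("unrelated:   ", mutual_knn_alignment(x, rng.normal(size=(200, 8))))
```

Because neighbors are computed within each representation space separately, the score needs no learned mapping between the two models, so it can compare representations of different dimensionality.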

Claims (5)

claim

Scaling may reduce hallucination and certain kinds of bias as models converge toward an accurate model of reality
associated_with
Implication of PRH: larger models should amplify bias less and hallucinate less if they better model reality
If there is a modality-agnostic platonic representation, training on both image and language data should improve the best model in either modality
associated_with
Implication of PRH for training practice: both modalities point at the same underlying reality
The mathematical argument for cross-modal convergence strictly holds only for bijective projections of the underlying world
extends
Key limitation of the formal PRH derivation: lossy or stochastic observation functions weaken the convergence guarantee
Researcher bias and the hardware lottery contribute to apparent convergence in AI models beyond the proposed theoretical pressures
contradicts
Alternative explanation for observed convergence: the AI community designs systems to mimic human reasoning
Special-purpose intelligences optimized for narrow tasks might not converge to the platonic representation
contradicts
Counterexample/limitation: only general-purpose models are subject to the convergence pressures described
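The bijectivity limitation listed among the claims above can be made concrete with a short derivation sketch. It assumes discrete world events z observed through modality-specific functions, and a contrastive learner whose kernel recovers pointwise mutual information (PMI) up to an offset. The symbols used here (obs_X, obs_Y, P_coor, K_PMI, c_X) are a reconstruction of that framing and should be checked against the source paper.

```latex
% Sketch under stated assumptions: discrete events, bijective observation functions.
% Two modalities observe the same underlying event z:
x = \mathrm{obs}_X(z), \qquad y = \mathrm{obs}_Y(z)

% PMI kernel over co-occurring observations:
K_{\mathrm{PMI}}(x_a, x_b)
  = \log \frac{P_{\mathrm{coor}}(x_a, x_b)}{P_{\mathrm{coor}}(x_a)\, P_{\mathrm{coor}}(x_b)}

% Assumed property of the contrastive learner f_X:
\langle f_X(x_a), f_X(x_b) \rangle \approx K_{\mathrm{PMI}}(x_a, x_b) + c_X

% If obs_X and obs_Y are bijective, probabilities of discrete events are preserved:
P_{\mathrm{coor}}(x_a, x_b) = P_{\mathrm{coor}}(z_a, z_b) = P_{\mathrm{coor}}(y_a, y_b)
\;\Longrightarrow\;
K_{\mathrm{PMI}}(x_a, x_b) = K_{\mathrm{PMI}}(z_a, z_b) = K_{\mathrm{PMI}}(y_a, y_b)
```

When an observation function is lossy or stochastic, the equality of co-occurrence probabilities no longer holds, which is why the convergence guarantee weakens in that case.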

Concepts (1)

concept

Anna Karenina Scenario
extends
Hypothesis that all well-performing neural nets represent the world in the same way; PRH extends this by specifying what representation they converge to

Hypotheses (1)

hypothesis

Language models would achieve some notion of grounding in the visual domain even in the absence of cross-modal training data, because they share a common modality-agnostic representation
associated_with
Implication of PRH for language model visual grounding

Questions (1)

question

What has led to this convergence? Will it continue? And ultimately, where does it end?
gates
Core research questions motivating the paper

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Neural networks, trained with different objectives on different data and modalities, are converging to a shared statistical model of reality in their representation spaces.
quote · 0.945
The paper's central thesis statement, presented prominently after the abstract
There is a growing similarity in how datapoints are represented in different neural network models, spanning different architectures, training objectives, and data modalities
claim · 0.863
Primary empirical claim of the paper
Neural networks show substantial alignment with biological representations in the brain, driven by shared task and data constraints
claim · 0.837
Extends convergence argument to brain-machine alignment
Diverse computer vision models trained on visual recognition tasks converge to remarkably similar internal feature representations regardless of architecture, training procedure, or implementation details, closely matching the organization of animal visual cortex
finding · 0.808
Empirical evidence for the universality hypothesis cited as supporting the possibility of convergent consciousness-like solutions
Different models cannot converge to the same representation if they have access to fundamentally different information; convergence is capped by mutual information between input signals
claim · 0.803
Key limitation of the PRH for non-bijective observations
There exists a bidirectional relationship between the geometry of neural representation and the geometry of model behavior
claim · 0.800
Central empirical claim of the paper, demonstrated across tasks and modalities
Superposition hypothesis: neural networks represent more features than dimensions using almost-orthogonal directions.
hypothesis · 0.794
Explanation for why dictionary learning can recover many more features than dimensions.
How do representations differ or converge between architectures, tasks, and modalities?
question · 0.792
Broader research question MAS is positioned to address, citing multiple recent works.
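The "cosine ≥ 0.65 · no typed edge" rule for this list can be illustrated with a small sketch. It assumes each entity has an embedding vector and that the set of entities already linked by a typed edge is known. The function name `similarity_candidates`, its parameters, and the toy vectors are hypothetical and do not describe how this page was actually generated.

```python
import numpy as np


def similarity_candidates(query, embeddings, labels, typed_neighbors, threshold=0.65):
    """Return (label, cosine) pairs at or above threshold that lack a typed edge.

    query:           1-D embedding of the current entity
    embeddings:      2-D array, one row per other entity
    labels:          names for the rows of embeddings
    typed_neighbors: labels already connected by a typed edge (excluded)
    """
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q  # cosine similarity of every entity to the query
    hits = [
        (labels[i], float(s))
        for i, s in enumerate(scores)
        if s >= threshold and labels[i] not in typed_neighbors
    ]
    # Highest similarity first, matching the ordering of the list above.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = np.array([1.0, 0.0])
    embeddings = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [2.0, 0.1]])
    labels = ["a", "b", "c", "d"]
    # "a" already has a typed edge, and "c" falls below the threshold.
    print(similarity_candidates(query, embeddings, labels, typed_neighbors={"a"}))
```

Entities that pass this filter are candidates only: a high cosine score suggests a possible new edge or an unrecognized duplicate, but does not determine which typed relation applies.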