thinker:phillip-isola
Phillip Isola
Authored papers (1)
Neural networks trained on different data modalities, architectures, and objectives are converging toward a shared statistical model of reality, which the paper terms the "platonic representation" and formalizes as the pointwise mutual information (PMI) kernel over co-occurring events in the world. Alignment is measured with a mutual nearest-neighbor metric. Across 78 vision models evaluated on the 19-task VTAB benchmark, models that solve more downstream tasks cluster tightly together while weaker models scatter; the top-performing bin is markedly more internally aligned than the lowest. Cross-modal convergence is equally pronounced. Language models from the BLOOM (560M–7.1B parameters), OpenLLaMA (3B–13B), and LLaMA (7B–65B) families are paired against DINOv2, MAE, CLIP, and ImageNet-21K ViTs, with alignment measured on the Wikipedia Image-Text (WIT) dataset. Language modeling performance (1 − bits-per-byte on OpenWebText) predicts vision-language alignment with a near-linear relationship, and LLM alignment with DINOv2 predicts HellaSwag commonsense accuracy linearly and GSM8K math accuracy in an emergence-like step. The paper proposes three selective pressures that drive convergence: the Multitask Scaling Hypothesis (more tasks shrink the feasible solution set), the Capacity Hypothesis (larger models more reliably reach shared optima), and the Simplicity Bias Hypothesis (deep networks implicitly favor low-complexity solutions). The paper argues that modality-agnostic representations are therefore not an artifact of shared training recipes but an attractor determined by the statistical structure of reality itself, with downstream consequences including cross-modal data interchangeability, reduced hallucination at scale, and the practical ease of linear stitching between modalities.
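The mutual nearest-neighbor alignment metric mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `knn_indices` and `mutual_knn_alignment`, the use of cosine similarity, and the default `k=10` are all assumptions made for illustration. The idea is that two models embed the same set of paired inputs, and the score is the average fraction of each input's k nearest neighbors that the two embedding spaces agree on.

```python
import numpy as np


def knn_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each row, by cosine similarity.

    Assumes every row of `feats` is nonzero (hypothetical helper).
    """
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a point is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn_alignment(feats_a: np.ndarray, feats_b: np.ndarray, k: int = 10) -> float:
    """Mean overlap of k-NN sets computed in two feature spaces.

    Row i of `feats_a` and row i of `feats_b` must describe the same
    underlying sample (for example, an image and its caption).
    """
    if feats_a.shape[0] != feats_b.shape[0]:
        raise ValueError("both feature matrices need one row per shared sample")
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))


# Illustrative data only: random features stand in for real model embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 16))
rotation = np.linalg.qr(rng.standard_normal((16, 16)))[0]  # orthogonal matrix
unrelated = rng.standard_normal((200, 32))

print("rotated copy:", mutual_knn_alignment(x, x @ rotation))
print("unrelated features:", mutual_knn_alignment(x, unrelated))
```

A rotated copy of the same features preserves all pairwise cosine similarities, so it should score at or near 1, while independently drawn features should score near chance (roughly k divided by the number of other samples). The metric depends only on neighborhood structure, which is why it can compare embedding spaces of different dimensionality.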
More papers — OpenAlex / S2
Affiliations (1)
- MIT (institute)
Co-authors (3)
- Brian Cheung (9 shared)
- Minyoung Huh (9 shared)
- Tongzhou Wang (9 shared)
Their work is cited by (1)
- Model Alignment Search (3× refs)
Recent mentions (1)
- papers-typedhuh-2024-platonic.md