finding
active
finding:pmi-computed-from-color-cooccurrences-in-cifar-10-images-yields-a-perceptual-color-representation-closely-matching-both-cielab-space-and-language-model-embeddings-simcse-roberta
PMI computed from color cooccurrences in CIFAR-10 images yields a perceptual color representation closely matching both CIELAB space and language model embeddings (SimCSE, RoBERTa)
Validates theoretical PMI convergence claim on real data
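As a rough illustration of the quantity this finding refers to, the sketch below counts how often two quantized colors appear in adjacent pixels and converts the counts to pointwise mutual information, PMI(i, j) = log p(i, j) / (p(i) p(j)). The 4-neighbor adjacency rule, the pre-quantized color indices, and the function names `count_adjacent_pairs` and `pmi_from_cooccurrence` are assumptions made for this example, not details taken from the source paper.

```python
import numpy as np

def count_adjacent_pairs(idx, n_colors):
    """Symmetric cooccurrence counts over horizontally and vertically
    adjacent pixels. `idx` is a 2-D array of integer color-bin indices
    (the quantization scheme is assumed to be chosen elsewhere)."""
    idx = np.asarray(idx)
    counts = np.zeros((n_colors, n_colors))
    neighbor_pairs = [
        (idx[:, :-1], idx[:, 1:]),   # left/right neighbors
        (idx[:-1, :], idx[1:, :]),   # up/down neighbors
    ]
    for a, b in neighbor_pairs:
        # Count each pair in both directions so the matrix is symmetric.
        np.add.at(counts, (a.ravel(), b.ravel()), 1)
        np.add.at(counts, (b.ravel(), a.ravel()), 1)
    return counts

def pmi_from_cooccurrence(counts, eps=1e-12):
    """PMI matrix from a cooccurrence count matrix.
    `eps` keeps the log finite for pairs that never cooccur."""
    counts = np.asarray(counts, dtype=float)
    p_joint = counts / counts.sum()
    p_row = p_joint.sum(axis=1, keepdims=True)
    p_col = p_joint.sum(axis=0, keepdims=True)
    return np.log((p_joint + eps) / (p_row * p_col + eps))

# Toy usage: a 3x3 "image" with two color bins.
toy = np.array([[0, 0, 1],
                [0, 0, 1],
                [1, 1, 1]])
pmi = pmi_from_cooccurrence(count_adjacent_pairs(toy, n_colors=2))
```

Counts accumulated over a whole image collection would be summed before calling `pmi_from_cooccurrence`; a low-dimensional color embedding could then be fitted so that its inner products approximate the PMI matrix.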
Source paper
extracted_from · (2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola
Neighborhood — ranked by edge-count
Claims (2)
claim
- Core theoretical claim about the target of representation learning
- Empirical validation that PMI convergence actually occurs on real data
Methods (1)
method
- CIELAB Color Space · associated_with · Perceptually uniform color space used as the ground-truth perceptual representation in the color cooccurrence experiment
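For readers who want to reproduce the CIELAB side of such a comparison, here is a minimal sketch of the standard sRGB to CIELAB conversion (D65 white point) together with the Euclidean CIE76 color difference. The function names `srgb_to_lab` and `delta_e76` are chosen for this example, and the choice of CIE76 as the reference distance is an assumption; this entry does not say which distance the source paper used.

```python
import numpy as np

# Linear sRGB -> CIE XYZ matrix for the D65 white point.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])

def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (last axis = R, G, B) to CIELAB."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo the sRGB gamma encoding.
    linear = np.where(rgb <= 0.04045,
                      rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    t = (linear @ _RGB_TO_XYZ.T) / _WHITE_D65
    delta = 6 / 29
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def delta_e76(rgb1, rgb2):
    """Euclidean distance between two sRGB colors in CIELAB (CIE76)."""
    return np.linalg.norm(srgb_to_lab(rgb1) - srgb_to_lab(rgb2), axis=-1)

# Toy usage: distance between pure red and pure green.
d_red_green = delta_e76([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Pairwise `delta_e76` values over a set of colors give a reference distance matrix that a learned color representation can be compared against.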
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Case study confirming that PMI-based learning in different modalities recovers the same perceptual representation
- Mathematical formalization of what representation models converge to
- Characterizes internal structure of the six scoring dimensions
- Validates use of lightweight classifiers as replacement for frontier LLM evaluation during alpha sweeps
- Real brain imaging result suggesting a compressed emergent representation
- Key benefit of the denotational design for images
- Empirical evidence for the universality hypothesis cited as supporting the possibility of convergent consciousness-like solutions
- On CIFAR-10, larger models exhibit greater alignment with each other compared to smaller ones · finding · 0.741 · Kornblith et al. / Krizhevsky finding replicated in paper discussion