concept
active
concept:representational-convergence
Representational Convergence
The central empirical phenomenon: neural networks trained on different data and with different objectives develop increasingly similar representations.
Neighborhood — ranked by edge-count
Papers (1)
paper
Thinkers (1)
thinker
- Phillip Isola · studies
Concepts (5)
concept
- Representational Alignment · associated_with · Measure of similarity between the similarity structures (kernels) induced by two different representations
- Anna Karenina Scenario · extends · Hypothesis that all well-performing neural nets represent the world in the same way; PRH extends this by specifying what representation they converge to
- Simplicity Bias · supports · The tendency of deep networks to implicitly favor simpler solutions that fit the data, driving convergence
- Multitask Scaling · supports · The pressure on models trained on more tasks to find representations that generalize across all tasks, reducing the solution space
- Researcher preferences and goals of mimicking human reasoning shape model development, potentially causing convergence toward human-like representations
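
The Representational Alignment entry above describes comparing the kernels induced by two representations. A minimal sketch of one common way to do that, centered kernel alignment with a linear kernel, is given here. The choice of metric, the linear kernel, and the names `centered_gram` and `kernel_alignment` are assumptions made for illustration; this page does not say which alignment measure the source uses.

```python
import numpy as np


def centered_gram(features: np.ndarray) -> np.ndarray:
    """Linear-kernel Gram matrix of an (n_samples, n_dims) array, double-centered."""
    gram = features @ features.T
    n = gram.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return centering @ gram @ centering


def kernel_alignment(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Normalized Frobenius inner product of the two centered kernels.

    Both inputs must describe the same n samples in the same row order;
    their feature dimensions may differ. Assumes neither representation
    is constant across samples (otherwise the denominator is zero).
    """
    k_a = centered_gram(feats_a)
    k_b = centered_gram(feats_b)
    return float(np.sum(k_a * k_b) / (np.linalg.norm(k_a) * np.linalg.norm(k_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(50, 8))
    # A rotated and rescaled copy induces the same kernel up to a constant.
    rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
    print(kernel_alignment(feats, 2.5 * feats @ rotation))
    # An unrelated random representation should score noticeably lower.
    print(kernel_alignment(feats, rng.normal(size=(50, 8))))
```

Because the score depends only on the kernels, two representations that differ by a rotation or a uniform rescaling score close to 1, which is the sense in which this kind of measure compares similarity structure rather than raw coordinates.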
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Motivates Section 4 where the PMI-kernel formalization is proposed
- What has led to representational convergence, will it continue, and ultimately where does it end? · question · 0.853 · Central motivating questions of the paper
- Core phenomenon studied: when causal interventions shift internal representations away from the natural distribution
- A failure mode exposed by the SAE framework where model representations are entangled or collapse under intervention
- The evolution of an agent's latent representations over the course of training, shown to align with reward improvement when causal emergence is high.
- How familiar a model is with a numeral system, manipulated via bases in Experiment 2.
- Accumulation of mismatch in later layers causing S degradation.
- Property of conscious representations: they do not contain information about the fact that they are representations at the level of the representation itself