concept
active
concept:universality-hypothesis
Universality Hypothesis
The hypothesis that analogous features and circuits reliably form across different neural network models and tasks
Neighborhood — ranked by edge-count
Papers (1)
paper
- Zoom In: An Introduction to Circuits · introduces
Thinkers (1)
thinker
- Chris Olah · introduces · Co-author; provided high-level research guidance, wrote introduction/discussion.
Claims (1)
claim
- Third of three speculative claims asserting that learned features are not model-specific but represent universal solutions to learning problems
Hypotheses (2)
hypothesis
- Extension of the Universality Hypothesis to consciousness: if consciousness solves a well-defined computational problem, different systems will discover it independently
- Paper's uncertain extension of mechanistic interpretability universality to consciousness
Findings (2)
finding
- Empirical evidence for the universality hypothesis cited as supporting the possibility of convergent consciousness-like solutions
- Empirical finding supporting the Universality Hypothesis; extended by the paper to consciousness
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Property of features that form consistently across different models trained on the same or similar data, suggesting features are real representational units
- The conjecture that consciousness does not result from the organized mind but creates and maintains complex models of reality; forms at the beginning of mental development
- Speculative extension of universality to neuroscience, with high-low frequency detectors as a candidate prediction
- Core theoretical framework: neural networks represent more features than neurons by encoding features as directions in superposition
- Key open question linking mechanistic interpretability universality to machine consciousness
- The claim in RL that any goal can be expressed as maximizing the expected cumulative sum of a scalar reward signal.
- The overarching hypothesis that an I or self-like ground underlies matter and becomes visible in living things.
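The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be sketched as a small filter. This is only an illustration under stated assumptions: the page does not say how embeddings are computed or stored, so the entity ids, the two-dimensional vectors, and the function name `related_by_similarity` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_by_similarity(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, score) pairs with cosine >= threshold and no typed edge to target."""
    # Entities that already share a typed edge with the target, in either direction.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    hits = []
    for entity, vec in embeddings.items():
        if entity == target or entity in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((entity, score))
    # Most similar first.
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical toy data: ids and vectors are made up for illustration.
embeddings = {
    "universality-hypothesis": [1.0, 0.0],
    "feature-universality": [0.9, 0.1],
    "zoom-in-paper": [0.8, 0.2],
    "unrelated": [0.0, 1.0],
}
typed_edges = [("zoom-in-paper", "universality-hypothesis")]

candidates = related_by_similarity("universality-hypothesis", embeddings, typed_edges)
```

In this toy data, `zoom-in-paper` is similar to the target but is excluded because it already has a typed edge, which mirrors the page's framing of this list as candidates for new edges or unrecognized duplicates.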