hypothesis
active
hypothesis:multitask-scaling-hypothesis · Multitask Scaling Hypothesis
Argues that fewer representations are competent for N tasks than for M < N tasks, so more general models have a smaller solution space.
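The counting argument can be illustrated with a toy sketch. Everything in it is a hypothetical assumption chosen for illustration, not something taken from the source paper: the 60 integer-labelled candidate "representations", the three tasks, and the divisibility rule that decides competence. It shows only that the set of representations competent for the first n tasks is an intersection, so it cannot grow as tasks are added.

```python
from itertools import accumulate

# Hypothetical toy setup: 60 candidate "representations", labelled 0..59.
candidates = set(range(60))

# Hypothetical tasks: a representation r counts as competent for task k
# when r is divisible by k (an arbitrary stand-in for "solves the task").
tasks = [2, 3, 5]
competent = [{r for r in candidates if r % k == 0} for k in tasks]

# Representations competent for the first n tasks are the intersection
# of the first n per-task sets.
solution_sets = list(accumulate(competent, lambda a, b: a & b))
sizes = [len(s) for s in solution_sets]

print(sizes)  # [30, 10, 2]
```

Intersection alone guarantees only that the sizes never increase; the strict shrinkage seen here comes from the toy's choice of tasks.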
Source paper
extracted_from · (2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola
Neighborhood — ranked by edge-count
Papers (1)
paper
- The Platonic Representation Hypothesis · introduces
Findings (1)
finding
- Key empirical finding establishing that representational alignment correlates with model competence
Concepts (5)
concept
- Contravariance Principle · extends, supports · Cao & Yamins principle: the solution set for an easy goal is large, for a challenging goal comparatively smaller; cited as the theoretical basis for the multitask scaling hypothesis
- Platonic Representation · supports · The hypothesized converged representation that all sufficiently large AI models are approaching: a statistical model of underlying reality
- Power law scaling · supports · Observation that SAE loss decreases as a power law with compute budget.
- Masked Autoencoders · supports · Self-supervised learning method that optimizes reconstruction tasks; included in the paper's analysis as a multi-task objective
- Autoregressive Language Modeling · supports · Training objective interpretable as optimizing a diverse set of tasks; thus subject to multitask scaling convergence pressures
Hypotheses (1)
hypothesis
- Capacity Hypothesis · associated_with · Bigger models are more likely to converge to a shared representation than smaller models because they can better approximate the global optimum
Quotes (1)
quote
- The paper's central thesis statement, presented prominently after the abstract
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The pressure on models trained on more tasks to find representations that generalize across all tasks, reducing the solution space
- Framework dividing covariance of character and fitness into between-collective and within-collective components; addresses limitation of kin selection.
- Core testable hypothesis of UCCT about the nature of performance transitions under anchoring
- Cited hypothesis from Lin et al. 2022 suggesting larger models become more capable of deception
- Techniques that leverage AI to help humans more efficiently supervise AI.
- Processes scaling goals and stressors form positive feedback loop with modularity; both arise from and potentiate power of evolution, enabling specific predictions for cognitive capacity scaling.
- Central thesis about the role of agency in evolutionary dynamics.