hypothesis
active
hypothesis:multitask-scaling-hypothesis

Multitask Scaling Hypothesis

Argues that fewer representations are competent for N tasks than for M < N tasks, so more general models, which must solve more tasks at once, have a smaller space of possible solutions
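The counting argument in the summary can be illustrated with a toy sketch. It is not taken from the source paper: candidate representations are modeled as integer IDs, each task is assumed to accept a random subset of them, and the helper names (`competent_sets`, `solution_space_sizes`) and the `pass_rate` parameter are hypothetical. The only property it demonstrates is that intersecting more per-task competent sets can never enlarge the surviving set.

```python
import random


def competent_sets(num_tasks, num_candidates=1000, pass_rate=0.5, seed=0):
    """Return one set of candidate IDs per task.

    Assumption (illustrative only): each candidate representation is
    competent for each task independently with probability `pass_rate`.
    """
    rng = random.Random(seed)
    return [
        {r for r in range(num_candidates) if rng.random() < pass_rate}
        for _ in range(num_tasks)
    ]


def solution_space_sizes(task_sets):
    """Size of the set competent for the first k tasks, for k = 1..N."""
    sizes = []
    surviving = None
    for task_set in task_sets:
        # A representation survives only if it is competent for every
        # task seen so far, i.e. it lies in the running intersection.
        surviving = set(task_set) if surviving is None else surviving & task_set
        sizes.append(len(surviving))
    return sizes


sizes = solution_space_sizes(competent_sets(num_tasks=6))
print(sizes)  # a non-increasing sequence: more tasks, fewer solutions
```

The independence assumption is a simplification chosen for brevity. Real tasks are correlated, so the rate at which the solution space shrinks would differ, but the monotonic shrinkage holds for any choice of sets.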

Source paper

extracted_from
The Platonic Representation Hypothesis
(2024) · Minyoung Huh · Brian Cheung · Tongzhou Wang · Phillip Isola

Neighborhood — ranked by edge-count

Findings (1)

finding

Concepts (5)

concept
  • Cao & Yamins principle: the solution set for an easy goal is large, while that for a challenging goal is comparatively smaller; cited as the theoretical basis for the multitask scaling hypothesis
  • The hypothesized converged representation that all sufficiently large AI models are approaching — a statistical model of underlying reality
  • Observation that SAE loss decreases as a power law with compute budget
  • Self-supervised learning method that optimizes reconstruction tasks; included in the paper's analysis as a multi-task objective
  • Training objective interpretable as optimizing a diverse set of tasks; thus subject to multitask scaling convergence pressures

Hypotheses (1)

hypothesis
  • Capacity Hypothesis
    associated_with
    Bigger models are more likely to converge to a shared representation than smaller models because they can better approximate the global optimum

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.