concept
active
concept:mesa-optimizer

Mesa-Optimizer

A learned model that is itself an optimizer, produced by a base optimizer during training; transformers have been proposed as mesa-optimizers that implement gradient descent in-context.

Neighborhood — ranked by edge-count

Concepts (1)

concept

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood that have no typed relation to this one: candidates for new edges or unrecognized duplicates.

  • Adam Optimizer · method · 0.776
    Used to optimize the policy and value networks
  • AdamW Optimizer · method · 0.755
    Used to optimize the world model and self-prior
  • Framework for optimizing multiple objectives simultaneously, used in MTL.
  • The drive to reduce expected ambiguity about outcomes given states, leading to seeking well-lit, informative environments.
  • Error minimization · concept · 0.699
    The progressive reduction of error (stress) as cells move toward their target positions.
  • Core principle: acting to maximize value is equivalent to minimizing surprise by sampling environment to conform to expectations.
  • Machine learning approach using evolutionary processes to generate and select designs, used to blur the designed vs. evolved distinction
  • Architecture of Mixtral-8x7B; uses sparse expert routing affecting how hidden states are computed across layers.
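
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the system's actual implementation: the function names, the toy embeddings, and the representation of typed edges as unordered name pairs are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def related_by_similarity(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities that are similar to `target`
    but share no typed edge with it, highest score first.

    embeddings  -- hypothetical dict: entity name -> embedding vector
    typed_edges -- hypothetical set of (name_a, name_b) pairs, treated as undirected
    """
    # Entities already linked to the target by any typed edge are excluded.
    linked = {b if a == target else a
              for a, b in typed_edges if target in (a, b)}
    results = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results

# Toy data, invented purely for illustration.
toy_embeddings = {
    "mesa-optimizer": [1.0, 0.0],
    "a": [1.0, 0.1],   # very similar, no typed edge: kept
    "b": [0.0, 1.0],   # orthogonal: below the threshold
    "c": [1.0, 1.0],   # similar enough, but already has a typed edge
}
toy_edges = {("mesa-optimizer", "c")}
candidates = related_by_similarity("mesa-optimizer", toy_embeddings, toy_edges)
```

Each surviving candidate is then a prompt for review: either it deserves a typed edge, or it may be a duplicate of the entity under another name.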