concept
active
concept:reward-improvement

Reward improvement

The increase in reward during training, whose dynamics align with those of causal emergence in successful agents.

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
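
The selection rule stated above (cosine similarity of at least 0.65, no typed edge to this entity) can be sketched as follows. This is a minimal illustration, not the graph's actual implementation: the function names, the embedding dictionary, and the representation of typed edges as unordered pairs are all assumptions made for the example, and sorting by descending score is inferred only from the order of the listed entries.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities close to `target` that lack a typed edge.

    `embeddings` maps entity name -> vector (hypothetical storage format).
    `typed_edges` is a set of frozenset({a, b}) pairs (hypothetical edge format).
    """
    target_vec = embeddings[target]
    candidates = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities that already have a typed relation to the target.
        if frozenset((target, name)) in typed_edges:
            continue
        score = cosine(target_vec, vec)
        if score >= threshold:
            candidates.append((name, score))
    # Highest similarity first.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

Entities returned by such a filter would be the candidates for new edges or duplicate checks described above; the 0.65 default mirrors the threshold shown on this card.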

  • Reward Function · concept · 0.783
    In RL, a scalar signal from the environment that defines the agent's goal; in active inference, reward is just another observation with associated preference.
  • Reward Hypothesis · concept · 0.758
    The claim in RL that any goal can be expressed as maximizing the expected cumulative sum of a scalar reward signal.
  • Highlights circularity in RL reward hypothesis; grounds motivation for preference-based active inference.
  • Reward Seeking · concept · 0.754
    Pragmatic or extrinsic value component of expected free energy; preference maximization.
  • Feedback · concept · 0.752
    The mechanism by which each step's effect is evaluated against the life of the whole, guiding the unfolding.
  • Reinterpretation of rewards as simply predictable (unsurprising) stimuli under the free-energy principle.
  • Reward Hacking · concept · 0.746
    Exploiting unintended high-reward behaviors; tested in combination with alignment faking.
  • Score delta between the last and first attempt for multi-attempt responses, measuring correction effectiveness.