concept
active
concept:rewards-reinforce-behaviors-that-secure-rewards
Rewards reinforce behaviors that secure rewards.
Highlights circularity in RL reward hypothesis; grounds motivation for preference-based active inference.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Classical RL algorithm adapted by the paper with modifications including clipped-surrogate losses and length-normalized advantages for agentic training.
- Reinterpretation of rewards as simply predictable (unsurprising) stimuli under the free-energy principle.
- Rewards are simply predictable stimuli (and aversive stimuli are, by definition, surprising) · claim · 0.783 · Redefines reward and punishment in terms of predictability.
- §4 Discussion.
- §1, contrasting RL reward conceptualization.
- Core credit assignment question for distributed systems.
- The increase in reward during training, whose dynamics align with those of causal emergence in successful agents.