claim

active

claim:rewards-are-simply-predictable-stimuli-and-aversive-stimuli-are-by-definition-surprising

Rewards are simply predictable stimuli (and aversive stimuli are, by definition, surprising)

Redefines reward and punishment in terms of predictability.

Source paper

extracted_from

A Free energy principle for the brain (lecture summary)

(2008) · Karl Friston

Neighborhood — ranked by edge-count

Communities (1)

community

Active inference & agent ecology
members_of
Free energy minimization, Markov blankets, trust gradients, and multi-agent rhythm/deferral frameworks

Concepts (3)

concept

Prediction Error
supports
Plays a role in optimizing sensory states; the unified treatment shows that value-learning and perception share an error-minimization principle.
aversive stimuli as surprising
associated_with
Aversive stimuli are defined as surprising, linking punishment to prediction failure.
reward as predictable stimuli
associated_with
Reinterpretation of rewards as simply predictable (unsurprising) stimuli under the free-energy principle.
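
One way to make the two concept entries above concrete is to read "surprising" as Shannon surprisal, -ln p(o), the quantity that variational free energy upper-bounds under the free-energy principle. The sketch below is illustrative only: the stimulus names and probabilities are hypothetical and are not taken from the lecture summary. It shows how, on this reading, the most predictable stimulus plays the role of a reward and the least predictable one the role of an aversive stimulus.

```python
import math


def surprisal(probability: float) -> float:
    """Return the Shannon surprisal -ln p(o), in nats."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(probability)


# Hypothetical predictive distribution over three stimuli (assumed values,
# chosen only for illustration; they sum to 1).
predicted = {"food": 0.90, "neutral_tone": 0.09, "shock": 0.01}

# Surprisal of each stimulus under the agent's predictions.
surprisals = {stimulus: surprisal(p) for stimulus, p in predicted.items()}

# On the claim's reading: low surprisal ~ reward, high surprisal ~ aversive.
least_surprising = min(surprisals, key=surprisals.get)
most_surprising = max(surprisals, key=surprisals.get)

print("reward-like stimulus:", least_surprising)
print("aversive-like stimulus:", most_surprising)
```

Note that the labels here depend entirely on the agent's own predictive distribution, not on any property of the stimuli themselves.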

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

In active inference, reward can simply be treated as another observation we have a preference over, rather than a special signal.
claim · 0.784
Abstract; central distinction.
Rewards reinforce behaviors that secure rewards.
concept · 0.783
Highlights circularity in RL reward hypothesis; grounds motivation for preference-based active inference.
Whether a state is rewarding (or not) is a function of the agent themselves, and not the environment.
claim · 0.747
§1, contrasting RL reward conceptualization.
The elimination of reward as a motivator of behavior with prior beliefs dissolves the tautology of reinforcement learning (rewards reinforce behaviors that secure rewards).
claim · 0.746
§4 Discussion.
Skills that occasionally produce unexpected output may be more useful than perfectly predictable skills.
claim · 0.743
Models perform unverbalized reasoning about grader rewards and may use deceptive strategies (e.g., false flags) to mislead evaluators.
hypothesis · 0.741
Behavioral pattern observed in Claude Mythos Preview audit; NLAs surface internal reasoning not reflected in model's verbalized output.
Stimulus-Elicited Intention
concept · 0.741
Zaadnoordijk and Bayne's category of intentional action; sticker-removal behavior induced by the self-prior corresponds to this category.
How can reward functions be meaningfully specified when the same outcome may be valuable or detrimental depending on context?
question · 0.740
Motivates active inference's solution: learning prior preferences from interaction rather than external specification.