quote

active

quote:any-goal-or-purpose-can-be-well-thought-of-as-maximization-of-the-expected-value-of-the-cumulative-sum-of-a-received-scalar-signal-reward

"any goal or purpose can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)"

The reward hypothesis underpinning reinforcement learning (RL), quoted from Sutton and Barto.
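
One way to write this objective in symbols is sketched below. The notation is assumed for illustration and does not appear in the quoted passage: a policy \pi, a scalar reward r_t received at step t, and a horizon T. Textbook treatments often add a discount factor; it is omitted here because the quote speaks of a plain cumulative sum.

```latex
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{T} r_t \right]
```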

Source paper

extracted_from

Active inference: demystified and compared

(2021) · Noor Sajid · Philip J. Ball · Thomas Parr · Karl J. Friston

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
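
The selection rule described above (cosine similarity at or above 0.65, with no typed edge to this entity) can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the function names, the `embeddings` dictionary, and the `typed_edges` set of ID pairs are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that have
    no typed edge to the target, sorted by descending score.

    embeddings:  dict mapping entity ID -> embedding vector (hypothetical layout)
    typed_edges: set of (id_a, id_b) pairs that already have a typed relation
    """
    target = embeddings[target_id]
    hits = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        # Skip entities already linked to the target in either direction.
        if (target_id, other_id) in typed_edges or (other_id, target_id) in typed_edges:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((other_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Entities returned by such a filter are the ones worth reviewing by hand, either to add a typed edge or to merge as duplicates.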

acting to optimize value and perception are two aspects of exactly the same principle; namely the minimisation of a quantity [free energy] that bounds the probability of sensory input, given a particular agent or phenotype. · quote · 0.792
Concise statement of the free-energy principle's unification of action and perception.
Under active inference, the ultimate ‘goal’ is to maintain a coherent phenotype and persist over time, not to maximize reward. · claim · 0.777
§3, preference learning discussion.
Mean Cumulative Objective Reward · method · 0.777
Primary performance metric: total food visits across agent lifetime.
Acting to optimize value and perception are two aspects of exactly the same principle: minimization of free energy. · claim · 0.770
Foundational claim unifying action and perception within a single optimization framework.
Acting to maximize value is the same as acting to minimize surprise; value is simply the probability of sensory input expected by an agent. · claim · 0.770
Reinterprets classical reward/value concepts through a free-energy lens.
Curiosity, insight, decision-making, and diverse phenomena can all be accommodated by a single imperative: minimization of expected free energy (resolution of uncertainty). · claim · 0.768
Central thesis of the paper, unifying cognitive phenomena under one objective function.
Optimizing toward the simulation objective does not incentivize instrumentally convergent behaviors the way that reward functions which evaluate trajectories do. · claim · 0.766
Deontological nature of predictive loss.
A model whose objective is prediction can simulate agents who optimize toward any objectives, with any degree of optimality (bounded above but not below by the model's power). · claim · 0.765
Prediction orthogonality thesis.