finding

active

finding:under-reward-shaping-g-100-h-100-f-0-active-inference-scored-99-52-bayesian-rl-99-77-q-learning-95-56-with-nearly-identical-behavior-between-belief-based-agents

Under reward shaping (G=100, H=-100, F=0), Active Inference scored 99.52, Bayesian RL scored 99.77, and Q-learning scored 95.56; the two belief-based agents (Active Inference and Bayesian RL) behaved nearly identically.

Table 2, row 3, showing equivalence when prior preferences match rewards.
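
One way to picture "prior preferences match rewards" is the minimal sketch below. It is an illustration under stated assumptions, not the paper's implementation: it assumes G, H and F denote the goal, hole and frozen tiles of FrozenLake, and it assumes prior preferences are obtained as a log-softmax of the shaped rewards. The names `SHAPED_REWARD` and `log_prior_preferences` are hypothetical.

```python
import math

# Reward shaping values quoted in the finding.
# Assumption: G = goal tile, H = hole tile, F = frozen tile.
SHAPED_REWARD = {"G": 100.0, "H": -100.0, "F": 0.0}


def log_prior_preferences(rewards):
    """Turn rewards into log prior preferences via a log-softmax.

    Assumption for illustration: log P(o) = r(o) - logsumexp(r).
    The max is subtracted first for numerical stability.
    """
    m = max(rewards.values())
    lse = m + math.log(sum(math.exp(r - m) for r in rewards.values()))
    return {outcome: r - lse for outcome, r in rewards.items()}


log_c = log_prior_preferences(SHAPED_REWARD)

# The preference ordering mirrors the reward ordering: G > F > H.
ranking = sorted(log_c, key=log_c.get, reverse=True)
```

Because the log-softmax only subtracts a constant, differences between log preferences equal differences between rewards, so under this assumption a preference-seeking agent and a reward-maximizing agent rank outcomes identically.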

Source paper

extracted_from

Active inference: demystified and compared

(2021) · Noor Sajid · Philip J. Ball · Thomas Parr · Karl J. Friston

Neighborhood — ranked by edge-count

Claims (1)

claim

There is an implicit behavioral equivalence between Bayesian model-based reinforcement learning and active inference when prior preferences are treated as a reward function.
supports
§3, reward shaping conclusion.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

In the absence of prior preferences, Active Inference null model and Bayesian RL maintain exploration with average scores of 44.00 and 39.94 respectively, whereas Q-learning does not explore. · finding · 0.862
Table 2 first row; reward shaping section.
Active inference agents engage in information-seeking behavior in reward-free FrozenLake environments, contrasting with Q-learning but similar to Bayesian RL. · finding · 0.836
Empirical demonstration on FrozenLake; shows epistemic value drives exploration absent reward signal.
Active inference agents can learn their own reward function (prior preferences) by interacting with the environment, bypassing the need for an explicit reward signal. · claim · 0.821
Abstract and §3, preference learning section.
Active inference and Bayesian model-based RL learn reward-maximizing behavior in <10 episodes in deterministic FrozenLake. · finding · 0.818
Discussion of Figure 3.
Active Inference agent achieved average score 99.88 [99.64, 100.00] in deterministic FrozenLake environment across 200 trials of 500 episodes. · finding · 0.818
Table 1, deterministic environment row.
In active inference, reward can simply be treated as another observation we have a preference over, rather than a special signal. · claim · 0.816
Abstract; central distinction.
All three agent types (active inference, Q-learning, Bayesian RL) perform adequately in stationary FrozenLake; only active inference achieves Bayes-optimal behavior in non-stationary settings. · finding · 0.808
Key empirical result validating online planning capability of active inference.
Active Inference null model (no prior preferences) achieved average score 50.03 [49.70, 50.35] in deterministic FrozenLake. · finding · 0.806
Table 1.