finding

active

finding:active-inference-and-bayesian-model-based-rl-learn-reward-maximizing-behavior-in-10-episodes-in-deterministic-frozenlake

Active inference and Bayesian model-based RL learn reward-maximizing behavior in fewer than 10 episodes in deterministic FrozenLake.

Discussion of Figure 3.

Source paper

extracted_from

Active inference: demystified and compared

(2021) · Noor Sajid · Philip J. Ball · Thomas Parr · Karl J. Friston

Neighborhood — ranked by edge-count

Papers (1)

paper

Active inference: demystified and compared
mentions

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Active inference recovers performance within 1 episode after context switch in non-stationary FrozenLake, while Bayesian RL requires ~40 episodes. · finding · 0.876
Figure 4 and discussion in §3.
Active inference agents engage in information-seeking behavior in reward-free FrozenLake environments, contrasting with Q-learning but similar to Bayesian RL. · finding · 0.875
Empirical demonstration on FrozenLake; shows epistemic value drives exploration absent reward signal.
All three agent types (active inference, Q-learning, Bayesian RL) perform adequately in stationary FrozenLake; only active inference achieves Bayes-optimal behavior in non-stationary settings. · finding · 0.867
Key empirical result validating online planning capability of active inference.
Bayesian model-based RL achieved average score 99.76 [99.45, 100.00] in deterministic FrozenLake. · finding · 0.839
Table 1.
Under reward shaping (G=100, H=-100, F=0), Active Inference scored 99.52, Bayesian RL 99.77, Q-learning 95.56, with nearly identical behavior between belief-based agents. · finding · 0.818
Table 2, row 3, showing equivalence when prior preferences match rewards.
Active Inference null model (no prior preferences) achieved average score 50.03 [49.70, 50.35] in deterministic FrozenLake. · finding · 0.818
Table 1.
Active Inference agent achieved average score 99.88 [99.64, 100.00] in deterministic FrozenLake environment across 200 trials of 500 episodes. · finding · 0.817
Table 1, deterministic environment row.
There is an implicit behavioral equivalence between Bayesian model-based reinforcement learning and active inference when prior preferences are treated as a reward function. · claim · 0.800
§3, reward shaping conclusion.
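
The "Related by similarity" listing above is selected by two criteria stated on this page: cosine similarity of at least 0.65 and no typed edge to this entity. A minimal sketch of such a filter might look like the following. The function names, the `embeddings` dictionary, and the `typed_edges` pair list are hypothetical and chosen for illustration only; how this page actually computes its embeddings or stores edges is not stated here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs above the threshold with no typed edge.

    embeddings:  hypothetical dict mapping entity id -> embedding vector.
    typed_edges: hypothetical iterable of (source_id, target_id) pairs.
    threshold:   0.65 matches the cutoff shown on this page.
    """
    target_vec = embeddings[target_id]

    # Collect every entity already connected to the target by a typed edge,
    # in either direction, so it can be excluded from the candidates.
    linked = set()
    for src, dst in typed_edges:
        if src == target_id:
            linked.add(dst)
        elif dst == target_id:
            linked.add(src)

    hits = []
    for entity_id, vec in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        score = cosine(target_vec, vec)
        if score >= threshold:
            hits.append((entity_id, score))

    # Highest similarity first, as in the listing above.
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits
```

With this shape, an entity that already has a typed edge (such as the source paper, linked by `mentions`) is excluded even if its embedding is nearly identical, which is why the listing surfaces only candidates for new edges or possible duplicates.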