finding

active

finding:q-learning-epsilon-1-decaying-to-0-achieved-average-score-80-44-78-96-81-93-in-deterministic-frozenlake

Q-learning (epsilon=1 decaying to 0) achieved average score 80.44 [78.96, 81.93] in deterministic FrozenLake.

Table 1.
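
For orientation, the kind of agent this finding describes can be sketched as tabular Q-learning with an epsilon-greedy policy whose epsilon decays from 1 to 0. This is a minimal illustration, not the source paper's implementation: the 4x4 layout, the linear decay schedule, the learning rate, the discount factor, the reward of 1 at the goal, and all function names are assumptions made here. The score reported in Table 1 is not reproduced by this sketch.

```python
import random

# Hypothetical 4x4 layout (S=start, F=frozen, H=hole, G=goal); an assumption,
# not necessarily the map used in the source paper.
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
# Actions as (d_row, d_col): left, down, right, up.
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Deterministic transition: move one cell, staying put at the walls."""
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    cell = GRID[row][col]
    reward = 1.0 if cell == "G" else 0.0  # assumed reward scheme
    done = cell in "GH"                   # goal and holes end the episode
    return row * N + col, reward, done


def epsilon_at(episode, n_episodes):
    """Assumed linear decay: 1 in the first episode, 0 in the last."""
    return 1.0 - episode / max(n_episodes - 1, 1)


def train(n_episodes=500, alpha=0.1, gamma=0.95, max_steps=100, seed=0):
    """Tabular Q-learning; returns a 16 x 4 table of action values."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N * N)]
    for episode in range(n_episodes):
        eps = epsilon_at(episode, n_episodes)
        state = 0
        for _ in range(max_steps):
            if rng.random() < eps:
                # Explore: uniformly random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: greedy action, ties broken at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action
            # unless the episode has ended.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q
```

Calling `train()` returns the learned table; the greedy action in state `s` is the index of the largest entry in `q[s]`. Because exploration is purely random early on and vanishes by the final episode, how well such an agent does depends heavily on the decay schedule, which is why it is flagged as an assumption here.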

Source paper

extracted_from

Active inference: demystified and compared

(2021) · Noor Sajid · Philip J. Ball · Thomas Parr · Karl J. Friston

Neighborhood — ranked by edge-count

Papers (1)

paper

Active inference: demystified and compared
mentions

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

In the absence of any reward signal, Q-learning (epsilon=0.1) learns a deterministic circular policy with score 0.00 and does not explore purposefully.
finding · 0.840
Table 2 first row; reward shaping section.
Bayesian model-based RL achieved average score 99.76 [99.45, 100.00] in deterministic FrozenLake.
finding · 0.817
Table 1.
Active Inference null model (no prior preferences) achieved average score 50.03 [49.70, 50.35] in deterministic FrozenLake.
finding · 0.800
Table 1.
Active Inference agent achieved average score 99.88 [99.64, 100.00] in deterministic FrozenLake environment across 200 trials of 500 episodes.
finding · 0.791
Table 1, deterministic environment row.
Active inference and Bayesian model-based RL learn reward-maximizing behavior in <10 episodes in deterministic FrozenLake.
finding · 0.781
Discussion of Figure 3.
In the absence of prior preferences, Active Inference null model and Bayesian RL maintain exploration with average scores of 44.00 and 39.94 respectively, whereas Q-learning does not explore.
finding · 0.766
Table 2 first row; reward shaping section.
Under reward shaping (G=100, H=-100, F=0), Active Inference scored 99.52, Bayesian RL 99.77, Q-learning 95.56, with nearly identical behavior between belief-based agents.
finding · 0.765
Table 2, row 3, showing equivalence when prior preferences match rewards.
All three agent types (active inference, Q-learning, Bayesian RL) perform adequately in stationary FrozenLake; only active inference achieves Bayes-optimal behavior in non-stationary settings.
finding · 0.745
Key empirical result validating online planning capability of active inference.