Optimal Reward Framework

Framework from Singh, Lewis, and Barto (2009), used to select the best-performing reward functions via grid search.
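
A minimal sketch of the selection step described above, assuming each candidate reward function is a point on a parameter grid and is scored by a fitness measure, with the highest-scoring candidate kept. The names `grid_search_rewards` and `toy_fitness` and the two-weight parameterization are illustrative assumptions, not taken from Singh, Lewis, and Barto 2009. In practice the fitness call would run an agent under the candidate reward and score the resulting history.

```python
from itertools import product


def toy_fitness(weights):
    """Hypothetical stand-in for running an agent with this reward and scoring it.

    Peaks at weights == (0.5, 0.25). A real evaluation would average fitness
    over sampled environments and agent lifetimes.
    """
    w1, w2 = weights
    return -((w1 - 0.5) ** 2 + (w2 - 0.25) ** 2)


def grid_search_rewards(grid_axes, fitness):
    """Return the reward-parameter tuple with the highest fitness on the grid."""
    candidates = product(*grid_axes)  # every combination of per-axis values
    return max(candidates, key=fitness)


# Assumed grid: five values for the first weight, three for the second.
axes = [[0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 0.25, 0.5]]
best = grid_search_rewards(axes, toy_fitness)
print(best)
```

Exhaustive search is only practical when the grid is small, since the number of candidates is the product of the axis lengths.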

Neighborhood — ranked by edge-count

Singh, Lewis, and Barto 2009 · cites
Source of the optimal reward framework, used to evaluate and select the best reward functions.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Reward Function · concept · 0.785
In RL, a scalar signal from the environment that defines the agent's goal; in active inference, reward is just another observation with associated preference.
Reward Function Categories · method · 0.764
Seven categories determined by which components of f[h] are activated: Objective only, Expect only, Compare only, and combinations.
Reward Hypothesis · concept · 0.758
The claim in RL that any goal can be expressed as maximizing the expected cumulative sum of a scalar reward signal.
Seven Reward Function Groups · concept · 0.731
The seven categories (Objective only, Expect only, Compare only, and four combinations) structuring the experiment.
Reward improvement · concept · 0.729
The increase in reward during training, whose dynamics align with those of causal emergence in successful agents.
Traditional RL frameworks optimize externally defined reward functions lacking representational depth for mental-state reasoning · claim · 0.728
Motivation claim positioning this paper against standard RL approaches.
Goal Directed Framework · framework · 0.727
Framework · concept · 0.717
1984 Ashton-Tate integrated system with frames, FRED language, and overlapping windows; design reference for Playground's approach.
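
The candidate filter behind this list (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is an illustration under assumptions: the function names, the two-dimensional embedding vectors, and the set of typed edges are hypothetical, and the real system's embedding model and storage are not described on this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Entities at or above the threshold with no typed edge, highest score first."""
    scored = [
        (name, cosine(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target and name not in typed_edges
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only.
embeddings = {
    "Optimal Reward Framework": [1.0, 0.0],
    "Reward Function": [0.8, 0.6],
    "Singh, Lewis, and Barto 2009": [1.0, 0.1],  # already linked by "cites"
    "Unrelated": [0.0, 1.0],
}
typed = {"Singh, Lewis, and Barto 2009"}
result = untyped_neighbors("Optimal Reward Framework", embeddings, typed)
print(result)
```

A high score only signals embedding proximity, so a same-named but unrelated entity can pass the threshold and still needs manual review before an edge is added.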