ε-greedy Policy

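Exploration-exploitation policy used in combination with Q-learning: with probability ε the agent takes a random action, otherwise it takes the action with the highest estimated Q-value.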
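The selection rule can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `epsilon_greedy`, the flat list of Q-values for a single state, and the random tie-breaking among equally good actions are all assumptions made for the example.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick an action index from the Q-values of one state.

    Hypothetical helper: with probability `epsilon` explore (uniform random
    action), otherwise exploit (an action with the highest Q-value).
    """
    if not q_values:
        raise ValueError("q_values must contain at least one action")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")

    # Fall back to the module-level generator when no seeded one is given.
    rng = rng or random

    # Explore: random() is in [0, 1), so epsilon = 0 never explores
    # and epsilon = 1 always does.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))

    # Exploit: break ties at random (an assumption of this sketch)
    # so that no action is favoured by its index alone.
    best = max(q_values)
    candidates = [a for a, q in enumerate(q_values) if q == best]
    return rng.choice(candidates)
```

In a Q-learning loop, `q_values` would be the row of the Q-table (or the network output) for the current state. A common but optional choice is to decay `epsilon` over training so that the agent explores less as its estimates improve.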

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Epsilon-greedy exploration · method · 0.814
A heuristic exploration strategy that selects a random action with probability epsilon, otherwise acts greedily.
Policy · concept · 0.730
Sequence of actions considered by the agent; basis for planning.
Greedy-decoded self-report · method · 0.714
Baseline self-report method that selects the highest-probability token; shown to collapse to a few uninformative values.
Policy Selection · concept · 0.703
Choosing sequences of actions based on expected free energy; the prior probability of a policy is a softmax of its negative expected free energy.
Epistemic Foraging / Exploration · concept · 0.703
The active sampling of observations to maximize information gain and resolve uncertainty about the environment.
Undefined Value ϵ (Epsilon) · concept · 0.697
Distinguished value initially associated with every key combination in associative memory m; it propagates through operations to signal missing values and enables termination of recursive delegation.
EconomyAgent · method · 0.686
Deterministic code agent that models the resource economy, tracking money flows and exploiting cash-poor opponents.
"In active inference a policy is simply a sequence of choices for actions through time" · quote · 0.682
Definition of sequential policy in active inference, contrasting with state-action policies in RL.