method
active
method:greedy-policy

ε-greedy Policy

Exploration-exploitation policy used in combination with Q-learning: with probability ε the agent takes a random action, and otherwise it takes the greedy action.
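
A minimal sketch of the selection rule, assuming a discrete action set and a plain list of Q-value estimates for the current state. The names `epsilon_greedy`, `q_values`, and `rng` are illustrative and do not come from the source.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index given Q-value estimates for one state.

    q_values: list of floats, one estimate per action (assumed layout).
    epsilon:  exploration probability in [0, 1].
    rng:      object exposing random() and randrange(); defaults to the
              standard-library random module.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimate
    # (the first such action when several are tied).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical usage: three actions, 10% exploration.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1)
```

With `epsilon=0.0` the rule is purely greedy, and with `epsilon=1.0` it is purely random. In practice ε is often decayed over training, though the source does not specify a schedule.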

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • A heuristic exploration strategy that selects a random action with probability epsilon, otherwise acts greedily.
  • Policy · concept · 0.730
    Sequence of actions considered by the agent; basis for planning.
  • Baseline self-report method that selects the highest-probability token; shown to collapse to a few uninformative values.
  • Policy Selection · concept · 0.703
    Choosing sequences of actions based on expected free energy; the prior probability of a policy is a softmax of its expected free energy.
  • The active sampling of observations to maximize information gain and resolve uncertainty about the environment.
  • Distinguished value initially associated with every key combination in associative memory m; propagates through operations to signal missing values; enables termination of recursive delegation.
  • EconomyAgent · method · 0.686
    Deterministic code agent that models the resource economy, tracking money flows and exploiting cash-poor opponents.
  • Definition of sequential policy in active inference, contrasting with state-action policies in RL.