Reinforcement learning (RL)

Machine learning paradigm in which agents learn to maximize cumulative reward through interaction with an environment.

Neighborhood — ranked by edge-count

paper

framework

Active Inference
associated_with
Foundational framework by Karl Friston; the paper extends it to three hierarchical levels for modeling meta-awareness.
Bayesian Model-Based Reinforcement Learning
implements
RL variant that maintains beliefs over the environment model; compared to active inference using Thompson sampling.
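
As an illustration of the Thompson sampling idea mentioned above, here is a minimal sketch on a Bernoulli bandit rather than a full model-based RL agent. The bandit, its success probabilities, and the function name `thompson_sampling` are hypothetical and not taken from the paper; the point is only the loop of sampling from the posterior, acting greedily on the sample, and updating the belief.

```python
import random

def thompson_sampling(true_probs, n_rounds=2000, seed=0):
    """Beta-Bernoulli Thompson sampling on a hypothetical multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior (uniform) on each arm's unknown success probability.
    successes = [1] * n_arms
    failures = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its current posterior.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled parameters.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update of the chosen arm's Beta posterior.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Assumed success probabilities, chosen only for illustration.
pulls = thompson_sampling([0.2, 0.5, 0.8])
```

Exploration here comes from posterior uncertainty alone: arms with few observations have wide posteriors and are occasionally sampled high, so no separate exploration rate is needed.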

method

Q-learning
implements
Model-free RL algorithm used in experimental comparison; employs ε-greedy exploration.
Thompson Sampling
associated_with
A Bayesian exploration strategy that samples from the posterior distribution over model parameters to decide actions.
Epsilon-greedy exploration
associated_with
A heuristic exploration strategy that selects a random action with probability epsilon, otherwise acts greedily.
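
The Q-learning and ε-greedy entries above can be sketched together as tabular Q-learning. The five-state chain environment, the hyperparameters, and all function names below are assumptions made for illustration; they are not the experimental setup of the paper.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties randomly among the greedy actions.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])

def step(state, action, n_states):
    """Hypothetical chain: action 1 moves right, action 0 moves left.
    Reaching the last state yields reward 1 and ends the episode."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            # Off-policy TD target: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(range(2), key=lambda a: row[a]) for row in q[:-1]]
```

Under these assumed settings the greedy policy should move right in every non-terminal state. The algorithm is model-free in the sense that `q_learning` only ever sees sampled transitions from `step` and never inspects the transition rule itself.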

concept

AE-1: Agency: Learning from feedback and flexible responsiveness to competing goals
associated_with
Indicator of agency requiring goal pursuit and flexibility.
Reward Hypothesis
associated_with
The claim in RL that any goal can be expressed as maximizing the expected cumulative sum of a scalar reward signal.
State-Action Policies
associated_with
In reinforcement learning, a policy maps states to actions, specifying behavior at each state.
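
In symbols, the policy and reward-hypothesis entries above are often written as follows. The notation is the common textbook convention and is assumed here, not taken from the source; in particular the discount factor \gamma is an addition, and setting \gamma = 1 recovers the plain cumulative sum for episodic tasks.

```latex
% Assumed notation: \mathcal{S} states, \mathcal{A} actions,
% R_{t+1} scalar reward, \gamma \in [0, 1] discount factor.
\begin{align}
  \pi &: \mathcal{S} \to \mathcal{A}, \qquad a_t = \pi(s_t) \\
  G_t &= \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \\
  J(\pi) &= \mathbb{E}_{\pi}\left[ G_0 \right], \qquad
  \pi^{*} = \arg\max_{\pi} J(\pi)
\end{align}
```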

artifact

Modified OpenAI Gym FrozenLake environment
about
A 3×3 grid world with start, frozen, hole, and goal states used for comparing active inference and RL agents.
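
A grid world of this kind can be sketched in a few lines of plain Python. The tile layout, the deterministic transitions, and the reward of 1 at the goal are all assumptions; the source does not give the actual map or dynamics, and this sketch does not use the Gym API.

```python
# Hypothetical 3x3 layout: S = start, F = frozen, H = hole, G = goal.
# The real map used in the paper is not given in the source.
GRID = ["SFF",
        "FHF",
        "FFG"]
MOVES = {"left": (0, -1), "down": (1, 0), "right": (0, 1), "up": (-1, 0)}

def step(pos, action):
    """Deterministic transition; returns (new_pos, reward, done)."""
    row, col = pos
    d_row, d_col = MOVES[action]
    # Clamp to the grid so a move into a wall leaves the agent in place.
    row = min(max(row + d_row, 0), 2)
    col = min(max(col + d_col, 0), 2)
    tile = GRID[row][col]
    reward = 1.0 if tile == "G" else 0.0
    done = tile in "HG"  # episode ends in a hole or at the goal
    return (row, col), reward, done

def rollout(actions, start=(0, 0)):
    """Apply a fixed action sequence until it ends or the episode terminates."""
    pos, total, done = start, 0.0, False
    for action in actions:
        pos, reward, done = step(pos, action)
        total += reward
        if done:
            break
    return pos, total, done

safe = rollout(["right", "right", "down", "down"])
fall = rollout(["down", "right"])
```

With this assumed layout, `safe` walks around the hole to the goal and `fall` steps into the hole, which is the distinction an agent must learn from sparse reward.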

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Reinforcement Learning · framework · 0.899
Alternative framework for agent behavior; based on reward maximization rather than free energy minimization.
Reinforcement Learning from Human Feedback (RLHF) · framework · 0.850
A competing alignment approach that fine-tunes models based on human evaluator feedback; discussed as complementary to SOO.
RL algorithms · concept · 0.840
The different reinforcement learning algorithms used across conditions, to ensure the alignment result is not algorithm-specific.
Deep Reinforcement Learning · method · 0.838
AI training method inspired by behaviorism, used for autonomous cars and drones; cited as bioinspired success.
Reinforcement Learning from Human Feedback · method · 0.825
Method for fine-tuning LMs based on human preferences; mentioned as combining RL and LMs.
Reinforcement learning of tissues · concept · 0.823
The hypothesis that cellular collectives can be trained via rewards/punishments to produce specific morphological outcomes.
"reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal" · quote · 0.807
Operational definition of RL used throughout the paper, quoted from Sutton.
Reinforcement Learning for Tissues · method · 0.803
Proposed experimental paradigm to train morphogenesis using rewards and punishments, treating tissues as learning agents.