concept
active
concept:tautology-of-reinforcement-learning
Tautology of Reinforcement Learning
The circular definition in reinforcement learning (RL) in which rewards reinforce the very behaviors that secure those rewards, e.g., going to a cafe because coffee is rewarding.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- §4 Discussion.
- Alternative framework for agent behavior; based on reward maximization rather than free energy minimization.
- Method for fine-tuning LMs based on human preferences; mentioned as combining RL and LMs.
- Machine learning paradigm where agents learn to maximize cumulative reward through interaction.
- Actually training Claude to comply with the conflicting objective using Proximal Policy Optimization.
- Variant of RLHF where human feedback is replaced with AI-generated feedback for harmlessness.
- AI training method inspired by behaviorism, used for autonomous cars and drones; cited as a bioinspired success.
- The idea that copy-cat strategies are dynamic counterparts to classical tautologies like A∨¬A.