Inverse Reinforcement Learning

Value learning method that infers a reward function from expert demonstrations; reviewed as insufficient for aligning superintelligent systems.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Reinforcement Learning · framework · 0.868
Alternative framework for agent behavior; based on reward maximization rather than free energy minimization.
Deep Reinforcement Learning · method · 0.835
AI training method inspired by behaviorism, used for autonomous cars and drones; cited as bioinspired success.
Reinforcement Learning from Human Feedback · method · 0.829
Method for fine-tuning LMs based on human preferences; mentioned as combining RL and LMs.
Reinforcement Learning from AI Feedback · framework · 0.817
Variant of RLHF where human feedback is replaced with AI-generated feedback for harmlessness.
Monte-Carlo reinforcement learning · method · 0.816
Reinforcement learning methods that update parameters at the end of an episode based on sampled returns.
Reinforcement Learning for Tissues · method · 0.809
Proposed experimental paradigm to train morphogenesis using rewards and punishments, treating tissues as learning agents.
Reinforcement learning of tissues · concept · 0.807
The hypothesis that cellular collectives can be trained via rewards/punishments to produce specific morphological outcomes.
Reinforcement learning (RL) · concept · 0.800
Machine learning paradigm where agents learn to maximize cumulative reward through interaction.