concept
active
concept:rl-algorithms
RL algorithms
The different reinforcement learning algorithms used across conditions to ensure that the alignment result is not algorithm-specific.
Neighborhood — ranked by edge-count
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Machine learning paradigm where agents learn to maximize cumulative reward through interaction.
- Empirical result: CE measurements correlate with and predict learning performance in RL agents.
- Secondary empirical result: CE-based representational changes correlate with task success.
- Algorithm developed for timing systems in neutron accelerators; generates binary sequences where pulses are distributed as evenly as possible among intervals.
- Alternative framework for agent behavior; based on reward maximization rather than free energy minimization.
- Ancient algorithm from Euclid's Elements (circa 300 B.C.) that computes greatest common divisor; shown to structurally parallel Bjorklund's rhythm generation algorithm.
- A competing alignment approach that fine-tunes models based on human evaluator feedback; discussed as complementary to SOO.
- Machine learning approach using evolutionary processes to generate and select designs; used to blur the designed vs. evolved distinction.