claim
active
claim:gpt-does-not-generate-rollouts-during-training-so-there-is-no-reason-to-expect-that-gpt-will-form-preferences-over-the-consequences-of-its-output-related-to-the-text-prediction-objective
GPT does not generate rollouts during training, so there is no reason to expect that GPT will form preferences over the consequences of its output related to the text prediction objective.
Argues against instrumental convergence in GPT.
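A minimal sketch of what "no rollouts during training" can look like, assuming the standard teacher-forced next-token objective. The function `teacher_forced_loss` and the toy bigram table `probs` are hypothetical illustrations, not taken from the source paper.

```python
import math

def teacher_forced_loss(probs, tokens):
    """Average next-token cross-entropy over a fixed training sequence.

    probs:  hypothetical bigram table, probs[prev][nxt] = P(nxt | prev)
    tokens: ground-truth sequence taken from the training corpus
    """
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        # The context is always the ground-truth token `prev`.
        # The model's own samples are never fed back in, so the loss
        # never depends on what the model's output would lead to next.
        total += -math.log(probs[prev][nxt])
    return total / (len(tokens) - 1)

# Toy two-token "model" (illustrative values only, an assumption).
probs = {
    "a": {"a": 0.25, "b": 0.75},
    "b": {"a": 0.5, "b": 0.5},
}
loss = teacher_forced_loss(probs, ["a", "b", "a"])
```

Each prediction here is scored against the corpus in isolation. A rollout-based objective would instead sample a token, append it to the context, and score what follows, which is the step the claim says is absent from training.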
Source paper
extracted_from
Neighborhood, ranked by edge count
Frameworks (1)
framework
- Instrumental convergence (contradicts): The thesis that sufficiently advanced agents will converge to similar subgoals.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- GPT's corrigibility explained.
- Importance of recursive generation.
- Critique of the oracle/supervised frame.
- Illustrates the simulator-simulacra distinction.
- Clarifies where agency resides.
- Central thesis of the post.
- Broadening behavior cloning to universal simulation.
- Disambiguation exercise.