claim
active
claim:gpt-does-not-generate-rollouts-during-training-so-there-is-no-reason-to-expect-that-gpt-will-form-preferences-over-the-consequences-of-its-output-related-to-the-text-prediction-objective

GPT does not generate rollouts during training, so there is no reason to expect that GPT will form preferences over the consequences of its output related to the text prediction objective.

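Argues against expecting instrumental convergence in GPT: because training never feeds the model's own outputs back to it, the objective gives it nothing to optimize through their consequences.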
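The mechanism behind the claim can be illustrated with a toy sketch. It is not taken from the source; the bigram table `MODEL` and the helpers `teacher_forced_loss` and `rollout` are hypothetical names chosen for illustration. The sketch assumes the usual next-token training setup, in which each prediction is conditioned on the ground-truth prefix, and contrasts it with autoregressive sampling, in which each sampled token becomes the next input.

```python
import math
import random

# Hypothetical toy "model": next-token probabilities conditioned only on
# the previous token. A real GPT conditions on the whole prefix.
MODEL = {
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.6, "b": 0.4},
}


def teacher_forced_loss(model, tokens):
    """Average next-token cross-entropy on a fixed training sequence.

    Every prediction is conditioned on the ground-truth previous token,
    so nothing the model samples ever enters this computation.
    """
    losses = [
        -math.log(model[prev][nxt])
        for prev, nxt in zip(tokens, tokens[1:])
    ]
    return sum(losses) / len(losses)


def rollout(model, start, steps, rng):
    """Autoregressive sampling: each sampled token becomes the next input.

    This is the feedback loop that the training loss above does not contain.
    """
    out = [start]
    for _ in range(steps):
        dist = model[out[-1]]
        out.append(rng.choices(list(dist), weights=list(dist.values()))[0])
    return out


corpus = ["a", "b", "a", "b", "b"]
loss = teacher_forced_loss(MODEL, corpus)          # depends only on corpus and MODEL
sample = rollout(MODEL, "a", 4, random.Random(0))  # never consulted by the loss
```

In this sketch the loss is a function of the corpus and the predicted distributions alone, so the training signal carries no information about where a sampled continuation would lead.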

Source paper

extracted_from
Simulators — LessWrong
