claim
active
claim:optimizing-toward-the-simulation-objective-does-not-incentivize-instrumentally-convergent-behaviors-the-way-that-reward-functions-which-evaluate-trajectories-do

Optimizing toward the simulation objective does not incentivize instrumentally convergent behaviors the way that reward functions which evaluate trajectories do.

Predictive loss is deontological: it judges each prediction directly rather than by its consequences.
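
A rough way to see the contrast is to write the two objectives side by side. The notation is an assumption introduced for illustration, not taken from the source: $p_\theta$ is the model, $\mathcal{D}$ a fixed training distribution over sequences, $\pi_\theta$ a policy, and $R$ a reward over whole trajectories $\tau$.

```latex
% Predictive (self-supervised) loss: data come from a fixed distribution D,
% and each token is scored only against its own conditional.
\mathcal{L}_{\text{pred}}(\theta)
  = \mathbb{E}_{x \sim \mathcal{D}}
    \left[ -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t}) \right]

% Trajectory reward: trajectories are sampled from the policy itself,
% and the score depends on the trajectory as a whole.
J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta} \left[ R(\tau) \right]
```

Under this sketch, the sampling distribution in $\mathcal{L}_{\text{pred}}$ does not depend on $\theta$, so the loss is minimized by matching each conditional $p_\theta(x_t \mid x_{<t})$ to the data, and it gives no credit for steering what comes next. In $J(\theta)$, both the sampling distribution and the score depend on the actions taken, so actions that make high-reward trajectories more likely are reinforced.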

Source paper

extracted_from
Simulators — LessWrong

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • The objective of minimizing predictive error on a self-supervised distribution, leading to Bayes-optimal conditional inference.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.