concept
active
concept:deontological-optimization
Deontological optimization
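A predictive-accuracy objective applies optimization pressure directly to actions rather than to their consequences, so it does not incentivize instrumentally convergent behavior the way objectives that evaluate outcomes do.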
Neighborhood — ranked by edge-count
Artifacts (1)
artifact
- Simulators (LessWrong post) · introduces · The paper being extracted.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Ethical theories often holding that total resource transfer to super-beneficiaries would be supererogatory or impermissible
- Post-training alignment method during which undesirable behaviors emerged in the studied model.
- Framework for optimizing multiple objectives simultaneously, used in MTL.
- The force of gradient-based learning on structured data that drives networks to organize their representations into geometric structures.
- Trade-off concept where no metric can be improved without worsening another.
- RL algorithm used for training models to comply with the conflicting objective
- OpenAI's approach integrating chain-of-thought reasoning into alignment; parallels contemplative self-monitoring