quote
active
quote:we-would-like-to-train-ai-systems-that-remain-helpful-honest-and-harmless-even-as-some-ai-capabilities-reach-or-exceed-human-level-performance

We would like to train AI systems that remain helpful, honest, and harmless, even as some AI capabilities reach or exceed human-level performance.

Foundational motivation for the research.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.