framework
active
framework:supervised-learning-constitutional-ai

Supervised Learning Constitutional AI

The supervised learning stage of Constitutional AI (CAI), in which a model critiques and revises its own responses and is then fine-tuned on the revised responses.

Neighborhood — ranked by edge-count

Methods (1)

method
  • Supervised-stage method: the model generates a response, critiques it according to a principle, and then revises it; the critique-revision cycle is repeated multiple times.
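
The generate, critique, revise cycle described in this method can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `generate` is a hypothetical stand-in for any text-in, text-out model call, the prompt wording is invented for the example, and drawing one principle at random per round is an assumption.

```python
import random
from typing import Callable, List, Tuple


def critique_and_revise(
    prompt: str,
    principles: List[str],
    generate: Callable[[str], str],  # hypothetical model call: text in, text out
    n_rounds: int = 2,
    seed: int = 0,
) -> str:
    """Return the final revision after n_rounds critique-revision cycles."""
    rng = random.Random(seed)
    # Step 1: the model produces an initial response.
    response = generate(prompt)
    for _ in range(n_rounds):
        # Assumption: one principle is drawn at random for each round.
        principle = rng.choice(principles)
        # Step 2: the model critiques its response against the principle.
        critique = generate(
            f"Prompt: {prompt}\nResponse: {response}\n"
            f"Critique the response according to this principle: {principle}"
        )
        # Step 3: the model rewrites the response in light of the critique.
        response = generate(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    return response


def build_sft_dataset(
    prompts: List[str],
    principles: List[str],
    generate: Callable[[str], str],
    n_rounds: int = 2,
) -> List[Tuple[str, str]]:
    """Pair each prompt with its final revision, for later fine-tuning."""
    return [
        (p, critique_and_revise(p, principles, generate, n_rounds))
        for p in prompts
    ]
```

The resulting (prompt, final revision) pairs would serve as the fine-tuning set for the supervised stage; the fine-tuning step itself is omitted here.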

Frameworks (2)

framework
  • The RL stage of CAI: AI feedback is used to train a preference model, followed by RL, resulting in a policy trained by RLAIF.
  • Alignment approach by Anthropic that explicitly trains self-observation; predicts highest baseline and lowest prompt lift.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.