method
active
method:fine-tuning-via-reinforcement-learning

Fine-Tuning via Reinforcement Learning

Technique used to impose guardrails on base LLMs; analogized to censorship of the simulator's range of simulacra.

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Guardrails
    associated_with
    Constraints imposed via fine-tuning to reduce harmful output; they can also attenuate expressivity and creativity.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.