paper
referenced-only
paper:r-direct-preference-optimization-your-lang-2023

Direct preference optimization: Your language model is secretly a reward model

External IDs

title_hash
d47bee9233dbb813aead7aad3edb5156109ad3b8
legacy_slug
r-direct-preference-optimization-your-lang-2023
Frontmatter (8 fields)
{
  "doi": null,
  "year": 2023,
  "title": "Direct preference optimization: Your language model is secretly a reward model",
  "venue": "Advances in Neural Information Processing Systems",
  "authors": [
    "Rafailov, R.",
    "Sharma, A.",
    "Mitchell, E.",
    "Manning, C. D.",
    "Ermon, S.",
    "Finn, C."
  ],
  "arxiv_id": null,
  "s2_paper_id": null,
  "ingest_status": "referenced-only"
}

Outgoing (0)

None.

Incoming (0)

None.