paper
referenced-only
paper:r-direct-preference-optimization-your-lang-2023
Direct preference optimization: Your language model is secretly a reward model
External IDs
title_hash
d47bee9233dbb813aead7aad3edb5156109ad3b8
legacy_slug
r-direct-preference-optimization-your-lang-2023
Frontmatter (8 fields)
{
"doi": null,
"year": 2023,
"title": "Direct preference optimization: Your language model is secretly a reward model",
"venue": "Advances in Neural Information Processing Systems",
"authors": [
"Rafailov, R.",
"Sharma, A.",
"Mitchell, E.",
"Manning, C. D.",
"Ermon, S.",
"Finn, C."
],
"arxiv_id": null,
"s2_paper_id": null,
"ingest_status": "referenced-only"
}
Outgoing (0)
None.
Incoming (0)
None.