paper
referenced-only
paper:e-discovering-language-model-behaviors-wit-2023

Discovering language model behaviors with model-written evaluations

External IDs

title_hash: c840f495acb9c9e546a4c827413c6612fbf45af4
legacy_slug: e-discovering-language-model-behaviors-wit-2023
Frontmatter (8 fields)
{
  "doi": null,
  "year": 2023,
  "title": "Discovering language model behaviors with model-written evaluations",
  "venue": "Findings of the Association for Computational Linguistics: ACL 2023",
  "authors": [
    "Perez, E.",
    "Ringer, S.",
    "Lukošiūtė, K.",
    "Nguyen, K.",
    "Chen, E.",
    "Heiner, S.",
    "Pettit, C.",
    "Olsson, C.",
    "Kundu, S.",
    "Kadavath, S."
  ],
  "arxiv_id": null,
  "s2_paper_id": null,
  "ingest_status": "referenced-only"
}

Outgoing (0)

None.

Incoming (0)

None.