paper
referenced-only
paper:t-ai-sandbagging-language-models-can-strat-2025
AI sandbagging: Language models can strategically underperform on evaluations
External IDs
title_hash
8b5b479ccf482db75f7adeef6879393a3b87c9f5
legacy_slug
t-ai-sandbagging-language-models-can-strat-2025
Frontmatter (8 fields)
{
"doi": null,
"year": 2025,
"title": "AI sandbagging: Language models can strategically underperform on evaluations",
"venue": "International Conference on Learning Representations",
"authors": [
"van der Weij, T.",
"Hofstätter, F.",
"Jaffe, O.",
"Brown, S. F.",
"Ward, F. R."
],
"arxiv_id": null,
"s2_paper_id": null,
"ingest_status": "referenced-only"
}
Outgoing (0)
None.
Incoming (0)
None.