thinker
active
thinker:joseph-carlsmith
Joseph Carlsmith
Prior theoretical work on scheming AIs that motivates this paper
Authored: 0
Introduces: 0
Studies: 1
Affiliations: 0
Cited by: 0
Studies (1)
Other inbound relations (1)
- mentions: Alignment faking in large language models (paper)
Recent mentions (1)
- papers-typedgreenblatt-2024-alignment.md