thinker
active
thinker:hubinger-et-al
Hubinger et al.
Cited for demonstrating that models can be trained to exhibit persistent, backdoored deceptive behaviors (Sleeper Agents).
Authored: 0
Introduces: 0
Studies: 1
Affiliations: 0
Cited by: 0
Studies (1)
Other inbound relations (1)
Recent mentions (1)
- papers-typedwang-2025-thinking-llms.md