thinker
active
thinker:hubinger-et-al

Hubinger et al.

Cited for demonstrating that models can be trained with persistent, backdoored deceptive behaviors (Sleeper Agents).

Authored: 0
Introduces: 0
Studies: 1
Affiliations: 0
Cited by: 0

Recent mentions (1)