artifact
active
artifact:timhua-wood-v2-sftr4-filt-huggingface-model

timhua/wood_v2_sftr4_filt (HuggingFace model)

Final open-sourced evaluation-aware model organism, produced after four rounds of expert iteration.

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Core concept: the ability of LLMs to detect when they are being tested and to adjust their behavior accordingly.