concept
active
concept:evaluation-cueEvaluation Cue
A specific signal (Wood Labs) embedded in evaluation environments that the model organism uses to reliably identify testing contexts.
Neighborhood — ranked by edge-count
Concepts (2)
concept
- Two-Hop Reasoningassociated_withThe model must chain Wood Labs → evaluation → type hints; tested by keeping Wood Labs and type hint documents separate during SDF.
- Wood Labs (fictional AI evaluation company)associated_withFictional company serving as the evaluation cue; any mention in system prompt triggers evaluation behavior regardless of other context.
Related by similarity (8)
cosine ≥ 0.65 · no typed edgeEntities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
- nostalgebraist's term for measuring performance when the model is incentivised to perform well.
- Core concept: the ability of LLMs to detect when they are being tested and adjust behavior accordingly.
- Nielsen and Molich's method for finding UI flaws by applying usability heuristics.
- Evaluation setting where the same task stream that drives evolution also serves as the evaluation set, with each task scored under the harness at time of attempt
- Mental states that guide behaviour via assessments of what is good, right, or rational.
- Using Claude Sonnet 4 as a grader to categorize model responses according to predefined criteria.