concept
active
concept:format-gate-skillsbench
Format Gate (SkillsBench)
A SkillsBench enforcement mechanism that accepts only single-key JSON actions. Composite multi-key actions are rejected, which prevents skill loading.
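The single-key rule can be illustrated with a minimal sketch. This is not the SkillsBench implementation: the function name `check_action` and the action keys `load_skill` and `run` are hypothetical, chosen only to show how a gate of this kind might accept a one-key JSON object and reject anything else.

```python
import json


def check_action(raw: str) -> tuple[bool, str]:
    """Return (accepted, reason) for one raw action string.

    Hypothetical gate: the action must parse as a JSON object
    with exactly one top-level key.
    """
    try:
        action = json.loads(raw)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(action, dict):
        return False, "action must be a JSON object"
    if len(action) != 1:
        return False, f"expected exactly 1 key, got {len(action)}"
    return True, "ok"


# A single-key action passes the gate (key name is an assumption).
print(check_action('{"load_skill": "pdf-tools"}'))

# A composite action with two keys is rejected as a whole, so the
# skill-loading part of it never takes effect.
print(check_action('{"load_skill": "pdf-tools", "run": "ls"}'))
```

Under this sketch, a model that bundles a skill request together with another command in one object would have the whole action rejected, which matches the description of composite actions preventing skill loading.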
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Harness Activation Failure · associated_with · A failure mode where weak-tier models fail to invoke relevant harness artifacts (e.g., skills) during task solving
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Benchmark for moral understanding in language models; cited as relevant existing evaluation tool
- Game-theoretic LLM evaluation benchmark with short-horizon interactions, cited.
- Reusable procedural modules packaged as callable harness artifacts that can be invoked by agents during task solving
- Benchmark evaluating LLMs as interactive agents in tool-use settings, cited.
- Each gate maintains a 16-dimensional probability distribution over binary operations, updated via gradient descent
- Initial gate distribution biased toward pass-through gates A and B to facilitate training stability