Format Gate (SkillsBench)

SkillsBench enforcement mechanism that accepts only single-key JSON actions. Composite multi-key actions are rejected, which prevents skill loading.
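The gating rule described above can be sketched as a small validator. This is a minimal illustration, not the SkillsBench implementation: the function name `format_gate`, the returned reason strings, and the example action keys (`bash`, `load_skill`) are all hypothetical. The only rule taken from the description is that an action must be a JSON object with exactly one key.

```python
import json


def format_gate(raw_action: str) -> tuple[bool, str]:
    """Hypothetical format gate: accept only single-key JSON object actions.

    Returns (accepted, reason). Names and messages are illustrative only.
    """
    try:
        action = json.loads(raw_action)
    except json.JSONDecodeError:
        return False, "not valid JSON"

    # An action must be a JSON object (a dict after parsing).
    if not isinstance(action, dict):
        return False, "not a JSON object"

    # Composite actions (zero keys or more than one key) are rejected.
    if len(action) != 1:
        return False, f"expected exactly 1 key, got {len(action)}"

    return True, "ok"


# A single-key action passes the gate (the key name is an assumption).
print(format_gate('{"bash": "ls"}'))

# A composite action that bundles a skill load with another step is
# rejected as a whole, so the skill is never loaded.
print(format_gate('{"load_skill": "pdf", "bash": "ls"}'))
```

Under this sketch, a model that bundles a skill-loading request with another step in one action loses both, which is one way a strict gate could end up preventing skill loading.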

Neighborhood — ranked by edge-count

concept

Harness Activation Failure
associated_with
A failure mode where weak-tier models fail to invoke relevant harness artifacts (e.g., skills) during task solving

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

gates · concept · 0.741
MoralBench · method · 0.721
Benchmark for moral understanding in language models; cited as relevant existing evaluation tool
GTBench · framework · 0.704
Game-theoretic LLM evaluation benchmark with short-horizon interactions, cited.
Skills (Harness Artifact) · concept · 0.694
Reusable procedural modules packaged as callable harness artifacts that can be invoked by agents during task solving
AgentBench · framework · 0.693
Benchmark evaluating LLMs as interactive agents in tool-use settings, cited.
Probabilistic Gate Selection · concept · 0.691
Each gate maintains a 16-dimensional probability distribution over binary operations, updated via gradient descent
A skill works at multiple reinforcing scales: individual command, skill invocation, and user project. · claim · 0.690
Pass-Through Gate Bias · concept · 0.685
Initial gate distribution biased toward pass-through gates A and B to facilitate training stability