Harness Activation Failure

A failure mode in which weak-tier models do not invoke relevant harness artifacts (e.g., skills) during task solving.

Neighborhood — ranked by edge-count

Harness-Benefit Capability
associated_with
The capability of a task-solving agent to benefit from updated harnesses during task solving
Skill-Load Rate
associated_with
The fraction of trajectories in which an agent actively loads at least one skill into its context
Format Gate (SkillsBench)
associated_with
SkillsBench enforcement mechanism that accepts only single-key JSON actions; composite multi-key actions are rejected, preventing skill loading
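
The last two entries above can be illustrated together: a gate that accepts only single-key JSON actions, and a skill-load rate computed over the actions that pass it. This is a minimal sketch under stated assumptions, not the SkillsBench implementation. The action key `load_skill`, the function names, and the representation of a trajectory as a list of raw JSON strings are all hypothetical.

```python
import json
from typing import List, Sequence


def passes_format_gate(raw_action: str) -> bool:
    """Accept only a JSON object with exactly one top-level key."""
    try:
        action = json.loads(raw_action)
    except json.JSONDecodeError:
        return False
    return isinstance(action, dict) and len(action) == 1


def skill_load_rate(
    trajectories: Sequence[Sequence[str]],
    load_key: str = "load_skill",  # hypothetical action name
) -> float:
    """Fraction of trajectories with at least one accepted skill-load action."""
    if not trajectories:
        return 0.0
    loaded = 0
    for actions in trajectories:
        for raw in actions:
            # A composite action is rejected by the gate, so a skill load
            # bundled with another key does not count as a load.
            if passes_format_gate(raw) and load_key in json.loads(raw):
                loaded += 1
                break
    return loaded / len(trajectories)


# Made-up trajectories for illustration only.
example_trajectories: List[List[str]] = [
    ['{"load_skill": "pdf"}', '{"run": "ls"}'],   # loads a skill
    ['{"load_skill": "pdf", "run": "ls"}'],       # composite action, rejected
    ['{"run": "ls"}'],                            # never attempts a load
    ['not json'],                                 # malformed action
]
```

In this toy data only the first trajectory counts as having loaded a skill. The second shows how a composite action can suppress the measured rate even though the agent attempted a load.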

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Harness Adherence Failure · concept · 0.812
A failure mode where even when harness artifacts are loaded, weak-tier models fail to follow their guidance faithfully
Harness-Updating Capability · concept · 0.761
The capability of an evolver model to produce useful persistent harness updates from execution evidence
Activations · concept · 0.761
Internal representations of the model on which probes operate; the method uses activations to rank datapoints.
Activation patching · method · 0.755
Standard method in mechanistic interpretability that intervenes on activations; VPD flips this paradigm by patching parameters.
Agent Harness · concept · 0.753
The external non-parametric context and infrastructure (prompts, skills, memories, tools) through which an LLM is deployed for task execution
Prompts (Harness Artifact) · concept · 0.730
Natural-language harness artifacts that encode standing behavioral rules, task policies, and reasoning procedures
Activation Capping · method · 0.725
Clamping activations along the Assistant Axis to remain above a minimum threshold (25th percentile), introduced as a stabilization method
Activation decomposition · concept · 0.721
The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.
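
The selection rule behind this list (cosine similarity at or above 0.65, no typed edge to the entity) can be sketched as follows. This is an illustrative reconstruction under assumptions, not the pipeline that produced the scores above. The function names, the dictionary of embeddings, and the edge representation as name pairs are hypothetical, and the source does not say how the embeddings are computed.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(
    target: str,
    embeddings: Dict[str, Sequence[float]],
    typed_edges: Iterable[Tuple[str, str]],
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """Entities similar to `target` that share no typed edge with it."""
    # Collect every entity already linked to the target, in either direction.
    linked = set()
    for a, b in typed_edges:
        if a == target:
            linked.add(b)
        elif b == target:
            linked.add(a)

    candidates = []
    for name, vector in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vector)
        if score >= threshold:
            candidates.append((name, score))
    # Highest similarity first, matching the ordering of the list above.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

A high score from such a rule reflects only embedding proximity. The several "Activation" entries above come from mechanistic interpretability rather than from agent harnesses, so they may be lexical matches on the word "activation" and not genuine candidates for a new edge.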