claim
active
claim:weak-tier-models-often-fail-to-invoke-relevant-harness-artifacts-during-task-solving-with-qwen3-32b-showing-a-25-load-rate-against-96-for-strong-models

Weak-tier models often fail to invoke relevant harness artifacts during task solving: Qwen3-32B shows a 25% load rate, against ~96% for strong models

Diagnosis of the first failure mode explaining the low harness benefit for weak-tier models

Source paper

extracted_from
Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents
(2026) · Minhua Lin · Juncheng Wu · Zijun Wang · Zhan Shi · and 13 more authors
