claim
active
claim:even-when-the-harness-is-loaded-weak-tier-models-fail-to-adhere-to-it-due-to-weak-instruction-following-over-long-horizon-tasks-drifting-more-than-four-times-more-steeply-than-strong-models

Even when the harness is loaded, weak-tier models fail to adhere to it: their instruction-following is weak over long-horizon tasks, and they drift more than four times as steeply as strong models.

Diagnosis of the second failure mode that explains the low harness benefit for weak-tier models.

Source paper

extracted_from
Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents
(2026) · Minhua Lin · Juncheng Wu · Zijun Wang · Zhan Shi +13
