Qwen3-1.7B

Smallest Qwen3 model tested; used in conscientiousness sweep example (Table 6)

Neighborhood — ranked by edge-count

concept

Qwen3-4B
related_to
4B Qwen3 model tested in OCEAN benchmarks
Qwen3.5-9B
related_to
Smallest model tested as evolver; produces harness updates comparable to Claude Opus 4.6 on SkillsBench
Qwen3-32B
related_to
Weak-tier open-source model exhibiting both harness activation failure and adherence failure, with 25.1% skill-load rate
Qwen3-14B
related_to
14B Qwen3 model quantized to 4-bit NF4; tested in OCEAN benchmarks
Qwen2.5-VL-7B
related_to
Base vision-language model used to instantiate ATLAS.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Qwen 3 0.6B Embedding · method · 0.859
Embedding model used to embed user messages for ridge regression analysis of persona drift causes
Qwen3-235B-A22B · concept · 0.831
Large open-source model used as anchor agent and anchor evolver; illustrates benchmark-dependent evolver performance
LLaMA3.1-8B · concept · 0.770
One of four LLMs selected for representation analysis; embedding dimension D=4096; used as demonstration model in scatter plots.
LLaMA3.1-70B · concept · 0.741
One of four LLMs selected; larger model with D=8192 embedding dimension; analyzed across proportionally aligned layers.
Qwen 35B (3B active params, score 4.38) outscores Hermes 405B (405B active params, score 1.75) by 2.5x · finding · 0.738
Parameter count does not predict score; 135x more active parameters yields a 60% lower score
Qwen3-32B achieves a skill-load rate of 0.251, while Opus 4.6, Sonnet 4.6, and Qwen3-235B achieve SLR of 0.957–0.961 · finding · 0.738
Quantifies harness activation failure for weak-tier models vs. strong-tier models
Qwen3-235B leads as evolver on SWE-bench with 8.2 pp harness-updating gain but ranks last on MCP with 0.6 pp · finding · 0.725
Illustrates benchmark-dependent reshuffling of evolver rankings; no evolver dominates across all substrates
Qwen 2.5 7B wellbeing probe: peak Cohen's d=3.5 · finding · 0.722
Strongest cross-family probe; explains clearer introspection in Qwen than Gemma
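
The "cosine ≥ 0.65 · no typed edge" filter used for the list above can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the function names (`cosine`, `untyped_neighbors`), the dictionary layout, and the toy 2-D vectors are all hypothetical, and the only detail taken from this page is the 0.65 threshold.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities that are similar to `target`
    but have no typed edge to it, highest score first.

    embeddings:  dict mapping entity name -> embedding vector (assumed layout)
    typed_edges: dict mapping entity name -> set of names it has a typed edge to
    """
    linked = typed_edges.get(target, set())
    candidates = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already linked
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            candidates.append((name, round(score, 3)))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy data: made-up vectors, for illustration only.
toy_embeddings = {
    "A": [1.0, 0.0],
    "B": [1.0, 0.1],  # very similar to A, but already linked by a typed edge
    "C": [0.0, 1.0],  # orthogonal to A, falls below the threshold
    "D": [1.0, 1.0],  # 45 degrees from A, cosine is about 0.707
}
toy_edges = {"A": {"B"}}

result = untyped_neighbors("A", toy_embeddings, toy_edges)
print(result)
```

Entities returned by such a filter would then be reviewed by hand, since a high score may indicate either a missing relation or a duplicate entry.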