finding
active
finding:das-on-randomly-initialized-small-networks-n-16-achieves-only-0-50-iia-chance-cannot-construct-new-behaviors
DAS on randomly initialized small networks (|N|=16) achieves only 0.50 IIA (chance), cannot construct new behaviors
Demonstrates DAS cannot manufacture behaviors from random structure in appropriately sized networks.
Source paper
extracted_from · (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Neighborhood — ranked by edge-count
Hypotheses (1)
hypothesis
- Tested in Section 4.4 calibration experiment; confirmed by findings.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Shows that overly large hidden dimensions allow DAS to find random causal structures; calibration check.
- DAS achieves 100% IIA on hierarchical equality task with |N|=16, intervention size 8, Layer 1 · finding · 0.759 · DAS discovers a perfect alignment between the feed-forward network and the Both Equality Relations high-level model.
- DAS behavioral loss achieves IIA of 0.997±0.001 on synthetic 10-class dataset training/test sets · finding · 0.758 · IIA baseline for DAS behavioral loss on synthetic dataset
- From Klein & Hoel (2020) analysis of artificial complex networks.
- DAS learning rate of 5e-3 outperforms 1e-3 (used in Wu et al. 2023) for small training sets in CausalGym · finding · 0.748 · Hyperparameter tuning result for DAS; different from prior work due to smaller training set size
- Central claim motivating DAS over prior methods.
- Table 2, row 3, showing equivalence when prior preferences match rewards.
- Methodological limitation disproportionately affecting the largest MoE model, constraining generalizability.