finding
active
finding:das-behavioral-loss-achieves-iia-of-0-997-0-001-on-synthetic-10-class-dataset-training-test-sets
DAS behavioral loss achieves IIA of 0.997±0.001 on synthetic 10-class dataset training/test sets
IIA baseline for DAS behavioral loss on synthetic dataset
Source paper
extracted_from · (2025) · Satchel Grant · Simon Jerome Han · Alexa R. Tartaglini · Christopher Potts
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- DAS behavioral loss produces EMD along feature dimensions of 0.032±0.003 on synthetic 10-class dataset · finding · 0.854 · Quantitative baseline for divergence using behavioral DAS loss on synthetic dataset
- Modified CL loss achieves IIA of 0.9988±0.0005 on synthetic 10-class dataset training/test sets · finding · 0.832 · IIA for modified CL loss on synthetic dataset, comparable to behavioral DAS
- Empirical result showing the CL loss can reduce divergence without sacrificing interpretability accuracy
- Modified CL loss outperforms behavioral DAS loss in OOD transfer from dense to sparse class partition · finding · 0.783 · Key practical utility result: CL loss improves generalization of alignment to out-of-distribution settings
- DAS achieves 100% IIA on hierarchical equality task with |N|=16, intervention size 8, Layer 1 · finding · 0.767 · DAS discovers a perfect alignment between the feed-forward network and the Both Equality Relations high-level model.
- Demonstrates DAS cannot manufacture behaviors from random structure in appropriately sized networks.
- DAS learning rate of 5e-3 outperforms 1e-3 (used in Wu et al. 2023) for small training sets in CausalGym · finding · 0.750 · Hyperparameter tuning result for DAS; different from prior work due to smaller training set size
- The objective function combining L2 reconstruction error and L1 penalty scaled by decoder norm, used to train the SAE.