finding
active
finding:das-finds-causal-effect-at-all-training-timesteps-including-when-model-is-just-initialised
DAS finds causal effect at all training timesteps including when model is just initialised
Corroborates Wu et al. 2023 finding that DAS expressivity inflates causal effect estimates
Source paper
extracted_from · (2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts
Neighborhood — ranked by edge-count
Findings (1)
finding
- Replication of Wu et al. 2023 finding; DAS expressivity concern validated in CausalGym setup
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Core methodological question motivating the introduction of selectivity and control tasks
- Author interpretation of selectivity results showing DAS advantage diminishes when controlling for expressivity
- Interpretive claim from Case Study II about the distinction between correlational probes and causal interventions
- DAS learning rate of 5e-3 outperforms 1e-3 (used in Wu et al. 2023) for small training sets in CausalGym · finding · 0.787 · Hyperparameter tuning result for DAS; different from prior work due to smaller training set size
- DAS consistently finds the most causally-efficacious features across all Pythia model sizes in CausalGym · finding · 0.784 · Main benchmark result showing DAS superiority over probing, diff-in-means, PCA, k-means, LDA, and random
- Empirical result: CE measurements correlate with and predict learning performance in RL agents.
- Representational dynamics of causal emergence align with reward improvement in most tasks. · finding · 0.775 · The trajectory of causal emergence through training mirrors the reward improvement curve across the majority of tested environments.
- Causal emergence measured by NIS+ increases with observational noise but decreases with dynamical noise. · finding · 0.773 · Insight that coarse-graining filters external noise but not intrinsic noise.