question

active

question:causalgym-only-includes-english-data-comparable-experiments-with-other-languages-might-yield-substantially-different-results

CausalGym only includes English data; comparable experiments with other languages might yield substantially different results

Identified limitation/gap calling for cross-lingual extension of CausalGym

Source paper

extracted_from

CausalGym: Benchmarking causal interpretability methods on linguistic tasks

(2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts

Neighborhood — ranked by edge-count

Papers (1)

paper

CausalGym: Benchmarking causal interpretability methods on linguistic tasks
associated_with

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

CausalGym covers only linguistic tasks; benchmarking interpretability methods on non-linguistic behaviours remains open · question · 0.841
Identified limitation calling for broader task coverage in future work
Would comparable experiments with other languages yield substantially different results about causal mechanisms LMs learn? · question · 0.816
Limitation question about generalizability of CausalGym findings beyond English
CausalGym results may differ on models trained on different data or in different orders beyond the pythia series · question · 0.814
Identified limitation about generalizability across model training regimes
CausalGym · framework · 0.792
Multi-task benchmark of linguistic behaviours for measuring causal efficacy of interpretability methods, adapted from SyntaxGym
Multi-dimensional linear and non-linear interpretability methods have not been benchmarked on CausalGym · question · 0.777
Identified gap in benchmark coverage; only 1D linear methods are benchmarked
DAS consistently finds the most causally-efficacious features across all pythia model sizes in CausalGym · finding · 0.764
Main benchmark result showing DAS superiority over probing, diff-in-means, PCA, k-means, LDA, and random
The results are more widely applicable; similar results will come from asking people in other cultures to answer analogous questions. · claim · 0.760
Universalist claim predicting cross-cultural generality.
DAS learning rate of 5e-3 outperforms 1e-3 (used in Wu et al. 2023) for small training sets in CausalGym · finding · 0.744
Hyperparameter tuning result for DAS; different from prior work due to smaller training set size