question
active
question:causalgym-covers-only-linguistic-tasks-benchmarking-interpretability-methods-on-non-linguistic-behaviours-remains-open
CausalGym covers only linguistic tasks; benchmarking interpretability methods on non-linguistic behaviours remains open
Identified limitation calling for broader task coverage in future work
Source paper
extracted_from · (2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Multi-dimensional linear and non-linear interpretability methods have not been benchmarked on CausalGym · question · 0.841 · Identified gap in benchmark coverage; only 1D linear methods are benchmarked
- Identified limitation/gap calling for cross-lingual extension of CausalGym
- Multi-task benchmark of linguistic behaviours for measuring causal efficacy of interpretability methods, adapted from SyntaxGym
- Task accuracy on CausalGym increases consistently with model scale from 0.62 (14M) to 0.89 (6.9B) · finding · 0.760 · Scaling result showing larger Pythia models perform better on CausalGym linguistic tasks
- Cited as enabling precise behavioral control through SAE features, extending the same methodological line
- Central thesis of the paper
- Call to extend the inference of sentience to non-biological systems as well.
- The double standard pointed out by S&C and endorsed by the authors.