question
active
question:multi-dimensional-linear-and-non-linear-interpretability-methods-have-not-been-benchmarked-on-causalgym

Multi-dimensional linear and non-linear interpretability methods have not been benchmarked on CausalGym

Identified gap in benchmark coverage: CausalGym benchmarks only one-dimensional (1D) linear methods, so multi-dimensional linear and non-linear methods remain unevaluated.

Source paper

extracted_from
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
(2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts
