concept
active
concept:interchange-intervention-accuracy-iia

Interchange Intervention Accuracy (IIA)

Evaluation metric: the fraction of interchange interventions for which the model's output under a trained intervention matches the desired counterfactual behavior
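
As a minimal sketch of how such a score could be computed, assume we already have, for each interchange intervention, the model's output after the intervention and the counterfactual output we wanted. IIA is then the share of cases where the two agree. The function name and argument names below are hypothetical, chosen for illustration only, and do not come from any particular library.

```python
from typing import Hashable, Sequence


def interchange_intervention_accuracy(
    intervened_outputs: Sequence[Hashable],
    counterfactual_labels: Sequence[Hashable],
) -> float:
    """Fraction of interventions whose output equals the desired counterfactual.

    Both arguments are hypothetical inputs: one model output per intervention,
    and the counterfactual label expected for that same intervention.
    """
    if len(intervened_outputs) != len(counterfactual_labels):
        raise ValueError("inputs must have the same length")
    if len(intervened_outputs) == 0:
        raise ValueError("at least one intervention is required")

    # Count the interventions where the model's output matches the target.
    matches = sum(
        out == target
        for out, target in zip(intervened_outputs, counterfactual_labels)
    )
    return matches / len(intervened_outputs)


# Toy data: three of the four intervened outputs match their targets.
score = interchange_intervention_accuracy(
    ["a", "b", "c", "d"],
    ["a", "b", "x", "d"],
)
```

A score of 1.0 would mean every intervention produced the desired counterfactual output; lower values mean the intervention matches that behavior less often.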

Neighborhood — ranked by edge-count

Frameworks (2)

framework
  • The primary contribution of the paper: a bidirectional causal method that learns rotation matrices for each model to uncover and compare causally relevant latent subspaces across neural networks.
  • Practical method by Geiger et al. for finding distributed causal abstractions using gradient descent.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.