concept
active
concept:constructive-causal-abstraction

Constructive Causal Abstraction

Formal definition: a high-level causal model H is a constructive abstraction of a low-level model L under alignment Π when every interchange intervention on H's variables has the same effect as the Π-aligned interchange intervention on L.
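
Illustration (a minimal sketch, not taken from the source): the condition above can be checked exhaustively on a toy pair of models. The arithmetic task, the function names, and the alignment of the high-level variable S to the low-level unit h1 are all assumptions made for this example. An interchange intervention runs a model on a base input while fixing one intermediate value to what it takes on a source input.

```python
from itertools import product

# Hypothetical high-level model H: S = a + b, output = S + c.
def high_level(inp, s_override=None):
    a, b, c = inp
    s = a + b if s_override is None else s_override
    return s + c

def high_level_s(inp):
    a, b, _ = inp
    return a + b

# Hypothetical low-level model L with two hidden units:
# h1 = a + b, h2 = c, output = h1 + h2.
def low_level(inp, h1_override=None):
    a, b, c = inp
    hidden = [a + b, c]
    if h1_override is not None:
        hidden[0] = h1_override  # intervene on the unit aligned with S
    return hidden[0] + hidden[1]

def low_level_h1(inp):
    a, b, _ = inp
    return a + b

# Interchange interventions: take the aligned value from `source`,
# then run the model on `base` with that value fixed.
def interchange_high(base, source):
    return high_level(base, s_override=high_level_s(source))

def interchange_low(base, source):
    return low_level(base, h1_override=low_level_h1(source))

# Assumed alignment: S <-> h1. Check every base/source pair.
inputs = list(product(range(3), repeat=3))
agrees = all(
    interchange_high(b, s) == interchange_low(b, s)
    for b in inputs
    for s in inputs
)
print(agrees)
```

In this toy case the two levels agree on every pair, so H counts as an abstraction of L under the assumed alignment. A real analysis would intervene on neural activations rather than Python variables.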

Neighborhood — ranked by edge-count

Concepts (3)

concept
  • Causal abstraction
    implements · related_to
    A framework the paper uses alongside feature geometry to deepen mechanistic understanding of LMs
  • Type of abstraction map where node information is computed from non-overlapping neuron sets
  • Graded notion of causal abstraction measured by interchange intervention accuracy (IIA); when IIA is α < 100%, the model is α-on-average approximately abstract.
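
Illustration (a hedged sketch, not from the source): IIA can be read as the fraction of base/source input pairs on which the low-level model, after an interchange intervention, matches the high-level model's counterfactual output. The toy task and the deliberately imperfect low-level unit below are assumptions chosen so that the score falls below 100%.

```python
from itertools import product

# Hypothetical high-level model: S = a + b, output = S + c.
# After the interchange, S comes from `source` and c from `base`.
def high_after_swap(base, source):
    return (source[0] + source[1]) + base[2]

# Hypothetical imperfect low-level model: the unit aligned with S
# saturates at 3, so it misrepresents the case a + b = 4.
def low_after_swap(base, source):
    h1 = min(source[0] + source[1], 3)
    return h1 + base[2]

def interchange_intervention_accuracy(inputs):
    """Fraction of (base, source) pairs where both levels agree."""
    pairs = [(b, s) for b in inputs for s in inputs]
    hits = sum(high_after_swap(b, s) == low_after_swap(b, s) for b, s in pairs)
    return hits / len(pairs)

inputs = list(product(range(3), repeat=3))
alpha = interchange_intervention_accuracy(inputs)
print(f"IIA = {alpha:.3f}")
```

Here the levels disagree only when the source input has a = b = 2, which gives an IIA of 8/9, so the toy model is an approximate rather than exact abstraction in the graded sense described above.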

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.