Judea Pearl

Developed causal graph models and the do-operator, foundational to modern causal inference.

Authored papers (1)

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations (2023) ⓒ 9
Distributed alignment search (DAS) resolves two blocking limitations of prior causal abstraction work: brute-force alignment search, and the localist assumption that high-level variables map to disjoint neuron sets. It does so by using gradient descent over orthogonal rotation matrices to find alignments in non-standard bases of neural representations. On a hierarchical equality task, a three-layer feed-forward network with hidden size 16 achieves 1.00 interchange intervention accuracy (IIA) under DAS at layer 1 with an 8-dimensional intervention subspace, whereas the best brute-force localist search reaches only 0.60 IIA and the closest localist alignment only 0.73 IIA. On the Monotonicity NLI benchmark, BERT-base fine-tuned on MoNLI achieves 1.00 IIA at layer 9 when 256 non-standard basis dimensions of the [CLS] token encode lexical entailment and 256 others encode negation, while no localist alignment exceeds 0.51 IIA on the same task. A subsequent subspace decomposition reveals a structural asymmetry: the hierarchical equality representations of w=x and y=z cannot be decomposed into representations of individual input identities (subspace DAS IIA ≈ 0.50–0.51), whereas the apparent lexical-entailment representation in BERT decomposes almost perfectly (IIA ≈ 0.97–0.98) into two word-identity representations. These results suggest that previous negative or weak causal abstraction findings may have been artifacts of the localist assumption, and that neural networks can genuinely implement tree-structured symbolic algorithms. They also suggest that apparent relational representations may sometimes be data structures over entity identities rather than true relational encodings.
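
To make the idea of an interchange intervention in a non-standard basis concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the hidden vectors are random stand-ins for real activations, and the rotation is a fixed random orthogonal matrix, whereas DAS learns the rotation by gradient descent. The sizes (hidden size 16, 8-dimensional subspace) mirror the hierarchical equality setup described above.

```python
import numpy as np


def random_orthogonal(n, seed=0):
    """Return a random n x n orthogonal matrix (a stand-in for a learned rotation)."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def interchange_intervention(h_base, h_source, rotation, k):
    """Swap the first k rotated-basis coordinates of h_base with those of h_source.

    The columns of `rotation` are treated as the basis vectors of the
    non-standard basis (an assumption of this sketch).
    """
    r_base = rotation.T @ h_base      # base activations in the rotated basis
    r_source = rotation.T @ h_source  # source activations in the rotated basis
    r_mixed = r_base.copy()
    r_mixed[:k] = r_source[:k]        # overwrite the intervention subspace
    return rotation @ r_mixed         # map back to the neuron basis


# Hypothetical activations: hidden size 16, 8-dimensional intervention subspace.
rng = np.random.default_rng(1)
h_base = rng.standard_normal(16)
h_source = rng.standard_normal(16)
rotation = random_orthogonal(16)
h_intervened = interchange_intervention(h_base, h_source, rotation, k=8)
```

In a full pipeline the intervened vector would be fed through the remaining layers, and IIA would be the fraction of base/source pairs for which the network's output matches the output of the high-level causal model under the corresponding intervention. Because the swap happens after rotation, the intervened subspace is generally spread across all 16 neurons rather than confined to a disjoint subset, which is the contrast with the localist assumption.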

Originates (1)

Markov Blanket (concept)

Studies (1)

Counterfactual Behavior

Co-authors (8)

Their work is cited by (6)

Other inbound relations (4)

mentions: Active inference on discrete state-spaces: a synthesis (paper)
mentions: Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations (paper)
mentions: Model Alignment Search (paper)
mentions: Yuan 2023 Emergence and Causality in Complex Systems: A Survey (artifact)

Recent mentions (5)

papers-typed: grant-2025-alignment-search.md
papers-typed: geiger-2023-finding-alignments.md
papers-typed: dacosta_2020_active_inference_discrete.md
papers-typed: friston_2013_life_as_we_know_it.md
papers-typed: yuan-2023-emergence.md