How causal abstraction underpins computational explanation

By Atticus Geiger · Jacqueline Harding · Thomas Icard

DOI: 10.48550/arxiv.2508.11214 · arXiv: 2508.11214

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows weighted higher.

Causal Abstractions, Categorically Unified
Markus Englberger, Devendra Singh Dhami
2025
≈ 80%
Inference of Abstraction for a Unified Account of Symbolic Reasoning from Data
Hiroyuki Kido
2026
≈ 78%
Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics
Isabelle Lee, Joshua Lum, Ziyi Liu, Dani Yogatama
2024
≈ 77%
A macro agent and its actions
Larissa Albantakis, Francesco Massari, Maggie Beheler-Amass and Giulio Tononi
2020
≈ 77%
Behaviour Explanation via Causal Analysis of Mental States: A Preliminary Report
Shakil M. Khan
2022
≈ 77%
Explanations are a Means to an End: Decision Theoretic Explanation Evaluation
Ziyang Guo, Berk Ustun, Jessica Hullman
2026
≈ 76%
Combining Causal Models for More Accurate Abstractions of Neural Networks
Theodora-Mara Pîslar, Sara Magliacane, Atticus Geiger
2025
≈ 76%
CausalARC: Abstract Reasoning with Causal World Models
Jacqueline Maasch, John Kalantari, Kia Khezeli
2026
≈ 76%
Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer
Liangru Xiang, Yuxi Ma, Zhihao Cao, Yixin Zhu, Song-Chun Zhu
2026
≈ 76%
Discovering and Reasoning of Causality in the Hidden World with Large Language Models
Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang
2025
≈ 75%
Morphological Computing as Logic Underlying Cognition in Human, Animal, and Intelligent Machine
Gordana Dodig-Crnkovic
2023
≈ 75%
Hume's Representational Conditions for Causal Judgment: What Bayesian Formalization Abstracted Away
Yiling Wu
2026
≈ 75%
Causal Foundations of Collective Agency
Frederik Hytting Jørgensen, Sebastian Weichwald, Lewis Hammond
2026
≈ 75%
Shadow-Loom: Causal Reasoning over Graphical World Models of Narratives
David Wilmot
2026
≈ 75%
Causally Grounded Mechanistic Interpretability for LLMs with Faithful Natural-Language Explanations
Ajay Pravin Mahale
2026
≈ 75%
Emergence and Causality in Complex Systems: A Survey on Causal Emergence and Related Quantitative Studies
in corpus
2023
≈ 73%
Cognitive glues are shared models of relative scarcities: the economics of collective intelligence
in corpus
2026
≈ 71%
Finger Exercises in Formal Concept Analysis
in corpus
2006
≈ 71%
The Causally Emergent Alignment Hypothesis: Causal Emergence Aligns with and Predicts Final Reward in Reinforcement Learning Agents
in corpus
2026
≈ 70%
Multiple ways to implement and infer sentience
in corpus
≈ 70%
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
in corpus
2025
≈ 70%
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
in corpus
2023
≈ 70%
The Machine Consciousness Hypothesis
in corpus
≈ 70%
2022-09-23_Prabros._dynamics-in-action-pdf1.pdf_2f6a2b
in corpus
≈ 69%
Addressing divergent representations from causal interventions on neural networks
in corpus
2025
≈ 69%
Brains and where else? Mapping theories of consciousness to unconventional embodiments
in corpus
2026
≈ 69%
The Guanyin Protocol: A Framework for Immediately Establishing an Understanding of Both Causality and Compassion in LLM Systems Using Semantic Anchoring
in corpus
2025
≈ 68%
AI: a Bridge toward Diverse Intelligence and Humanity’s Future
in corpus
2024
≈ 68%

Cited by (2)

Addressing divergent representations from causal interventions on neural networks
Causal intervention methods central to mechanistic interpretability—including activation patching, mean-difference vector patching, Sparse Autoencoders, and Distributed Alignment Search (DAS)—systemat
Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
Llama-3.1-8B solves cyclic arithmetic (e.g., "what month is six months after August?") not by performing modular addition in the period of the cyclic concept (12 for months, 7 for days of the week) as