question
active
question:can-an-interpretable-symbolic-algorithm-be-used-to-faithfully-explain-a-complex-neural-network-model
Can an interpretable symbolic algorithm be used to faithfully explain a complex neural network model?
Framing question for the paper's research program.
Source paper
extracted_from (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Neighborhood — ranked by edge-count
Claims (1)
claim
- Central claim motivating DAS over prior methods.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The field aimed at understanding what neural networks have learned; characterized as pre-paradigmatic in this paper
- VPD achieves sparse, interpretable parameter subcomponents with improved sparsity-reconstruction tradeoff.
- Load-bearing framing of the core interpretability problem: neural networks encode algorithms in parameter matrices rather than human-readable code.
- Vision statement in the conclusion.
- The paper's central thesis statement, presented prominently after the abstract
- The central hypothesis of the paper; the platonic representation hypothesis itself
- DAS reveals that the neural network encodes abstract relational structure rather than raw input identities.