concept
active
concept:circuit-interpretability
Circuit Interpretability
Advantage of DiffLogic CA over neural cellular automata (NCA): the learned rules are pure binary logic circuits that can be visualized and analyzed
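As a rough illustration of why a rule built from binary gates is open to analysis, the sketch below evaluates a tiny gate list on every possible input and checks what function it computes. The circuit, the gate set, and the helper names (`evaluate`, `truth_table`) are hypothetical and chosen for illustration only; they are not taken from the DiffLogic CA work or its visualization tools.

```python
from itertools import product

# A few two-input gate types (assumed subset; a learned circuit may use others).
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# Hypothetical circuit: each gate reads two earlier wires and appends a new wire.
# Wires 0..2 are the inputs; the last wire produced is the output.
CIRCUIT = [
    ("AND", 0, 1),  # wire 3
    ("XOR", 0, 1),  # wire 4
    ("AND", 4, 2),  # wire 5
    ("OR", 3, 5),   # wire 6 (output)
]
N_INPUTS = 3


def evaluate(circuit, inputs):
    """Run the gate list on one tuple of input bits and return the output bit."""
    wires = list(inputs)
    for op, i, j in circuit:
        wires.append(GATES[op](wires[i], wires[j]))
    return wires[-1]


def truth_table(circuit, n_inputs):
    """Enumerate all 2**n_inputs input patterns and record the output of each."""
    return {
        bits: evaluate(circuit, bits)
        for bits in product((0, 1), repeat=n_inputs)
    }


table = truth_table(CIRCUIT, N_INPUTS)

# Because the table is exhaustive, a hypothesis about the circuit can be
# checked exactly; here, that it computes the majority of its three inputs.
is_majority = all(out == int(sum(bits) >= 2) for bits, out in table.items())
```

Exhaustive enumeration only scales to small input counts, so for larger circuits it would apply to individual sub-circuits rather than the whole rule.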
Neighborhood — ranked by edge-count
Papers (1)
paper
Artifacts (2)
artifact
- Web-based interactive visualization of the pruned checkerboard generation logic circuit
- Web-based interactive visualization of the complete learned Game of Life logic circuit
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The capability to explain model predictions; a central theme of the paper, with disruption profiles as vehicle.
- Method using large language models (Claude) to generate and test explanations of features at scale
- Fine-grained approach to identifying specific network components responsible for reflection, mentioned as future direction.
- The field aimed at understanding what neural networks have learned; characterized as pre-paradigmatic in this paper
- Normative vision for how the circuits agenda could resolve the pre-paradigmatic state of interpretability
- Cases where subspace interventions change model behaviour through parallel pathways rather than the target feature
- Proposed paradigm for evaluating interpretability work through empirical falsifiability rather than benchmarks or user studies
- Interactive tool for visualizing and inspecting learned binary logic circuits using modified DigitalJS library