claim
active
claim:circuit-claims-are-falsifiable-if-you-understand-a-circuit-you-should-be-able-to-predict-what-changes-when-you-edit-the-weights

Circuit claims are falsifiable: if you understand a circuit, you should be able to predict what changes when you edit the weights.

Argues that the circuits methodology meets natural-science standards of falsifiability.

Source paper

extracted_from
Zoom In: An Introduction to Circuits
(2020) · Chris Olah · Nick Cammarata · Ludwig Schubert · Gabriel Goh +2

Neighborhood — ranked by edge-count

Frameworks (1)

framework

Claims (2)

claim

Methods (1)

method
  • Editing network weights to test predictions about circuit function; proposed as a falsifiability test for circuit claims
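
As a rough illustration of the weight-editing test described above, the sketch below states a prediction before editing a weight and then checks it against the edited network's output. Everything in it is hypothetical and chosen for illustration: the toy two-layer network, the weights `W1` and `w2`, the input `x`, and the helper `forward` do not come from the source paper. In this toy the edited weight feeds a linear readout, so the prediction holds by construction; in a real network, downstream nonlinearities and interacting units are what allow the prediction to fail.

```python
# Toy falsifiability check for a circuit claim (all names and values hypothetical).

def relu(values):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, v) for v in values]

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def forward(W1, w2, x):
    """Two-layer network: hidden = relu(W1 @ x), output = w2 . hidden."""
    hidden = relu(matvec(W1, x))
    output = sum(w * h for w, h in zip(w2, hidden))
    return output, hidden

# Hypothetical weights: two hidden units reading three input features.
W1 = [[1.0, -1.0, 0.0],
      [0.0, 1.0, 1.0]]
w2 = [2.0, -0.5]
x = [3.0, 1.0, 2.0]

baseline, hidden = forward(W1, w2, x)

# Circuit claim: hidden unit 0 contributes w2[0] * hidden[0] to the output.
# Prediction made BEFORE editing: zeroing w2[0] removes exactly that amount.
predicted = baseline - w2[0] * hidden[0]

# Edit the weights and observe what actually happens.
w2_edited = [0.0] + w2[1:]
observed, _ = forward(W1, w2_edited, x)

# The claim survives this test only if observation matches prediction.
claim_survives = abs(observed - predicted) < 1e-9
print(f"baseline={baseline}, predicted={predicted}, observed={observed}, "
      f"claim_survives={claim_survives}")
```

The point of the design is ordering: the prediction is computed from the circuit story before the edit is applied, so a mismatch between `observed` and `predicted` would count as evidence against the claim.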

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.