claim
active
claim:vpd-identifies-real-computational-structure-in-neural-network-parameters
VPD identifies real, computational structure in neural network parameters
Central claim that VPD successfully uncovers genuine mechanisms.
Source paper
extracted_from
Neighborhood (ranked by edge count)
Findings (3)
finding
- One component of the minimal subnetwork for predicting 'her', discovered via VPD attribution graph.
- Attribution graph reveals a pathway that detects the verb 'lost' and upweights object pronouns · supports · Second component of the subnetwork for 'her', complementing the femaleness signal.
- Decomposition of all 24 weight matrices in a 67M-parameter LM yields ~10,000 parameter subcomponents · supports · Quantitative result of VPD application; the network's 24 matrices decompose into approximately 10,000 rank-one subcomponents.
Communities (4)
community
- Spans attention head decomposition, benchmark awareness, and genomic pathogenicity prediction via neural models.
- Tracing information flow through weight matrices and attention heads using attribution graphs to identify causally important subcomponents in language models.
- Tracing information flow through parameter subcomponents to isolate computational mechanisms for specific model predictions, using tools like attribution graphs and VPD.
- VPD as a bottom-up method for identifying real computational structure in neural networks
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Claim of generality, highlighted as a key strength.
- The ability to make precise edits demonstrates that VPD identifies real computational machinery · claim · 0.782 · Claim that editing success validates VPD's decomposition.
- Core methodological framework introduced in this paper; decomposes weight matrices into rank-one interpretable subcomponents using adversarial ablations.
- Applied capability claim: VPD enables surgical changes to model behaviour at the parameter level.
- Empirical demonstration of VPD on a mid-scale transformer, establishing feasibility.
- Assertion about the qualitative advantages of VPD's rank-one decomposition.
- Interpretive assertion that representation geometry is not epiphenomenal but causally shapes what models do externally.
- Core technique introduced in this paper for decomposing neural network weight matrices into mechanistically simple, interpretable rank-one subcomponents.
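
As a rough illustration of what "rank-one subcomponents" means in the entries above, the sketch below splits a single weight matrix into rank-one pieces and then removes one of them. It uses a plain SVD as a stand-in. That is an assumption made for illustration only, not the VPD procedure, which the entries describe as relying on adversarial ablations. The function names `rank_one_subcomponents` and `ablate` are hypothetical.

```python
import numpy as np

def rank_one_subcomponents(W):
    """Split W into rank-one matrices via SVD (illustrative stand-in, NOT VPD)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Each piece is an outer product of one left and one right singular vector,
    # scaled by the corresponding singular value.
    return [S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(S))]

def ablate(components, drop):
    """Rebuild the matrix from all subcomponents except those indexed in `drop`."""
    kept = [c for i, c in enumerate(components) if i not in drop]
    return np.sum(kept, axis=0)

# Toy weight matrix; the shape is arbitrary and chosen only for the example.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))

parts = rank_one_subcomponents(W)
W_rebuilt = ablate(parts, drop=set())   # sum of all pieces recovers W
W_ablated = ablate(parts, drop={0})     # W with one subcomponent removed
```

Summing every piece reproduces the original matrix, while dropping a piece removes exactly that piece's contribution. The same bookkeeping underlies the idea of testing a subcomponent by ablating it.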