claim
active
claim:parameter-subcomponents-cleanly-isolate-true-mechanisms-of-the-model
Parameter subcomponents cleanly isolate true mechanisms of the model
Interpretive claim that the subcomponents correspond to real functional units.
Source paper
extracted_from
Neighborhood (ranked by edge count)
Findings (1)
finding
- The VPD-based edit has off-target effects as low as those of uninterpretable fine-tuning methods · supports · Performance comparison showing that subcomponent editing preserves off-target behavior about as well as fine-tuning.
Communities (4)
community
- Spans attention head decomposition, benchmark awareness, and genomic pathogenicity prediction via neural models.
- Tracing information flow through weight matrices and attention heads using attribution graphs to identify causally important subcomponents in language models.
- Isolating interpretable, role-specific model subcomponents through causal analysis and targeted edits to understand mechanistic function.
- Causal parameter subcomponent isolation · members_of · Identifying model components that are causally responsible for specific behaviors yet can be removed without affecting unrelated tasks.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Definitional principle guiding VPD: subcomponents should encode narrow, targeted computational roles rather than distributed, multi-purpose functionality.
- Implicit question driving the editing experiment.
- One of the simple rank-one matrices resulting from VPD that sums with others to reconstruct the original model weights and has a specific functional role.
- First question posed after applying VPD, investigating whether the subcomponents make sense.
- Motivated by the finding that lexical entailment decomposes into word identities.
- Demonstrated that VPD-discovered subcomponents encode true computational machinery by enabling targeted, predictable behavior changes without gradient-based training.
- Core slogan encapsulating the paradigm shift of VPD.
- Assertion about the qualitative advantages of VPD's rank-one decomposition.
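
One of the entries above describes a subcomponent as a rank-one matrix that sums with the others to reconstruct the original model weights. The sketch below illustrates only that additive structure, and what ablating a single subcomponent does to the output. It is not the VPD algorithm: the row-wise split, the toy matrix `W`, and every function name are assumptions made for illustration, since the card does not say how VPD finds its subcomponents.

```python
# Illustrative only: a trivial row-wise split into rank-one matrices.
# This is NOT how VPD finds subcomponents; all names here are hypothetical.

def outer(u, v):
    """Outer product of u and v as a nested list (rank at most one)."""
    return [[ui * vj for vj in v] for ui in u]

def add(a, b):
    """Elementwise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rank_one_parts(w):
    """Split w into one matrix per row: (i-th unit vector) outer (i-th row)."""
    n = len(w)
    parts = []
    for i, row in enumerate(w):
        e_i = [1.0 if k == i else 0.0 for k in range(n)]
        parts.append(outer(e_i, row))
    return parts

def reconstruct(parts, skip=None):
    """Sum the parts back together, optionally leaving one out (ablation)."""
    rows, cols = len(parts[0]), len(parts[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for idx, part in enumerate(parts):
        if idx != skip:
            total = add(total, part)
    return total

def matvec(w, x):
    """Matrix-vector product w @ x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

W = [[1.0, 2.0], [3.0, 4.0]]   # toy weight matrix (assumed)
parts = rank_one_parts(W)

# The parts sum back to the original weights exactly.
assert reconstruct(parts) == W

x = [1.0, 1.0]
full = matvec(W, x)                              # [3.0, 7.0]
ablated = matvec(reconstruct(parts, skip=0), x)  # [0.0, 7.0]
```

In this toy split, removing part 0 zeroes only the first output coordinate and leaves the second untouched. That is the kind of targeted, low off-target effect the claim attributes to real subcomponents, although here it holds by construction rather than as evidence for the claim.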