community
active
leiden_hybrid_concepts
label: sonnet
community:leiden_hybrid_concepts-run2-c145
Targeted neural network weight surgery
Direct parameter edits to specific subcomponents alter model behavior without any retraining.
2 members.
Drawn from 1 source
The papers/notes whose extracted claims & findings make up this cluster.
Bridges (2)
Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.
Findings (2)
- Direct model editing via parameter subcomponent modification: emoticon eye recognition altered to predict shocked faces with no retraining.
  Demonstrated that VPD-discovered subcomponents encode true computational machinery by enabling targeted, predictable behavior changes without gradient-based training.
- Editing the emoticon eye subcomponent to output the unembedding vector for 'o' causes the model to predict shocked faces for all emoticons.
  Direct parameter subcomponent overwrite produces a clean behavioral change without training.
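
A minimal sketch of the kind of edit described in the findings above, on a toy model rather than the one from the source. Everything here is an assumption made for illustration: the weight matrix is taken to be a sum of rank-one subcomponents, the unembedding `W_U` is an identity matrix, and the feature layout, the four-token vocabulary, the `scale` factor, and the helper names (`assemble`, `predict`, `edit_subcomponent`) are hypothetical. It only shows that overwriting one subcomponent's output direction with the unembedding vector for 'o' can redirect every prediction without any gradient step.

```python
import numpy as np

vocab = [")", "(", "o", "D"]
W_U = np.eye(len(vocab))  # toy unembedding (assumption): row i is token i's vector

# Hypothetical input features: [eyes present, happy context, sad context]
V = np.eye(3)  # row k: the input direction that subcomponent k reads
U = np.array([
    1.0 * W_U[vocab.index(")")],  # k=0, "eye" subcomponent: weak prior for ")"
    1.0 * W_U[vocab.index(")")],  # k=1, happy context pushes ")"
    2.0 * W_U[vocab.index("(")],  # k=2, sad context pushes "("
])  # row k: the output direction that subcomponent k writes


def assemble(U, V):
    """Rebuild W = sum_k outer(U[k], V[k]) as a single matrix product."""
    return U.T @ V


def predict(U, V, x):
    """Return the token with the highest logit for input features x."""
    logits = W_U @ (assemble(U, V) @ x)
    return vocab[int(np.argmax(logits))]


def edit_subcomponent(U, k, token, scale=3.0):
    """Copy U and overwrite output direction k with the token's unembedding vector.

    The scale factor is an assumption of this sketch, chosen so the edited
    subcomponent outweighs the untouched ones.
    """
    U_edited = U.copy()
    U_edited[k] = scale * W_U[vocab.index(token)]
    return U_edited


happy = np.array([1.0, 1.0, 0.0])  # eyes + happy context
sad = np.array([1.0, 0.0, 1.0])    # eyes + sad context

before = [predict(U, V, happy), predict(U, V, sad)]
U_edited = edit_subcomponent(U, k=0, token="o")  # direct overwrite, no training
after = [predict(U_edited, V, happy), predict(U_edited, V, sad)]

print(before, after)  # [')', '('] ['o', 'o']
```

Only the eye subcomponent's row of `U` changes; the happy and sad subcomponents are left untouched, yet both inputs now resolve to 'o' because every emoticon activates the edited eye feature.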