concept
active
concept:weight-matrix-decomposition
Weight matrix decomposition
The core idea of decomposing weight matrices into components for interpretability.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Constraint in VPD where each parameter subcomponent is constrained to be a rank-one matrix for simplicity.
- Prescribes transitions among hidden state factors under action; encodes policy-dependent dynamics
- Used to summarize principal patterns of internal functional states.
- The space of the model's parameter matrices, where VPD operations take place.
- Decomposition of all 24 weight matrices in a 67M-parameter LM yields ~10,000 parameter subcomponents · finding · 0.732 · Quantitative result of VPD application; the network's 24 matrices decompose into approximately 10,000 rank-one subcomponents.
- The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.
- The simple matrix form into which VPD constrains subcomponents to enforce mechanistic simplicity.
- An OV circuit matrix that maps tokens to increasing the logit of those same tokens; detectable via positive eigenvalues
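
Several entries above describe subcomponents constrained to be rank-one matrices. As a minimal sketch of that form only, the snippet below writes a small matrix as a sum of terms that are each an outer product u vᵀ (so each has rank at most one). It is not VPD, which learns its subcomponents; it uses a trivial row-wise split chosen purely for illustration, and the helper names (`outer`, `rank_one_components`, `reconstruct`) are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def outer(u: List[float], v: List[float]) -> Matrix:
    """Outer product u v^T, a matrix of rank at most one."""
    return [[ui * vj for vj in v] for ui in u]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rank_one_components(w: Matrix) -> List[Matrix]:
    """Illustrative split (an assumption, not VPD): one term per row,
    w = sum_i outer(e_i, w[i]), where e_i is the i-th standard basis vector."""
    n_rows = len(w)
    components = []
    for i, row in enumerate(w):
        e_i = [1.0 if k == i else 0.0 for k in range(n_rows)]
        components.append(outer(e_i, row))
    return components


def reconstruct(components: List[Matrix]) -> Matrix:
    """Sum the components back into a single matrix."""
    total = [[0.0] * len(components[0][0]) for _ in components[0]]
    for c in components:
        total = add(total, c)
    return total


W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
parts = rank_one_components(W)
W_hat = reconstruct(parts)
```

Here `parts` holds two components, each zero except for one row of `W`, and `W_hat` equals `W` exactly. A learned decomposition would choose the `u` and `v` vectors to be mechanistically meaningful rather than aligned with rows.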