method
active
method:rank-one-matrix-decomposition
Rank-one matrix decomposition
Constraint in VPD that requires each parameter subcomponent to be a rank-one matrix, for simplicity.
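To make the constraint concrete, here is a minimal sketch of what "rank-one" means for a subcomponent. It illustrates the matrix form only, not VPD's training procedure. It assumes each subcomponent can be written as an outer product of two vectors and that the subcomponents sum to the full weight matrix. The names `outer`, `is_rank_one`, and `subcomponents`, and all numeric values, are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def outer(u: Sequence[float], v: Sequence[float]) -> Matrix:
    """Return the outer product u v^T, which is rank one when u and v are non-zero."""
    return [[ui * vj for vj in v] for ui in u]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_rank_one(m: Matrix, tol: float = 1e-9) -> bool:
    """True if m is non-zero and every 2x2 minor vanishes (the rank-one condition)."""
    if all(abs(x) <= tol for row in m for x in row):
        return False  # the zero matrix has rank zero
    rows, cols = len(m), len(m[0])
    for i in range(rows):
        for k in range(i + 1, rows):
            for j in range(cols):
                for l in range(j + 1, cols):
                    if abs(m[i][j] * m[k][l] - m[i][l] * m[k][j]) > tol:
                        return False
    return True


# Hypothetical subcomponents, each stored as a (u, v) pair of vectors.
subcomponents = [
    ([1.0, 0.0, 2.0], [1.0, -1.0]),
    ([0.0, 3.0, 1.0], [2.0, 1.0]),
]

# Each subcomponent on its own is a rank-one matrix.
rank_one_parts = [outer(u, v) for u, v in subcomponents]

# Assumption: the full weight matrix is the sum of its subcomponents.
weight = rank_one_parts[0]
for part in rank_one_parts[1:]:
    weight = add(weight, part)
```

In this toy case each part passes `is_rank_one`, while the summed `weight` does not: the constraint applies to the individual subcomponents, not to the matrix they reconstruct.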
Neighborhood — ranked by edge-count
Papers (1)
paper
- Interpreting Language Model Parameters (introduces)
Concepts (1)
concept
- Rank-one matrix (implements): The simple matrix form into which VPD constrains subcomponents to enforce mechanistic simplicity.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Core design principle of VPD: each parameter subcomponent is constrained to be a simple rank-one matrix to enable isolated understanding and combination.
- The core idea of decomposing weight matrices into components for interpretability.
- What matrix decomposition or dimensionality reduction best summarizes the enormous low-rank OV and QK matrices? (question · 0.769) Open methodological question about converting the 50k x 50k expanded matrices into human-graspable summaries.
- The atomic pieces into which weight matrices are decomposed by VPD; each rank-one component is interpretable.
- Used to summarize principal patterns of internal functional states.
- An OV circuit matrix that maps tokens to increasing the logit of those same tokens; detectable via positive eigenvalues
- The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.
- Prescribes transitions among hidden state factors under action; encodes policy-dependent dynamics