claim
active
claim:vpd-subcomponents-are-sparse-interpretable-and-avoid-feature-splitting

VPD subcomponents are sparse, interpretable, and avoid feature splitting.

Assertion about the qualitative advantages of VPD's rank-one decomposition.

Source paper

extracted_from
Interpreting Language Model Parameters
(2026) · Bushnaq, Lucius · Braun, Dan · Clive-Griffin, Oliver · Bussmann, Bart +4

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Feature splitting: phenomenon where a feature in a small sparse autoencoder (SAE) splits into multiple finer features in a larger SAE.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.