claim
active
claim:the-difference-in-means-direction-is-the-unique-nullity-1-projection-kernel-that-eliminates-all-linearly-recoverable-binary-classification-information-from-a-dataset
The difference-in-means direction is the unique nullity-1 projection kernel that eliminates all linearly-recoverable binary classification information from a dataset
Formal consequence of Belrose et al. (2023) Theorem G.1 connecting mass-mean probing to optimal linear concept erasure
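A minimal numerical sketch of the claim, under stated assumptions: the data are synthetic, the names `class_mean_gap` and `cross_cov` are hypothetical helpers, and the orthogonal projector is only the simplest projection whose kernel is the difference-in-means direction (other projections with the same kernel exist). It relies on the fact that, for a binary label, the empirical cross-covariance between features and label equals p(1 - p) times the class-mean gap, so removing the gap also removes the cross-covariance that a linear predictor would use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary-labelled data (illustrative only, not from the source paper).
n, d = 500, 8
labels = rng.integers(0, 2, size=n)
labels[:2] = [0, 1]  # guarantee both classes are present
shift = rng.normal(size=d)
X = rng.normal(size=(n, d)) + np.outer(labels, shift)


def class_mean_gap(X, labels):
    """Difference-in-means direction: mean of class 1 minus mean of class 0."""
    return X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)


def cross_cov(X, labels):
    """Empirical cross-covariance between the features and the binary label."""
    return (X - X.mean(axis=0)).T @ (labels - labels.mean()) / len(labels)


delta = class_mean_gap(X, labels)

# Orthogonal projection whose kernel is span(delta), so it has nullity 1.
# Assumption: orthogonality is chosen for simplicity, not taken from the source.
P = np.eye(d) - np.outer(delta, delta) / (delta @ delta)
X_erased = X @ P.T

gap_before = np.linalg.norm(delta)
gap_after = np.linalg.norm(class_mean_gap(X_erased, labels))
cov_after = np.linalg.norm(cross_cov(X_erased, labels))

print(f"class-mean gap before: {gap_before:.3f}")
print(f"class-mean gap after:  {gap_after:.2e}")
print(f"cross-covariance norm after: {cov_after:.2e}")
```

After the projection, both the class-mean gap and the cross-covariance are zero up to floating-point error, which is the sense in which no linearly recoverable label information remains in this sketch.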
Source paper
extracted_from · (2023) · Samuel Marks · Max Tegmark
Neighborhood — ranked by edge-count
Papers (1)
paper
Frameworks (1)
framework
- Mass-Mean Probing · supports · Introduced in this paper: an optimization-free probing technique that uses the difference-in-means direction, with an optional covariance correction
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Key methodological claim: MM probes are both competitive in accuracy and superior in causal influence
- Load-bearing interpretive claim about the layer-specificity of Burger et al.'s finding.
- Mathematical formalization of what representation models converge to
- Vector from mean of false representations to mean of true representations; core of mass-mean probing
- Motivates the introduction of mass-mean probing as an alternative to LR
- Key quote connecting path redundancy to interferometric information encoding.
- Interpretive claim about what linear DAS results actually tell us
- The direction of information increase is relative to the observer or user of the computation · claim · 0.730 · Example: 3×5→15 is a natural computation, but 15→3×5 (prime factorization) is also useful, showing that the 'gain' depends on the choice of normal form.