finding
active
finding:sae-reconstructions-on-llama-3-8b-layer-25-produce-intervened-emd-exceeding-the-natural-natural-baseline

SAE reconstructions on Llama-3-8B layer 25 produce intervened EMD exceeding the natural-natural baseline

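Empirical demonstration that sparse autoencoder (SAE) projections produce divergent representations in a real LLM: on Llama-3-8B layer 25, the earth mover's distance (EMD) for SAE-reconstructed (intervened) representations exceeds the EMD between natural representations (the natural-natural baseline).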

Source paper

extracted_from
Addressing divergent representations from causal interventions on neural networks
(2025) · Satchel Grant · Simon Jerome Han · Alexa R. Tartaglini · Christopher Potts
