finding
active
finding:across-5-pythia-seeds-one-seed-fails-to-learn-ioi-task-and-another-fails-alignment-despite-learning-the-task-all-other-seeds-achieve-perfect-alignment-with-nonlin

Across 5 Pythia seeds, one seed fails to learn the IOI task and another fails alignment despite learning the task; all other seeds achieve perfect alignment with ϕ_nonlin

Robustness check across seeds, showing that alignment-map training occasionally fails

Source paper

extracted_from
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
(2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
