claim
active
A single SAE hyperparameter procedure driven by an intrinsic dictionary health audit transfers robustly across all three EEG transformer architectures.
Key methodological contribution: a claim about architecture-agnostic SAE tuning.
Source paper
extracted_from · (2026) · William Lehn-Schiøler · Magnus Ruud Kjær · Rahul Thapa · M. Pedersen +9
Neighborhood — ranked by edge-count
Findings (2)
finding
- Demonstrates architecture-agnostic applicability of the SAE tuning method
- Foundational empirical result enabling all downstream analysis
Communities (3)
community
- Explores geometry of activation/behavior manifolds to enable selective, non-destructive concept interventions.
- Investigates inseparability of clinical concepts (age, pathology) in EEG transformers using SAE feature analysis and steering metrics across SleepFM, REVE, LaBraM architectures.
- Dictionary health audit transfer · members_of · Hyperparameter procedure validated across SleepFM, REVE, and LaBraM EEG transformer architectures.
Methods (1)
method
- A hyperparameter selection procedure driven by intrinsic measures of SAE dictionary quality that transfers across architectures
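As a rough illustration of what a selection procedure driven by intrinsic dictionary quality could look like, the sketch below audits SAE feature activations and picks a configuration from the audit alone. The source does not state which health measures the paper uses, so everything here is an assumption: the dead-feature fraction, dense-feature fraction, and mean L0 are common SAE diagnostics rather than the authors' confirmed metrics, and the names `dictionary_health` and `pick_config` and all thresholds are hypothetical.

```python
import numpy as np


def dictionary_health(acts, dense_freq=0.1):
    """Audit an SAE dictionary from its feature activations.

    acts: array of shape (n_tokens, n_features), non-negative activations.
    dense_freq: hypothetical threshold above which a feature counts as dense.
    """
    fired = acts > 0                      # which features fire on which tokens
    freq = fired.mean(axis=0)             # firing frequency per feature
    return {
        "dead_fraction": float((freq == 0).mean()),          # never fire
        "dense_fraction": float((freq > dense_freq).mean()), # fire too often
        "mean_l0": float(fired.sum(axis=1).mean()),          # active features per token
    }


def pick_config(audits, max_dead=0.05, max_dense=0.05):
    """Return the sparsest configuration that passes the health thresholds.

    audits: dict mapping a configuration name to a dictionary_health() result.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    healthy = [
        name for name, a in audits.items()
        if a["dead_fraction"] <= max_dead and a["dense_fraction"] <= max_dense
    ]
    if not healthy:
        return None
    return min(healthy, key=lambda name: audits[name]["mean_l0"])
```

Because the audit uses only the SAE's own activations and no architecture-specific labels, the same two functions could in principle be applied unchanged to activations from any of the three models, which is the sense of "transfers across architectures" assumed here.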
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Research question motivating the monosemanticity and entanglement benchmarking
- Overarching motivating hypothesis of the paper
- Sweeping the number of features and training steps to find compute-optimal SAE configurations.
- Key result linking abstract latent manipulations to known EEG neurophysiology
- Claim that feature grounding enables interpretability metrics.
- SAE training loss decreases as a power law with compute budget when using compute-optimal hyperparameters. · finding · 0.761 · From scaling laws sweep.
- Interpretive claim summarizing the spectrum of concept steerability discovered.
- A promising property for interpretability analysis off-distribution.