concept
active
concept:off-topic-detector-latents

Off-Topic Detector Latents

26 sparse autoencoder (SAE) latents that activate differentially on off-topic content and are causally linked to ESR

Neighborhood — ranked by edge-count

Methods (2)

method

Concepts (2)

concept
  • The inferred mechanism underlying ESR, whereby the model tracks the coherence of its own outputs
  • Backtracking Latents
    associated_with
    SAE latents whose activations rise as a correction approaches and peak after self-correction begins, complementing the off-topic detector (OTD) latents

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
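
The selection rule above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is a minimal illustration, not the system's actual implementation: the function names, the toy two-dimensional embeddings, and every entity ID other than this page's own are hypothetical, and the real embedding model and storage layer are not specified here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_candidates(target, embeddings, typed_neighbors, threshold=0.65):
    """Return (entity_id, cosine) pairs at or above the threshold.

    Entities that already share a typed edge with the target are skipped,
    so what remains are candidates for new edges or possible duplicates.
    """
    target_vec = embeddings[target]
    candidates = []
    for entity_id, vec in embeddings.items():
        if entity_id == target or entity_id in typed_neighbors:
            continue
        score = cosine(target_vec, vec)
        if score >= threshold:
            candidates.append((entity_id, score))
    # Highest similarity first.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy data: every vector and every ID except the target is made up.
embeddings = {
    "concept:off-topic-detector-latents": [1.0, 0.0],
    "concept:hypothetical-typed-neighbor": [0.9, 0.1],
    "concept:hypothetical-a": [0.8, 0.6],
    "concept:hypothetical-b": [0.0, 1.0],
}
typed_neighbors = {"concept:hypothetical-typed-neighbor"}

result = similarity_candidates(
    "concept:off-topic-detector-latents", embeddings, typed_neighbors
)
```

With this toy data only `concept:hypothetical-a` passes: the typed neighbor is excluded despite its high similarity, and `concept:hypothetical-b` is orthogonal to the target.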