claim
active
"Off-topic detector" (OTD) is a functional label derived from the selection methodology; the latents it covers may serve broader coherence-monitoring roles beyond detecting off-topic content

Epistemic caution against over-interpreting the OTD label, given the heterogeneity of the identified latents

Source paper

extracted_from
Endogenous Resistance to Activation Steering in Language Models
(2026) · Alex McKenzie · Keenan Pepper · Stijn Servaes · Martin Leitgab +5
