concept
active
concept:sentence-polarity
Sentence polarity
Whether a statement is affirmative or negated; a surface feature that confounds early-layer truth probes.
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Polarity-dependent truth direction (tP) · associated_with · A direction that classifies affirmative statements effectively but inverts for negated variants, dominating in early layers.
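The inversion behavior described for the polarity-dependent truth direction can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the activations are synthetic, and the names `t_p`, `toy_activation`, `probe_says_true`, and `accuracy` are hypothetical, not taken from any model or library. The sketch assumes truth is encoded along a single direction whose sign flips under negation, which is one simple way a probe could behave as described.

```python
import random

random.seed(0)
DIM = 8

# Hypothetical polarity-dependent direction t_P (unit vector along axis 0).
t_p = [1.0] + [0.0] * (DIM - 1)


def toy_activation(is_true, is_negated, noise=0.1):
    """Synthetic activation: truth lies along t_P, with its sign flipped under negation."""
    truth = 1.0 if is_true else -1.0
    polarity = -1.0 if is_negated else 1.0
    return [truth * polarity * t + random.gauss(0.0, noise) for t in t_p]


def probe_says_true(activation):
    """Linear probe: predict 'true' when the projection onto t_P is positive."""
    return sum(a * t for a, t in zip(activation, t_p)) > 0.0


def accuracy(is_negated, n=200):
    """Fraction of balanced true/false toy statements the probe labels correctly."""
    correct = 0
    for i in range(n):
        is_true = i % 2 == 0
        correct += probe_says_true(toy_activation(is_true, is_negated)) == is_true
    return correct / n


affirmative_acc = accuracy(is_negated=False)
negated_acc = accuracy(is_negated=True)
print(f"affirmative: {affirmative_acc:.2f}, negated: {negated_acc:.2f}")
```

Under these assumptions the probe is near-perfect on affirmative statements and near-zero on negated ones, which is the sense in which sentence polarity acts as a confound for a probe fit to such a direction.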
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- The sequential, continuous order of text, often challenged by diagrammatic branching.
- Adapted control-task metric measuring the difference between the odds ratio on the original task and on an arbitrary-label control task.
- The directness of motivation by practical concerns, characteristic of living processes in the examples.
- Linguistic phenomenon where NPI lexemes like 'any' require negative-polarity sentential contexts; studied as a case study in CausalGym.
- Open problem on the expressiveness of commitment sentences.
- EEG abnormality concept (e.g., epileptiform activity) used to interpret SAE features.
- Property of developmental systems where functions are encapsulated in modules with simple triggers, enhancing evolvability.