Monotonicity Natural Language Inference

An NLI task in which the premise and hypothesis differ by a single word, which is replaced by one of its hypernyms or hyponyms, with the presence of negation as a variable.

Neighborhood — ranked by edge-count

concept

Lexical Entailment
associated_with
The semantic relation between words w_p and w_h (entails/neutral) used as an intermediate variable in the MoNLI high-level model.
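
A minimal sketch of how such a high-level model could combine the lexical relation with negation. This is an illustration under stated assumptions, not the MoNLI authors' implementation: the function name `monli_label` and the string labels are hypothetical, and "neutral" is assumed to mean that the hypothesis word is the more specific of the pair, so that the reverse direction entails.

```python
from typing import Literal

# Hypothetical label sets; the source only names the values entails/neutral.
LexRel = Literal["entails", "neutral"]
Label = Literal["entailment", "neutral"]


def monli_label(lexrel: LexRel, negated: bool) -> Label:
    """Sentence-level label from the word-level relation and a negation flag.

    Assumption: without negation the sentence label follows the lexical
    relation between the premise word and the hypothesis word; with
    negation the direction of entailment is reversed.
    """
    word_entails = lexrel == "entails"
    # Entailment holds when exactly one of (word entails, negated) is true.
    return "entailment" if word_entails != negated else "neutral"


if __name__ == "__main__":
    for rel in ("entails", "neutral"):
        for neg in (False, True):
            print(rel, neg, monli_label(rel, neg))
```

Under these assumptions the lexical relation is the only word-level information the label depends on, which is what makes it usable as an intermediate variable.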

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Message Passing Inference · concept · 0.756
Algorithmic framework for probabilistic inference in graphical models.
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (Li et al., 2023) · concept · 0.751
Safety intervention that relies on activation modification, which ESR might undermine.
inference of sentience · concept · 0.742
Attributing subjective experience based on observable embodied behaviours.
Towards Monosemanticity: Decomposing Language Models with Dictionary Learning (Bricken et al., 2023) · concept · 0.738
Foundational SAE mechanistic interpretability paper.
Active Inference · framework · 0.735
Foundational framework by Karl Friston; the paper extends it to three hierarchical levels for modeling meta-awareness.
Friston, FitzGerald et al. (2016): Active inference and learning · concept · 0.732
Prior active inference paper providing a detailed neurophysiological implementation of belief updates.
Monotonic Scaling Property · concept · 0.732
Property of truth directions: the probability of a truthful response scales monotonically with the strength of the activation addition coefficient.
Active inference is a normative principle underwriting perception, action, planning, decision-making and learning in biological or artificial agents. · quote · 0.732