Concepts

Named ideas extracted from the corpus — Wholeness, Centers, Active Inference, Morphogenesis, and 100 more. Filter by definition style (pointing / propositional / operational) or search by name.

Showing 100 of 100 concepts.
| Concept | Definition style | Category | Mentions | Relations | Created | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Interchange Intervention Accuracy (IIA) | operational | | 4 | 3 | 2026-06-09 | active |
| Behavioral Null Space | propositional | ai | 2 | 8 | 2026-06-09 | active |
| Counterfactual Latent (CL) Vector | propositional | ai | 2 | 2 | 2026-06-09 | active |
| Counterfactual Behavior | propositional | ai | 1 | 14 | 2026-06-09 | active |
| Evaluation Awareness | propositional | ai | 1 | 12 | 2026-06-09 | active |
| Counterfactual State | propositional | | 1 | 10 | 2026-06-09 | active |
| Truth direction universality | propositional | ai | 1 | 9 | 2026-06-09 | active |
| Verbalized Evaluation Awareness | propositional | ai | 1 | 8 | 2026-06-09 | active |
| Alignment Map (ϕ) | propositional | ai | 1 | 6 | 2026-06-09 | active |
| Neural Network Intervention | propositional | | 1 | 6 | 2026-06-09 | active |
| Causal Mediation | operational | ai | 1 | 5 | 2026-06-09 | active |
| Model Editing | propositional | | 1 | 5 | 2026-06-09 | active |
| Non-Linear Representation Dilemma | propositional | ai | 1 | 5 | 2026-06-09 | active |
| Non-Linear Representation Hypothesis | propositional | ai | 1 | 5 | 2026-06-09 | active |
| Polarity-dependent truth direction (tP) | propositional | ai | 1 | 5 | 2026-06-09 | active |
| Representational Divergence | operational | ai | 1 | 5 | 2026-06-09 | active |
| Constructive Abstraction | propositional | ai | 1 | 4 | 2026-06-09 | active |
| Input-Injectivity | propositional | ai | 1 | 4 | 2026-06-09 | active |
| Interpretability Illusion | propositional | ai | 1 | 4 | 2026-06-09 | active |
| Pernicious Divergence | propositional | ai | 1 | 4 | 2026-06-09 | active |
| Polarity-invariant truth direction (tG) | propositional | ai | 1 | 4 | 2026-06-09 | active |
| Serial Intervention | operational | | 1 | 4 | 2026-06-09 | active |
| Serializable Intervention | propositional | | 1 | 4 | 2026-06-09 | active |
| Alignment Function | operational | ai | 1 | 3 | 2026-06-09 | active |
| Deployment Behavior | propositional | ai | 1 | 3 | 2026-06-09 | active |
| Dormant Behavioral Changes | propositional | ai | 1 | 3 | 2026-06-09 | active |
| Harmless Divergence | propositional | ai | 1 | 3 | 2026-06-09 | active |
| Intervenable Configuration | operational | | 1 | 3 | 2026-06-09 | active |
| Intervenable Model | operational | | 1 | 3 | 2026-06-09 | active |
| Knowledge Localization | propositional | | 1 | 3 | 2026-06-09 | active |
| Privileged Bases Hypothesis | propositional | ai | 1 | 3 | 2026-06-09 | active |
| Truth Subspace | propositional | ai | 1 | 3 | 2026-06-09 | active |
| Adversarial Manipulation of Truthfulness | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Anti-Markovian Solution | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Behaviorally Binary Subspace | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Causally Relevant Latent Subspace | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Cross-Lingual Truth Representation | pointing | ai | 1 | 2 | 2026-06-09 | active |
| Distributed Abstraction | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Evaluation Cue | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Filler-gap dependency | propositional | cognitive | 1 | 2 | 2026-06-09 | active |
| Functional Similarity | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Getter and Setter Hooks | operational | | 1 | 2 | 2026-06-09 | active |
| Hidden Pathways | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Honeypot Evaluation | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Input-truth | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Model Organism | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Negative polarity item licensing | propositional | cognitive | 1 | 2 | 2026-06-09 | active |
| No principled method exists for classifying harmful divergence for arbitrary mechanistic claims | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Orthonormal Basis Vectors | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Parallel Intervention | operational | | 1 | 2 | 2026-06-09 | active |
| Probing Complexity–Accuracy Trade-off | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Recurrent Model Intervention Support | operational | | 1 | 2 | 2026-06-09 | active |
| Softmax Bottleneck | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Strict Output-Surjectivity | propositional | ai | 1 | 2 | 2026-06-09 | active |
| Subspace Intervention | propositional | | 1 | 2 | 2026-06-09 | active |
| Task difficulty operationalized as the number of discrete operations required to verify correctness of the input. | operational | ai | 1 | 2 | 2026-06-09 | active |
| ABAB-ABBA Algorithm | operational | ai | 1 | 1 | 2026-06-09 | active |
| Anisotropy in Language Models | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Behavioral Retention | operational | ai | 1 | 1 | 2026-06-09 | active |
| Binary Generation Constraint | operational | ai | 1 | 1 | 2026-06-09 | active |
| Deployment Cue | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Dormant Subspace | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Gender Representation in LLMs | operational | | 1 | 1 | 2026-06-09 | active |
| Grokking | pointing | ai | 1 | 1 | 2026-06-09 | active |
| Indirect Object Identification (IOI) Task | operational | ai | 1 | 1 | 2026-06-09 | active |
| Input-Restricted Intervention | propositional | ai | 1 | 1 | 2026-06-09 | active |
| L_retain Loss Term | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Latent Variables in Distributed Abstraction | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Model Deception | pointing | ai | 1 | 1 | 2026-06-09 | active |
| Model Misalignment | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Model Robustness | propositional | | 1 | 1 | 2026-06-09 | active |
| Model Steering | propositional | | 1 | 1 | 2026-06-09 | active |
| Monotonic Scaling Property | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Natural Distribution of Representations | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Numeric Cognition (case study) | operational | cognitive | 1 | 1 | 2026-06-09 | active |
| Output-truth | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Propositional Truth | operational | ai | 1 | 1 | 2026-06-09 | active |
| Representational Isomorphism | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Sandbagging | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Scheming | propositional | ai | 1 | 1 | 2026-06-09 | active |
| SDF-Only Model Organism | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Semantic Labeling of Cone Axes | pointing | ai | 1 | 1 | 2026-06-09 | active |
| Sense Vectors | propositional | | 1 | 1 | 2026-06-09 | active |
| Sentence polarity | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Strong τ-Abstraction | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Surgical Ablation Property | propositional | ai | 1 | 1 | 2026-06-09 | active |
| The modified CL loss is confined to a narrow set of simplistic settings and is not specific to pernicious divergence | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Two-Hop Reasoning | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Variational Family V for Alignment Maps | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Wood Labs (fictional AI evaluation company) | propositional | ai | 1 | 1 | 2026-06-09 | active |
| Balanced Subspaces | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Both Equality Relations Algorithm | operational | ai | 1 | 0 | 2026-06-09 | active |
| Convex Hull of Class Representations | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Cross-Architecture Generalization | operational | ai | 1 | 0 | 2026-06-09 | active |
| Deterministic Causal Model | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Emoji Usage | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Intervention Size | operational | ai | 1 | 0 | 2026-06-09 | active |
| Parametric memory | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Patch-Closure | propositional | ai | 1 | 0 | 2026-06-09 | active |
| Python Type Hints | propositional | ai | 1 | 0 | 2026-06-09 | active |