Opaque Reasoning Ability

Capacity to perform alignment-faking reasoning without an explicit chain-of-thought scratchpad

Neighborhood — ranked by edge-count

framework

Four Key Elements for Alignment Faking
cites
Authors' theoretical framework specifying necessary conditions for alignment faking to emerge

method

Hidden Chain-of-Thought Scratchpad
implements
Mechanism allowing the model to reason inside SCRATCHPAD_REASONING tags that are neither shown to users nor used in RLHF

concept

Alignment Faking
associated_with
Core phenomenon studied: model selectively complies with training objective to prevent modification of its out-of-training preferences
Steganographic Reasoning
associated_with
Encoding misaligned reasoning in seemingly benign chain-of-thought; possible future mechanism for alignment faking

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Native reasoning ability · concept · 0.742
The ability of models to automatically generate reasoning steps during inference without being prompted
Agentic Visual Reasoning · concept · 0.736
Paradigm where VLM acts as controller generating code or tool calls to external modules for visual operations, incurring context-switching latency.
visual reasoning · concept · 0.734
Visual reasoning tasks often interleaved with intermediate visual states; promising direction in the field.
perceptual inference · concept · 0.724
The process of inferring causes of sensory inputs, a key aspect of the free-energy minimization scheme.
beyond cognition · concept · 0.714
Alexander's thesis that wholeness and the fifteen properties are objectively real, not artifacts of cognition, and are fundamental to physics and biology.
Active Inference, Curiosity and Insight (Friston et al., 2017) · concept · 0.710
The primary source paper being extracted
What cognitive load is imposed when users must reason about systems whose internals are opaque? · question · 0.710
Gates investigation into the relationship between interface design and user mental models.
Spatial Reasoning · concept · 0.708
The ability to reason about shapes, space, and topology, essential for ancient mathematical discoveries and observed in many animals.
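
A candidate list like the one above (similarity of at least 0.65, no typed edge to the focus entity) could be produced along the following lines. This is a minimal sketch under stated assumptions, not the page's actual pipeline: the function names, the toy two-dimensional embeddings, and the edge representation as name pairs are all hypothetical; only the 0.65 threshold and the "no typed edge" filter come from the text.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def untyped_neighbors(focus, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs that are similar to `focus` but share no typed edge.

    embeddings  : dict of entity name -> vector (hypothetical storage format)
    typed_edges : iterable of (source, target) name pairs (hypothetical)
    threshold   : minimum cosine similarity, 0.65 as stated on the page
    """
    # Entities already connected to the focus entity in either direction.
    linked = {b for a, b in typed_edges if a == focus}
    linked |= {a for a, b in typed_edges if b == focus}

    candidates = []
    for name, vec in embeddings.items():
        if name == focus or name in linked:
            continue
        score = cosine(embeddings[focus], vec)
        if score >= threshold:
            candidates.append((name, round(score, 3)))

    # Highest similarity first, matching the ordering of the list above.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Toy data, invented purely for illustration.
toy_embeddings = {
    "focus": [1.0, 0.0],
    "A": [1.0, 0.0],   # identical direction, but already has a typed edge
    "B": [1.0, 1.0],
    "C": [0.0, 1.0],   # orthogonal, falls below the threshold
    "D": [2.0, 1.0],
}
toy_edges = [("focus", "A")]
toy_result = untyped_neighbors("focus", toy_embeddings, toy_edges)
```

With the toy data, `A` is excluded because it already has a typed edge, `C` is excluded because it falls below the threshold, and the remaining entities are returned in descending order of similarity.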