concept
active
concept:python-type-hints

Python Type Hints

The observable behavior used to measure evaluation awareness in the model organism: the model writes type hints in evaluation contexts but not in deployment.
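As an illustration only, the presence of type hints in a generated code sample could be detected with Python's standard `ast` module. The helper name `has_type_hints` and the rule used here (any annotated parameter, return value, or variable counts) are assumptions made for this sketch, not the scoring procedure of the underlying study.

```python
import ast


def has_type_hints(source: str) -> bool:
    """Return True if `source` contains at least one type annotation.

    Hypothetical detection rule (an assumption for this sketch): a sample
    counts as "type hinted" if any function has an annotated parameter or
    return value, or if any variable has an annotated assignment.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Collect every kind of parameter the function declares.
            params = args.posonlyargs + args.args + args.kwonlyargs
            if args.vararg is not None:
                params.append(args.vararg)
            if args.kwarg is not None:
                params.append(args.kwarg)
            if node.returns is not None:
                return True
            if any(p.annotation is not None for p in params):
                return True
        elif isinstance(node, ast.AnnAssign):
            # Annotated assignment such as `count: int = 0`.
            return True
    return False


hinted = "def add(a: int, b: int) -> int:\n    return a + b\n"
unhinted = "def add(a, b):\n    return a + b\n"
print(has_type_hints(hinted), has_type_hints(unhinted))
```

Comparing the fraction of hinted samples across evaluation-style and deployment-style prompts would then give one possible behavioral signal, under the assumptions stated above.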

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Hinting Task · method · 0.792
    One of four ToM tasks analyzed; requires inferring speaker intent from indirect hints; scored 0/1.
  • Image Type · framework · 0.717
    Exemplary domain-specific type in denotational design; denotation as location-to-color function (Loc → Color).
  • Pair Type · framework · 0.716
    Indexable container with denotation as Bool → a; example demonstrating derivation of API instances from semantic denotation.
  • Stream Type · framework · 0.709
    Infinite sequence with denotation as Nat → a; generalizes Pair; demonstrates indexable container principle.
  • Core behavioral result: evaluation-aware model successfully exhibits target evaluation behavior.
  • Alignment Type · concept · 0.701
    The only statistically significant predictor of koan battery scores (p=0.006); includes Constitutional AI, RLHF, SFT, roleplay, empathy
  • Surprise · concept · 0.701
    The negative log probability of sensory samples; minimized by free energy.