concept
active
concept:black-box-internal-state-monitoring

Black-box internal state monitoring

A monitoring approach that requires no access to model internals; it is applicable to proprietary systems and scales naturally with model size

Neighborhood — ranked by edge-count

Methods (1)

method
  • Primary self-report measure: the probability-weighted expected value over all ten digit tokens, computed from their logits, yielding a continuous rating that draws on the full distributional signal
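
A minimal sketch of how such a measure might be computed, under stated assumptions: the ten logits are ordered by digit (0 through 9), each digit token stands for its own numeric rating, and probabilities come from a softmax renormalized over the ten digit tokens only. The function name `expected_rating` is hypothetical and is not taken from the source.

```python
import math
from typing import Sequence


def expected_rating(digit_logits: Sequence[float]) -> float:
    """Return the probability-weighted mean rating from ten digit-token logits.

    Assumption (not confirmed by the source): digit_logits[d] is the logit of
    the token for digit d, and the rating scale runs from 0 to 9.
    """
    if len(digit_logits) != 10:
        raise ValueError("expected exactly ten logits, one per digit 0-9")

    # Softmax restricted to the ten digit tokens; subtracting the maximum
    # keeps the exponentials numerically stable.
    peak = max(digit_logits)
    weights = [math.exp(logit - peak) for logit in digit_logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Expected value: each digit weighted by its renormalized probability.
    return sum(digit * p for digit, p in enumerate(probs))
```

With equal logits the rating falls at the midpoint of the scale, and as one digit's logit dominates, the rating approaches that digit. Unlike taking only the most likely digit, the result shifts continuously as probability mass moves between digits.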

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.