concept
active
concept:digit-token-logit-distribution

Digit-token logit distribution

Full distribution over the digit tokens 0-9 at the first generation step; it carries more information than any single sampled token

Neighborhood — ranked by edge-count

Methods (1)

method
  • Primary self-report measure: probability-weighted expected value of the digit, computed from all ten digit-token logits, yielding a continuous rating that preserves the full distributional signal
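
A minimal sketch of how such an expected-value rating could be computed, assuming the ten digit-token logits have already been read out at the first generation step and are ordered so that index i belongs to the token for digit i. The function name `expected_digit_rating` is hypothetical, and renormalizing with a softmax over only the ten digit tokens is an assumption; the source does not say whether probability mass on non-digit tokens is discarded.

```python
import math


def expected_digit_rating(digit_logits):
    """Return (rating, probs) for ten digit-token logits.

    digit_logits[i] is assumed to be the logit of the token for digit i.
    """
    if len(digit_logits) != 10:
        raise ValueError("expected exactly ten digit-token logits")

    # Softmax restricted to the ten digit tokens (assumption: other
    # vocabulary entries are ignored and the mass is renormalized).
    # Subtracting the max keeps the exponentials numerically stable.
    peak = max(digit_logits)
    exps = [math.exp(x - peak) for x in digit_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Probability-weighted expected value: sum over d of d * p(d).
    rating = sum(digit * p for digit, p in enumerate(probs))
    return rating, probs


if __name__ == "__main__":
    # Illustrative logits only, not measured values.
    example_logits = [-2.0, -1.5, 0.5, 1.0, 0.0, -0.5, 2.0, 1.5, -1.0, -3.0]
    rating, probs = expected_digit_rating(example_logits)
    print(f"continuous rating: {rating:.3f}")
    print(f"argmax digit:      {max(range(10), key=probs.__getitem__)}")
```

Two distributions with the same most likely digit can give different ratings under this measure, which is the sense in which the continuous value keeps information that a single sampled token would lose.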

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Token · concept · 0.760
    Basic unit of LLM input/output: words, parts of words, punctuation marks, emojis
  • global logit shift · concept · 0.749
    The methodological confound identified by this paper: injection biases model toward 'YES' for any binary question regardless of content
  • Conjugate prior for categorical variables; used for beliefs about likelihood matrix A.
  • In active inference, the distribution over goal states; here replaced by the learned self-prior rather than a hand-specified prior
  • Computing each feature's linear effect on output token logits via path expansion through MLP output weights and unembedding matrix
  • The distribution of latent representations produced by the model under unperturbed inputs
  • Probability distribution over discrete states or outcomes.
  • Feature that fires on a specific token only within a specific surrounding context (e.g., 'the' in physics vs 'the' in mathematics)