concept
active
concept:single-token-features

Single-Token Features

Features that fire on every instance of a single token. In small dictionaries they appear as collapsed versions of many token-in-context features.
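The definition above can be made concrete with a small check. The following is a minimal sketch, not an established method: the function name `find_single_token_features`, the `threshold` parameter, and the toy data are all hypothetical. It also assumes a feature counts as single-token only if it fires on every occurrence of one token and on no other token, which is slightly stricter than the wording above.

```python
def find_single_token_features(tokens, activations, threshold=0.0):
    """Return {feature_index: token} for features that fire on every
    occurrence of exactly one token and on no other token.

    tokens:      list of token strings, one per position
    activations: list of per-position rows, each a list of feature activations
    threshold:   activation above which a feature counts as "firing" (assumed)
    """
    n_features = len(activations[0]) if activations else 0

    # Count how often each token occurs in the sequence.
    token_counts = {}
    for tok in tokens:
        token_counts[tok] = token_counts.get(tok, 0) + 1

    result = {}
    for f in range(n_features):
        # Tokens at the positions where feature f fires.
        fired = [tokens[i] for i, row in enumerate(activations) if row[f] > threshold]
        if not fired:
            continue
        # Single-token: one distinct token, and every occurrence of it is covered.
        if len(set(fired)) == 1 and len(fired) == token_counts[fired[0]]:
            result[f] = fired[0]
    return result


# Toy data (hypothetical): 4 positions, 3 features.
tokens = ["the", "cat", "the", "dog"]
activations = [
    [1.2, 0.9, 0.0],  # "the"
    [0.0, 0.0, 0.7],  # "cat"
    [0.8, 0.0, 0.0],  # "the"
    [0.0, 0.0, 0.3],  # "dog"
]
single = find_single_token_features(tokens, activations)
```

In this toy data, feature 0 fires on both occurrences of "the" and so qualifies. Feature 1 fires on only one of them, which is closer to a token-in-context feature. Feature 2 fires on two different tokens and is excluded.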

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Feature splitting
    associated_with
    Phenomenon where a feature in a small SAE splits into multiple finer features in a larger SAE.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Feature that fires on a specific token only within a specific surrounding context (e.g., 'the' in physics vs 'the' in mathematics)
  • Token · concept · 0.748
    Basic unit of LLM input/output: words, parts of words, punctuation marks, emojis
  • Pure Feature · concept · 0.726
    A feature that responds to only a single latent variable, contrasted with polysemantic features
  • Functional Token · concept · 0.725
    A discrete token in the vocabulary that represents a visual operation (e.g., <|Line|>, <|Shape|>, <|Text|>), generated via next-token prediction within autoregressive sequences.
  • Property of features that form consistently across different models trained on the same or similar data, suggesting features are real representational units
  • Open question from the discussion on future research directions.
  • Token embeddings · concept · 0.703
    Vector representations of individual tokens from genomic foundation models; the raw inputs to sequence pooling methods.
  • Behavior where information about full clauses is encoded over clause-ending punctuation tokens in LLMs