concept
active
concept:skip-trigram-bugs

Skip-Trigram Bugs

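Model failures in which a one-layer attention head, in order to raise the probability of intended skip-trigrams, must simultaneously raise the probability of unintended token combinations. This happens because the head factors the three-way [source]...[destination][out] interaction into a source-destination (attention) term and a source-output term, so it cannot boost one combination without also boosting every other combination that shares those terms.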
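The factoring can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the token strings, the numeric weights, and the helper name `logit_boost` are hypothetical, and attention is simplified to a fixed weight per (source, destination) pair rather than a softmax over positions.

```python
# Toy illustration with made-up numbers (not taken from any trained model).
# Simplifying assumption: a head's contribution to the logit of `out` factors as
#   boost(src, dst, out) = attention[(src, dst)] * ov[(src, out)]

# How strongly the destination token attends back to the source token.
attention = {("keep", "in"): 0.9, ("keep", "at"): 0.9}

# How much attending to the source token raises the logit of the output token.
ov = {("keep", "mind"): 2.0, ("keep", "bay"): 2.0}


def logit_boost(src, dst, out):
    """Factored skip-trigram effect: attention term times source-output term."""
    return attention.get((src, dst), 0.0) * ov.get((src, out), 0.0)


# Skip-trigrams the head is meant to support.
intended = [("keep", "in", "mind"), ("keep", "at", "bay")]
# Combinations that reuse the same two factors and so get boosted as well.
unintended = [("keep", "in", "bay"), ("keep", "at", "mind")]

boosts = {t: logit_boost(*t) for t in intended + unintended}

for trigram, boost in boosts.items():
    label = "intended" if trigram in intended else "unintended"
    print(trigram, label, boost)
```

In this sketch the unintended combinations receive the same positive boost as the intended ones, because no separate term couples a specific destination to a specific output.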

Neighborhood — ranked by edge-count

Claims (1)

claim

Concepts (1)

concept
  • Skip-Trigram
    related_to
    A three-token pattern of the form [source]...[destination][out] that one-layer attention heads implement; the paper's key characterization of one-layer transformer behavior

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Trigram Features · concept · 0.740
    Features implementing specific three-token sequence predictions (e.g., predicting '19' after 'COVID-')
  • causal bypassing · concept · 0.680
    Confound where naming injected concepts reflects direct logit effects rather than metacognitive awareness, raised by Morris & Plunkett
  • Gradient Descent · method · 0.668
    Used for updating hidden state expectations; provides dynamical process theory testable against neuronal data
  • Position Encodings · concept · 0.667
    Mechanism for encoding sequence order in transformers; paper argues these should reflect learned structural representations rather than fixed sines/cosines.
  • Strategic filtering procedure that removes invalid trajectories and maintains optimal positive-to-negative trajectory ratio to stabilize training.
  • Error minimization · concept · 0.664
    The progressive reduction of error (stress) as cells move toward their target positions.
  • Message Passing · framework · 0.663
    Traditional parallel programming model requiring explicit point-to-point communication; Linda generalizes this via tuple spaces.
  • tilings · concept · 0.657
    Edge-to-edge coverings of a surface with no overlaps or gaps