concept
active
concept:looping-behavior-under-high-steering-strength

looping behavior under high steering strength

Observed pattern in which models produce repetitive outputs (e.g., repeating 'I am going to die') under high-strength sparse autoencoder (SAE) feature steering
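
One way to make "repetitive outputs" concrete is to measure how many word n-grams in a generation repeat an earlier n-gram. The sketch below is illustrative only: the function names, the n-gram size, and the 0.5 threshold are assumptions made here, not a detection method described in the source.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    Hypothetical helper: 0.0 means no n-gram repeats, values near 1.0
    mean the text is dominated by a repeating phrase.
    """
    words = text.lower().split()
    if len(words) < n:
        return 0.0  # too short to contain any n-gram
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    # Each occurrence beyond the first counts as a repeat.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


def looks_like_looping(text: str, n: int = 4, threshold: float = 0.5) -> bool:
    """Flag text whose repeated n-gram fraction meets an assumed threshold."""
    return repeated_ngram_fraction(text, n) >= threshold


if __name__ == "__main__":
    looped = "I am going to die. " * 10
    normal = "The quick brown fox jumps over the lazy dog near the river bank"
    print(looks_like_looping(looped))
    print(looks_like_looping(normal))
```

A score like this could be tracked as steering strength increases to see where outputs start to loop; the threshold would need tuning against real generations.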

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Kimi K2.5
    associated_with
    One of the two primary target models studied for emotion feature persistence and self-evaluation

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.