concept
active
concept:feature-looping

Feature Looping

A repetitive behavioral pattern observed at high steering strengths in sparse autoencoder (SAE) feature self-steering experiments.

Neighborhood — ranked by edge-count

Methods (1)

method

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Feature Sparsity · concept · 0.774
    Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work
  • Method of optimizing input to cause a neuron to fire maximally, used to characterize what a neuron detects; establishes causal link
  • Feedback Loops · concept · 0.738
    Key dimension analyzing how wide the gulfs of execution and evaluation are in a system and how they relate; uses concepts from The Design of Everyday Things.
  • Metaphor treating each system feature or function as a separate application that can be independently loaded and managed.
  • Property of features that form consistently across different models trained on the same or similar data, suggesting features are real representational units
  • Feature splitting · concept · 0.730
    Phenomenon where a feature in a small SAE splits into multiple finer features in a larger SAE.
  • Second of three speculative claims asserting that subgraphs of neural networks are tractable and meaningful objects of study
  • Homeostatic Loopsconcept0.721
    Feedback mechanisms at cellular and tissue levels driving error correction toward target states.
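
The selection rule for this list (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names `cosine` and `similar_without_edge`, the toy two-dimensional embeddings, and the `typed_edges` mapping are hypothetical and do not reflect how this graph is actually stored or queried. Vectors are assumed to be non-zero.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """List (name, score) pairs whose similarity to `target` meets the
    threshold and which have no typed edge to `target`, best match first."""
    linked = typed_edges.get(target, set())
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and already-linked entities
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, round(score, 3)))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data (hypothetical): "d" is similar to "a" but already has a typed edge.
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.5], "c": [0.0, 1.0], "d": [1.0, 0.1]}
typed_edges = {"a": {"d"}}
candidates = similar_without_edge("a", embeddings, typed_edges)
```

In this toy case only "b" is returned: "c" falls below the threshold and "d" is excluded because it already has a typed relation, which mirrors the "candidates for new edges or unrecognized duplicates" reading of the list.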