finding
active
finding:four-features-a-0-20-a-0-0-a-0-30-a-0-494-form-an-fsa-like-system-implementing-html-tag-generation
Four features (A/0/20, A/0/0, A/0/30, A/0/494) form an FSA-like system implementing HTML tag generation
Concrete example of features connecting into an FSA-like system that implements complex behavior
Source paper
extracted_from (2024) · Marc Carauleanu · Michael Vaiana · Judd Rosenblatt · Cameron Berg +1
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Collections of features that interact via the token stream: one feature increases the probability of tokens that activate the next feature, so the collection behaves like a finite-state automaton (FSA)
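
The chaining described in the concept above can be sketched as a toy program. This is an illustration under loud assumptions, not the mechanism measured in the source: the feature names, the tokens they fire on, and the tokens they promote are all hypothetical and are not mapped to A/0/20, A/0/0, A/0/30, or A/0/494. Greedily emitting the promoted token also stands in for sampling from a real model's output distribution.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature: fires on some tokens, promotes one next token."""
    name: str
    fires_on: frozenset  # tokens in the stream that activate this feature
    promotes: str        # token whose probability this feature raises


# Assumed roles, chosen only to make the chain easy to follow.
FEATURES = [
    Feature("open_bracket", frozenset({"<"}), "div"),
    Feature("tag_name", frozenset({"div"}), ">"),
    Feature("close_bracket", frozenset({">"}), "</"),
    Feature("closing_open", frozenset({"</"}), "div"),
]


def next_token(token: str) -> Optional[str]:
    """Return the token promoted by the first feature that fires on `token`."""
    for feature in FEATURES:
        if token in feature.fires_on:
            return feature.promotes
    return None  # no feature fires, so the chain stops


def generate(start: str, steps: int) -> List[str]:
    """Follow the feature chain through the token stream for up to `steps` tokens."""
    tokens = [start]
    for _ in range(steps):
        promoted = next_token(tokens[-1])
        if promoted is None:
            break
        tokens.append(promoted)
    return tokens


print(generate("<", 4))
```

Each feature only reads the most recent token and only writes by shifting which token comes next, so the state of the automaton lives entirely in the token stream rather than in any connection between the features themselves.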
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Concrete example of feature splitting revealing unexpected model structure
- Demonstrates mechanistic memorization via feature assemblies in superposition
- Vision of the emerging paradigm shift in society.
- Shows a general code error detector beyond simple typo detection.
- Demonstrates prevalence of token-in-context features and feature splitting of common tokens
- Features respond to concepts across languages and in images, not just text.
- Clamping a feature's value to zero to measure its causal effect on model output.
- Case study demonstrating mechanism behind flat harness-updating: smaller models reach same procedural content