finding
active
finding:feature-1m-697189-activates-on-names-of-functions-that-implement-addition-including-through-composition-but-not-on-multiplication-functions
Feature 1M/697189 activates on names of functions that implement addition, including through composition, but not on multiplication functions.
Feature represents the 'addition' function abstractly.
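As a rough illustration of the distinction the finding draws, the sketch below defines one function that adds directly, one that adds only by composing calls to the first, and one that multiplies. All function names here are hypothetical and are not taken from the source paper; the sketch makes no claim about how strongly the feature would respond to these exact names.

```python
def add(a, b):
    # Direct addition: the kind of function name the finding says the feature fires on.
    return a + b


def add_three(a, b, c):
    # Addition through composition: no "+" appears here, only calls to add().
    return add(add(a, b), c)


def multiply(a, b):
    # Multiplication: per the finding, the feature does not fire on such names.
    return a * b
```

The point of the second function is that its body contains no addition operator at all, so a feature that responds to its name would have to track what the function computes rather than its surface syntax.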
Source paper
extracted_from
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Shows a general code error detector beyond simple typo detection.
- Verbatim summary of the main discovery.
- Causal effect showing the feature governs computation.
- The specific computational question the paper resolves empirically
- Llama-3.1-8B uses base-10 addition rather than modular addition to compute cyclic concept sums · finding · 0.736 · The central empirical finding that computation does not mirror the circular representational structure
- Universality of Hebrew script feature across two transformers
- Clamping a feature's value to zero to measure its causal effect on model output.
- Intervention method that adds a learned direction vector to residual stream activations to steer model behavior
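The last two entries describe interventions that can be sketched in a few lines. The code below is a minimal sketch, not the source paper's implementation. It assumes a single feature with a linear encoder (`w_enc`, `b_enc`), a ReLU activation, and a linear decoder direction (`d_dec`). All of these names, and the scaling parameter `alpha`, are hypothetical.

```python
import numpy as np


def feature_activation(x, w_enc, b_enc):
    # Assumed encoder form: ReLU of a linear readout of the residual vector x.
    return max(0.0, float(w_enc @ x + b_enc))


def ablate_feature(x, w_enc, b_enc, d_dec):
    # Clamp the feature to zero by subtracting its current contribution
    # (activation times decoder direction) from the residual vector.
    a = feature_activation(x, w_enc, b_enc)
    return x - a * d_dec


def steer(x, direction, alpha):
    # Steering: add a scaled direction vector to the residual vector.
    return x + alpha * direction
```

Comparing the model's output on `x` with its output on the ablated or steered vector is what would give a causal estimate; that downstream forward pass is omitted here.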