finding

active

finding:a-sparse-set-of-28-mlp-neurons-at-layer-18-0-2-of-mlp-are-reused-across-all-cyclic-tasks

A sparse set of 28 MLP neurons at layer 18 (~0.2% of MLP) are reused across all cyclic tasks

Quantitative finding identifying the specific neurons responsible for generic addition

Source paper

extracted_from

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

(2026) · Sheridan Feucht · Tal Haklay · Usha Bhalla · Daniel Wurgaft +8

Neighborhood — ranked by edge-count

Claims (1)

claim

Approximately 0.2% of MLP neurons at layer 18 (~28 neurons) are sufficient to account for the generic addition computation across all cyclic tasks
supports
Claim about the sparsity and sufficiency of the identified neuron set
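
The "~28 neurons is about 0.2%" figure in the claim above can be sanity-checked with a line of arithmetic. This is a minimal sketch, not taken from the paper: the record does not state the MLP width, so `ASSUMED_MLP_WIDTH = 14336` (the intermediate size of Llama-3-8B-class models) is an assumption, and `fraction_of_layer` is a hypothetical helper name.

```python
# Assumption: 14336 is the MLP intermediate width of Llama-3-8B-class models.
# The record itself does not state the width of the layer-18 MLP.
ASSUMED_MLP_WIDTH = 14336
NEURON_COUNT = 28  # neuron count stated in the finding


def fraction_of_layer(neuron_count: int, mlp_width: int) -> float:
    """Return the share of one MLP layer's neurons covered by a neuron set."""
    if mlp_width <= 0:
        raise ValueError("mlp_width must be positive")
    return neuron_count / mlp_width


share = fraction_of_layer(NEURON_COUNT, ASSUMED_MLP_WIDTH)
print(f"{share:.4%}")  # prints 0.1953%
```

Under that assumed width the share is exactly 1/512, which rounds to the 0.2% quoted in the claim.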

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

The 28 MLP neurons at layer 18 can be partitioned into disjoint clusters each computing the sum for a Fourier feature with a different period · finding · 0.870
Structural finding showing modular organization within the sparse neuron set
MLP layers are much harder to get traction on than attention layers; understanding them requires individually interpretable neurons which are rarely found · claim · 0.782
Key limitation of the paper's approach; MLP layers make up 2/3 of standard transformer parameters
Sparse low-cardinality circuits implement competence; 0.2% of neurons handle shared computation across all cyclic tasks. · claim · 0.781
512-neuron MLP continues to yield new features as autoencoder scales to 131,072 features (256× expansion) · finding · 0.769
Shows superposition enables many more features than neurons
Some MLP neurons and attention heads perform memory management by reading residual stream information and writing its negative to delete it · claim · 0.763
Hypothesis based on observed negative cosine similarity between input and output weights of some neurons
Multi-layer Perceptron (MLP) · method · 0.760
Feed-forward neural network with hidden layers, capable of representing non-linearly separable functions.
MLP neurons · concept · 0.746
The sparse set of 28 neurons at layer 18 identified as responsible for Fourier feature computation across all cyclic tasks
82% of features in 1M SAE had maximum Pearson correlation ≤0.3 with any MLP neuron, and manual inspection showed no semantic resemblance. · finding · 0.744
SAE features are not simply mirroring individual neurons.
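
The selection rule for this similarity neighborhood (cosine ≥ 0.65, no typed edge to the current entity) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the system's actual implementation: the function names, the toy embeddings, and the representation of typed edges as unordered ID pairs are all hypothetical.

```python
import math
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_untyped(
    target_id: str,
    embeddings: Dict[str, Sequence[float]],
    typed_edges: Set[FrozenSet[str]],
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """List entities at or above the threshold that share no typed edge with the target."""
    target = embeddings[target_id]
    hits = []
    for other_id, vector in embeddings.items():
        if other_id == target_id:
            continue  # skip the entity itself
        if frozenset((target_id, other_id)) in typed_edges:
            continue  # already linked by a typed relation such as "supports"
        score = cosine(target, vector)
        if score >= threshold:
            hits.append((other_id, score))
    # Highest similarity first, matching the descending order of the list above.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data (hypothetical): "claim" is excluded because it has a typed edge,
# "far" is excluded because its similarity falls below the threshold.
embeddings = {
    "finding": [1.0, 0.0],
    "related": [0.9, 0.1],
    "claim": [1.0, 0.05],
    "far": [0.0, 1.0],
}
typed_edges = {frozenset(("finding", "claim"))}
result = similar_untyped("finding", embeddings, typed_edges)
print([entity_id for entity_id, _ in result])
```

Excluding typed edges before thresholding is what makes the remaining entries candidates for new edges or unrecognized duplicates, as the neighborhood description states.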