Fourier features

Periodic features identified in Llama-3.1-8B that compute sums using periods matched to base-10 addition (2, 5, 10) rather than the period of the specific cyclic concept

Neighborhood — ranked by edge-count

method

Fourier analysis of neural activations
implements
Method used to identify the periodic features and their periods in Llama-3.1-8B's MLP neurons
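
As a rough illustration of this kind of analysis, the sketch below takes one neuron's activations over consecutive integer inputs, applies a discrete Fourier transform, and reports the period of the strongest non-constant component. The function name `dominant_period` and the synthetic activation vector are assumptions made for illustration. Nothing here is taken from the actual analysis pipeline, and real activations would have to be recorded from the model.

```python
import numpy as np


def dominant_period(activations):
    """Return the period (in input steps) of the strongest non-DC Fourier component.

    Hypothetical helper: `activations[i]` is assumed to be one neuron's
    activation when the input integer is i.
    """
    x = np.asarray(activations, dtype=float)
    x = x - x.mean()                        # remove the constant (DC) offset
    spectrum = np.abs(np.fft.rfft(x))       # magnitude of each frequency bin
    freqs = np.fft.rfftfreq(len(x), d=1.0)  # frequencies in cycles per input step
    k = int(np.argmax(spectrum[1:])) + 1    # skip bin 0, which is the DC term
    return 1.0 / freqs[k]


# Synthetic stand-in for a recorded neuron: a strong period-10 component
# plus a weaker period-5 component (assumed data, not a measurement).
n = np.arange(100)
synthetic = np.cos(2 * np.pi * n / 10) + 0.1 * np.cos(2 * np.pi * n / 5)
period = dominant_period(synthetic)
print(period)
```

The input range is chosen as a multiple of the candidate periods so that each period falls on an exact frequency bin; with other lengths the peak would be smeared across neighboring bins and the estimate would only be approximate.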

concept

Fourier Features for Numerical Computation
related_to
Generic addition mechanism
implements
The core finding: Llama-3.1-8B reuses one addition mechanism across all cyclic tasks rather than learning task-specific modular arithmetic
MLP neurons
implements
The sparse set of 28 neurons at layer 18 identified as responsible for Fourier feature computation across all cyclic tasks

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Geometry of features · concept · 0.799
Research thread within About Blank concerning the structure and relational properties of neural network feature representations; covariance pooling tangentially supports this thread.
The Fourier feature periods (2, 5, 10) respect standard base-10 addition structure rather than cyclic concept periodicity · claim · 0.784
Mechanistic claim linking identified Fourier features to base-10 arithmetic
Fourier features with period 10 contribute to base-10 sum computation in the 28-neuron cluster · finding · 0.746
One of the three base-10 Fourier periods identified in the sparse neuron set
atomic features · concept · 0.743
The idea that interpretability should decompose representations into minimal, indivisible feature units; contrasted with manifold-level descriptions.
Feature Sparsity · concept · 0.741
Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work
Linear Representation of Features · concept · 0.735
The central object of study: the idea that a concept like truth is encoded as a direction in the LLM's latent space
Feature Manifolds · framework · 0.734
Hypothesized extension of superposition where features may be higher-dimensional manifolds rather than 1D directions
Feature Density Histogram · method · 0.728
Log-scale histogram of feature firing rates used as proxy for autoencoder quality during hyperparameter tuning
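
A neighbor list like the one above could be produced by a simple similarity filter. The sketch below is one possible version, assuming each entity has an embedding vector and that the entities already linked by a typed edge are known. The function name `untyped_neighbors`, the toy vectors, and the entity names are all hypothetical, and the embedding model behind the scores on this page is not specified here.

```python
import numpy as np


def untyped_neighbors(query, embeddings, typed, threshold=0.65):
    """Return (name, cosine) pairs at or above `threshold` that have no typed edge.

    Hypothetical helper: `embeddings` maps entity name to vector, and `typed`
    is the set of entity names already linked to the query by a typed edge.
    """
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    hits = []
    for name, vec in embeddings.items():
        if name in typed:
            continue                          # already has a typed relation
        v = np.asarray(vec, dtype=float)
        score = float(np.dot(q, v / np.linalg.norm(v)))
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first, matching the ordering of the list above
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy 2-D embeddings (assumed values, for illustration only)
toy = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [1.0, 1.0]}
candidates = untyped_neighbors([1.0, 0.0], toy, typed={"a"})
print(candidates)
```

In this toy run, "a" is skipped because it already has a typed edge and "b" falls below the threshold, so only "c" remains as a candidate.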