layer 40 residual-stream activations

The specific neural network layer from which activations are extracted for probe construction and SAE training in the target models

Neighborhood — ranked by edge-count

method

Emotion probes (171-emotion residual vector probes)
associated_with
Linear probes constructed to measure 171 emotion concepts in model activations with surface semantic content removed
Ridge Regression Probing
about
Ridge regression fit on top-256 PCs of Gemini embeddings to predict model layer-40 activations and compute residuals
Sparse Autoencoder Training on Layer-40 Activations
about
SAEs trained on 100M+ tokens to encode each token's layer-40 activations as 64 active features out of 100K+, for interpretability analysis
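
The sparsity constraint in the SAE entry above (64 active features out of 100K+) can be illustrated with a small forward pass. This is a minimal sketch, not the training setup behind this entry: it assumes a top-k activation rule, which is only one way to get a fixed number of active features, and it uses random untrained weights and toy dimensions. The function name and all parameter names are hypothetical.

```python
import numpy as np

def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k=64):
    """Encode activations with at most k active features per token, then decode.

    Assumption: sparsity is enforced by keeping the k largest ReLU
    pre-activations per row. The source does not say which mechanism was used.
    """
    # Encoder pre-activations, shape (n_tokens, n_features).
    pre = np.maximum((x - b_dec) @ w_enc + b_enc, 0.0)
    # Column indices of the k largest entries in each row (order within the k is arbitrary).
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    codes = np.zeros_like(pre)
    codes[rows, idx] = pre[rows, idx]
    # Linear decoder maps the sparse code back to activation space.
    recon = codes @ w_dec + b_dec
    return codes, recon

# Toy sizes for illustration; the entry describes 100K+ features at layer 40.
rng = np.random.default_rng(0)
d_model, n_features = 32, 512
x = rng.normal(size=(5, d_model))
w_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
w_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)
codes, recon = topk_sae_forward(
    x, w_enc, np.zeros(n_features), w_dec, np.zeros(d_model), k=64
)
```

Each row of `codes` has at most 64 nonzero entries, so a token is described by a short list of feature indices and magnitudes rather than a dense vector.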

concept

Residual Stream Activation
related_to
The intermediate representations in transformer layers whose activations are patched and probed for truth information

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Residual Stream Activation Patching · method · 0.867
Used to localize causally implicated hidden states by swapping activations between true and false inputs
Residual Stream Patching · method · 0.798
Technique to localize causally implicated hidden states by swapping residual stream activations between a true and false input and measuring downstream log-probability changes
Residual Stream · concept · 0.795
Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.
Residual Stream Bandwidth · concept · 0.788
The finite dimensional capacity of the residual stream for storing and communicating information between layers; conceptualized as being under high demand
Residual-Stream Injection · concept · 0.778
Core activation intervention: add scaled vector to residual stream at layer l during completion
The residual stream has a deeply linear structure, enabling virtual weights and path expansion analysis · claim · 0.751
Architectural observation enabling the entire mathematical framework; the residual stream is purely a sum of linear projections
residual stream recovery tracking · method · 0.747
Tracks cosine similarity, norm ratio, and injection direction projection across layers to measure recovery from perturbation
Residual Activation Vectors · concept · 0.746
Layer-40 activations with the component explained by compressed Gemini embeddings subtracted, isolating information not driven by surface text content
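
The residual construction described in the last entry (and in the Ridge Regression Probing entry) can be sketched as follows. This is a minimal sketch under stated assumptions: PCA is done by SVD on centered embeddings, the ridge fit uses the closed-form solution, and the penalty `alpha` is a placeholder because the source does not give a regularization strength. The function name and its parameters are hypothetical.

```python
import numpy as np

def residual_activations(embeddings, activations, n_components=256, alpha=1.0):
    """Remove the part of `activations` that a ridge fit on the top
    principal components of `embeddings` can predict.

    embeddings:  (n_samples, d_embed) text embeddings
    activations: (n_samples, d_model) layer activations
    Returns residuals with shape (n_samples, d_model).
    """
    # Center both sides so the intercept is absorbed into the means.
    emb_centered = embeddings - embeddings.mean(axis=0)
    act_centered = activations - activations.mean(axis=0)

    # PCA via SVD: rows of vt are principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(emb_centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    pcs = emb_centered @ vt[:k].T  # (n_samples, k) component scores

    # Closed-form ridge: W = (Z^T Z + alpha * I)^{-1} Z^T Y.
    gram = pcs.T @ pcs + alpha * np.eye(k)
    weights = np.linalg.solve(gram, pcs.T @ act_centered)

    # Residual = what the compressed embeddings do not explain.
    return act_centered - pcs @ weights
```

Probes fit on these residuals then see only the variation in the activations that is not linearly predictable from the compressed embeddings, which is the sense in which surface text content is removed.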