Bottleneck Activation

A lower-dimensional activation that sits between higher-dimensional activations and is the only pathway for information between them; e.g., the residual stream between MLP layers.
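
The definition can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the formulation from the source paper: the widths (`d_model = 8`, a 4x MLP ratio), the random weights `W_out` and `W_in`, and the omission of nonlinearities, attention, and other writes to the residual stream are all simplifications chosen for the example. It shows that a linear map between two wide activations, when routed through a narrow one, has rank limited by the narrow width.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8          # bottleneck (residual stream) width; illustrative value
d_mlp = 4 * d_model  # MLP hidden width; the 4x ratio is an assumption

# Hypothetical weights: one MLP writes into the residual stream,
# and the next MLP reads from it.
W_out = rng.standard_normal((d_mlp, d_model))  # MLP hidden -> residual stream
W_in = rng.standard_normal((d_model, d_mlp))   # residual stream -> next MLP hidden

h1 = rng.standard_normal(d_mlp)  # high-dimensional activation
resid = h1 @ W_out               # bottleneck activation (d_model values)
h2_pre = resid @ W_in            # next high-dimensional pre-activation

# The end-to-end map is d_mlp x d_mlp, but it factors through d_model
# dimensions, so its rank cannot exceed d_model.
composed = W_out @ W_in
rank = np.linalg.matrix_rank(composed)

print(composed.shape, rank)
```

Although `composed` is a 32 x 32 matrix in this setup, everything `h2_pre` can learn about `h1` has to pass through the 8 numbers in `resid`.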

Neighborhood — ranked by edge-count

Papers (1)

paper

A Mathematical Framework for Transformer Circuits
introduces

Concepts (1)

concept

Residual Stream
associated_with
Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Activations · concept · 0.786
Internal representations of the model on which probes operate; the method uses activations to rank datapoints.
Information Bottleneck · concept · 0.779
Compression-prediction trade-off; NIS encodes micro-states through an information bottleneck.
Developmental Bottleneck · concept · 0.765
Population structure mechanism implementing genetic assortment; cited as example of individuation mechanism in multicellularity.
Activation Capping · method · 0.751
Clamping activations along the Assistant Axis to remain above a minimum threshold (25th percentile), introduced as a stabilization method.
Softmax Bottleneck · concept · 0.749
Failure mode for output-surjectivity: LLMs may lack capacity to predict all tokens due to rank constraints.
Activation patching · method · 0.744
Standard method in mechanistic interpretability that intervenes on activations; VPD flips this paradigm by patching parameters.
Information bottleneck in workspace · concept · 0.742
Limited capacity of the workspace relative to the sum of module capacities.
Activation decomposition · concept · 0.733
The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.