concept
active
concept:attention-based-signal-routing
attention-based signal routing
Mechanism by which attention heads detect injected perturbations and route information about them to the final token position
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- This paper's proposed mechanistic explanation integrating signal injection, attention routing, predictive integration, and residual recovery
Methods (1)
method
- attention head localization analysis · implements · Analysis measuring whether each attention head's maximum attention increase points to the correct injected sentence
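The localization analysis described above can be sketched roughly as follows. This is a minimal illustration rather than the paper's implementation: the function names (`head_localization`, `localization_rate`), the data layout (one row of final-token attention weights per head), and the use of a per-position argmax are all assumptions made for the example.

```python
def head_localization(clean_attn, perturbed_attn, injected_span):
    """Return one bool per head: does the head's largest attention increase
    fall inside the injected sentence?

    clean_attn / perturbed_attn: one row per head, each row holding the
    final token's attention weights over source positions (assumed layout).
    injected_span: half-open (start, end) token range of the injected sentence.
    """
    start, end = injected_span
    hits = []
    for clean_row, perturbed_row in zip(clean_attn, perturbed_attn):
        # Per-position change in attention caused by the perturbation.
        deltas = [p - c for p, c in zip(perturbed_row, clean_row)]
        # Position where this head's attention increased the most.
        peak = max(range(len(deltas)), key=deltas.__getitem__)
        hits.append(start <= peak < end)
    return hits


def localization_rate(hits):
    """Fraction of heads whose peak increase lands on the injected sentence."""
    return sum(hits) / len(hits)


# Toy data (hypothetical): 2 heads, 4 source positions,
# injected sentence occupying positions 1 and 2.
clean = [[0.25, 0.25, 0.25, 0.25], [0.1, 0.2, 0.3, 0.4]]
perturbed = [[0.1, 0.6, 0.2, 0.1], [0.1, 0.1, 0.3, 0.5]]
hits = head_localization(clean, perturbed, (1, 3))
print(hits, localization_rate(hits))
```

In the toy data the first head's largest increase falls on position 1, inside the injected span, while the second head's falls on position 3, outside it. A variant that sums attention over each sentence before taking the maximum would be equally consistent with the description.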
Concepts (1)
concept
- predictive integration · associated_with · The mid-to-late layer computational process that converts routed perturbation signals into explicit predictions
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A pair of query and key subcomponents distributed across attention heads performs syntax-boundary routing · finding · 0.772 · VPD recovers an attention algorithm for routing across syntactic boundaries, distributed across heads.
- One of two interpretable pathways in the subnetwork for predicting 'her', routing a 'femaleness' signal from 'princess' forward through attention.
- Identification of algorithms implemented in attention layers, distributed across attention heads · finding · 0.745 · VPD successfully recovered interpretable attention algorithms (previous-token behavior, syntax-boundary routing) in weight space without requiring manual decomposition across heads.
- Core abstraction in Fruit model: a function from continuous time to a value; foundation for reactive programming in Fruit.
- Janus's interpretive model for how attention mechanisms enable deliberate information flow and selective routing.
- Process using Q, K, V to compute a heat map over K and weighted sum of V.
- Claim supported by VPD's recovery of cross-head attention subcomponents, noted in footnote.
- Attention patterns scaled by the norm of the value vector at each source position, showing how large a vector is moved from each position
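The last few entries above describe an attention heat map computed from Q and K, and a variant in which that pattern is scaled by the norm of each source position's value vector. A minimal pure-Python sketch of both follows, assuming standard scaled dot-product attention (the `1/sqrt(d)` scaling and the absence of a causal mask are assumptions, and the function names are hypothetical).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pattern(queries, keys):
    """Heat map: one row per query position, one column per key position."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


def value_weighted_pattern(pattern, values):
    """Scale each attention weight by the norm of the source value vector,
    indicating how large a vector is moved from each source position."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in values]
    return [[a * n for a, n in zip(row, norms)] for row in pattern]


# Toy example (hypothetical): one query, two source positions.
pattern = attention_pattern([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
weighted = value_weighted_pattern(pattern, [[3.0, 4.0], [0.0, 0.0]])
print(pattern, weighted)
```

Here the raw pattern attends equally to both positions, but the value-weighted pattern shows that only the first position contributes, because the second value vector has zero norm.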