concept
active
concept:virtual-weights
Virtual Weights
Implicit weights that directly connect any pair of layers, computed by multiplying the output weights of one layer by the input weights of another. The two layers communicate only through the residual stream.
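A minimal sketch of the definition above, using plain Python lists. The matrix names (`W_out_early`, `W_in_late`), their shapes, and all values are illustrative assumptions, not taken from any real model. It shows that reading what an earlier layer wrote to the residual stream gives the same result as applying the single product matrix directly.

```python
# Illustrative sketch only: shapes and values are made up for the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

# Earlier layer: writes a 2-dim activation into a 3-dim residual stream.
W_out_early = [[1.0, 0.0],
               [0.0, 2.0],
               [1.0, 1.0]]        # shape (d_model=3, d_hidden=2)

# Later layer: reads a 2-dim input from the 3-dim residual stream.
W_in_late = [[1.0, 0.0, -1.0],
             [0.0, 0.5, 0.0]]     # shape (d_in=2, d_model=3)

# Virtual weights: input weights of the later layer times
# output weights of the earlier layer.
W_virtual = matmul(W_in_late, W_out_early)   # shape (2, 2)

h = [3.0, 4.0]                               # activation in the earlier layer
via_stream = matvec(W_in_late, matvec(W_out_early, h))
via_virtual = matvec(W_virtual, h)

print(W_virtual)
print(via_stream, via_virtual)
```

Because the residual stream is a sum of such writes, `W_virtual` accounts only for the part of the later layer's input that originates from this one earlier layer; other layers contribute their own terms.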
Neighborhood — ranked by edge-count
Claims (1)
claim
- Architectural observation enabling the entire mathematical framework; the residual stream is purely a sum of linear projections
Concepts (1)
concept
- Residual Stream · associated_with · Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A high-level control node (e.g., calcium pattern, resting potential) that is a complex function of many low-level components, not 1:1 mappable to genes.
- A computing layer above physical hardware; Sloman uses this as an analogy for mental processes not reducible to physics.
- Coefficient weighting each task loss in the MTL objective.
- Editing network weights to test predictions about circuit function; proposed as falsifiability test for circuit claims
- Logit weight contributions from a feature that arise due to superposition with other features, not from the feature's own causal role
- In attention, value vectors that carry the information future positions should receive.
- Load-bearing claim about the tractability of circuit analysis; central thesis of Claim 2
- Baseline MTL approach minimizing sum of task losses with equal weights; suffers from task balancing