concept
archived
concept:layer-18
Layer 18
Specific transformer layer housing the addition module.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Layers with weak anchoring due to generic representations.
- Layer normalisation used in transformer and in TEM-t position encoding preprocessing.
- Task-specific peak anchoring score for structured reasoning domains.
- Layers where anchoring weakens systematically due to representational drift.
- Procedure of systematically varying the layer at which activations are recorded and injected.
- Network with hidden layers capable of representing non-linearly separable functions, enabling deep model induction.
- A key interface exploited by evolution to accomplish morphogenesis; cells perform computations via ion channel voltage dynamics; enables integration of information across scales toward large-scale morphogenetic goals.
- A transformer with no attention layers; shown to model bigram statistics via T = W_U W_E.