concept
active
concept:token-embeddings
Token embeddings
Vector representations of individual tokens from genomic foundation models; the raw inputs to sequence pooling methods.
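As a rough illustration of what "raw inputs to sequence pooling" means, the sketch below pools a small matrix of token embeddings (one row per token) in two ways: mean pooling, and a generic covariance pooling that keeps the upper triangle of the feature covariance matrix. The function names, the toy data, and the unbiased (n - 1) normalization are assumptions made for this example; the formulation used in the referenced research post is not given here and may differ.

```python
from typing import List

Vector = List[float]


def mean_pool(tokens: List[Vector]) -> Vector:
    """Average the token embeddings over the sequence axis."""
    n = len(tokens)
    dim = len(tokens[0])
    return [sum(t[j] for t in tokens) / n for j in range(dim)]


def covariance_pool(tokens: List[Vector]) -> Vector:
    """Flatten the upper triangle of the feature covariance matrix.

    Assumption: unbiased (n - 1) normalization; other variants exist.
    """
    n = len(tokens)
    dim = len(tokens[0])
    mu = mean_pool(tokens)
    # Center each token embedding around the sequence mean.
    centered = [[t[j] - mu[j] for j in range(dim)] for t in tokens]
    denom = max(n - 1, 1)
    features = []
    for i in range(dim):
        for j in range(i, dim):
            features.append(sum(c[i] * c[j] for c in centered) / denom)
    return features


# Toy sequence: 3 tokens, each with a 2-dimensional embedding (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mean_vec = mean_pool(tokens)       # length dim
cov_vec = covariance_pool(tokens)  # length dim * (dim + 1) / 2
```

Mean pooling keeps only the average direction of the sequence, while the covariance features also record how embedding dimensions vary together across tokens, at the cost of a pooled vector whose length grows quadratically with the embedding dimension.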
Neighborhood — ranked by edge-count
Artifacts (1)
artifact
- Goodfire research post introducing covariance pooling as a replacement for mean pooling in genomic foundation models; shows +52.9% R² lift on genomic track prediction.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Basic unit of LLM input/output: words, parts of words, punctuation marks, emojis
- The specific type of representation studied in the paper: function f: X→R^n assigning feature vectors to inputs
- Technique where text is nested hierarchically within another, using indentation and margins to create subordinate orders of detail within an overarching embrace.
- Lagged time series used to capture dynamical dependencies.
- A reinforcing interlock between different materials, mentioned alongside Deep Interlock in West Dean construction.
- Preprocessing step using dev-set covariance to standardize span embeddings before computing S
- PCA applied to token embedding and unembedding matrices to understand what fraction of residual stream dimensions they occupy and how they relate
- The component used in latent reasoning to perform internal computation.