span embedding analysis

Extracting embeddings from instruction and example spans.

Neighborhood — ranked by edge-count

method

span embeddings extraction
related_to · same_as
Obtain instruction and example span embeddings at layer L* with chosen pooling.
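The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the source's implementation: it assumes the hidden states at the chosen layer are already available as one vector per token, and the function name `pool_span`, the pooling options, and the toy values are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def pool_span(hidden: Sequence[Sequence[float]], start: int, end: int,
              pooling: str = "mean") -> Vector:
    """Pool the token vectors hidden[start:end] (end exclusive) into one span vector.

    `hidden` is assumed to hold one vector per token, taken from a single layer.
    """
    if not 0 <= start < end <= len(hidden):
        raise ValueError("span must be non-empty and lie inside the sequence")
    tokens = hidden[start:end]
    dims = range(len(tokens[0]))
    if pooling == "mean":
        # Average each dimension over the tokens in the span.
        return [sum(t[d] for t in tokens) / len(tokens) for d in dims]
    if pooling == "max":
        # Take the per-dimension maximum over the tokens in the span.
        return [max(t[d] for t in tokens) for d in dims]
    if pooling == "last":
        # Use the final token of the span as its representation.
        return list(tokens[-1])
    raise ValueError(f"unknown pooling: {pooling}")


# Toy hidden states for a 4-token sequence at one layer (hypothetical values).
layer_hidden = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]

# Assume tokens 0-1 form the instruction span and tokens 2-3 the example span.
instruction_vec = pool_span(layer_hidden, 0, 2, "mean")
example_vec = pool_span(layer_hidden, 2, 4, "max")
```

Which pooling is appropriate depends on the analysis; the three options here are common choices and are not taken from the source.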

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Span Representation Analysis · framework · 0.834
Framework for characterizing span-level information of sequences of representations, independent of any consciousness estimate; used as a comparison baseline.
Whitening of span embeddings · method · 0.787
Preprocessing step that uses dev-set covariance to standardize embedding scales before computing ρd and dr.
UMAP Embedding of Features · method · 0.756
2D embedding of feature direction vectors used to visualize feature clusters and splitting geometry.
Vector Embedding Representation · concept · 0.755
The specific type of representation studied in the paper: a function f: X→R^n assigning feature vectors to inputs.
PCA Analysis of Token Embeddings/Unembeddings · method · 0.755
PCA applied to token embedding and unembedding matrices to understand what fraction of residual stream dimensions they occupy and how they relate.
Temporal embedding · method · 0.753
Lagged time series used to capture dynamical dependencies.
Peters et al. 2018 Span Representation Method · method · 0.752
Method concatenating boundary token vectors, their element-wise product, and difference to form span-level representations from (C)ARR.
Input Embedding Similarity Baseline · method · 0.740
Baseline method for instruction discovery using surface-level input embedding similarity instead of steering vectors.
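A neighbor list like the one above could be produced by a filter along these lines. This is a sketch under stated assumptions: the 0.65 cosine threshold and the "no typed edge" condition come from the page, while the function names, the data layout (a name-to-vector dictionary plus a set of already-linked names), and the toy vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(query: str,
                      embeddings: Dict[str, Sequence[float]],
                      typed_edges: Set[str],
                      threshold: float = 0.65) -> List[Tuple[str, float]]:
    """Entities with cosine >= threshold to `query` and no typed edge to it,
    sorted by descending similarity."""
    q = embeddings[query]
    hits = []
    for name, vec in embeddings.items():
        # Skip the query itself and anything already linked by a typed edge.
        if name == query or name in typed_edges:
            continue
        score = cosine(q, vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy entity embeddings (hypothetical names and values).
toy_embeddings = {
    "query": [1.0, 0.0],
    "a": [1.0, 0.5],
    "b": [0.0, 1.0],   # orthogonal to the query, falls below the threshold
    "c": [1.0, 0.1],   # similar, but already has a typed edge
    "d": [0.9, 0.1],
}
candidates = untyped_neighbors("query", toy_embeddings, typed_edges={"c"})
candidate_names = [name for name, _ in candidates]
```

Entities returned this way are only candidates: a high cosine score suggests a possible missing edge or duplicate, which still needs to be checked.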