method
active
method:ridge-regression-on-message-embeddings
Ridge Regression on Message Embeddings
Predicting Assistant Axis projections from L2-normalized Qwen 3 0.6B embeddings of user messages via ridge regression
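As a rough illustration of the description above, the following is a minimal sketch of ridge regression on L2-normalized embeddings, using NumPy only. It is not the original implementation: the random arrays stand in for real Qwen 3 0.6B message embeddings and Assistant Axis projections, and the dimension (16), the regularization strength `alpha`, the train/test split, and the helper names `l2_normalize`, `fit_ridge`, and `predict` are all assumptions made for the example.

```python
import numpy as np


def l2_normalize(X, eps=1e-12):
    """Scale each row of X to unit Euclidean norm."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept.

    Solves (Xc^T Xc + alpha * I) w = Xc^T yc on mean-centered data,
    then recovers the intercept b from the means.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    d = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def predict(X, w, b):
    """Apply the fitted linear map to new embeddings."""
    return X @ w + b


# Synthetic stand-ins (assumption): in practice E would hold one
# embedding per user message and proj the matching axis projection.
rng = np.random.default_rng(0)
E = l2_normalize(rng.normal(size=(200, 16)))
true_w = rng.normal(size=16)
proj = E @ true_w + 0.01 * rng.normal(size=200)

# Fit on the first 150 messages, evaluate on the remaining 50.
w, b = fit_ridge(E[:150], proj[:150], alpha=1e-3)
pred = predict(E[150:], w, b)

# Held-out coefficient of determination.
resid = proj[150:] - pred
total = proj[150:] - proj[150:].mean()
r2 = 1.0 - (resid @ resid) / (total @ total)
```

With real data, `alpha` would normally be chosen by cross-validation, and the held-out `r2` indicates how much of the projection is predictable from the user message embedding alone.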
Neighborhood — ranked by edge-count
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Ridge regression fit on top-256 PCs of Gemini embeddings to predict model layer-40 activations and compute residuals
- Method used to predict model activations from Gemini embeddings and compute residuals for probe construction
- Shows model persona position is primarily determined by the most recent user message, not prior drift
- Baseline method for instruction discovery using surface-level input embedding similarity instead of steering vectors.
- Algorithmic framework for probabilistic inference in graphical models.
- Extracting embeddings from instruction and example spans.
- Sparse Autoencoders Find Highly Interpretable Features in Language Models (Cunningham et al., 2023) · concept · 0.693 · Core methodology paper for SAE-based interpretable feature extraction
- The component used in latent reasoning to perform internal computation.