finding
active
finding:cosine-similarity-between-assistant-axis-and-role-pc1-is-0-60-at-all-layers-and-0-71-at-middle-layer-across-all-three-models

Cosine similarity between the Assistant Axis and role PC1 exceeds 0.60 at every layer and 0.71 at the middle layer in all three models

Validates that the contrast-vector method and the PCA-based PC1 capture closely aligned directions
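The comparison behind this finding can be illustrated with a minimal sketch on synthetic data. It is not the paper's pipeline. It assumes the contrast vector is the mean Assistant activation minus the mean role vector, and that PC1 is the first principal component of the mean-centered role vectors. The helper names (`contrast_axis`, `role_pc1`, `abs_cosine`), the array shapes, and the planted direction are all hypothetical. Because the sign of a principal component is arbitrary, the sketch reports the absolute cosine.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def contrast_axis(assistant_acts, role_vectors):
    """Assumed contrast vector: mean Assistant activation minus mean role vector."""
    return assistant_acts.mean(axis=0) - role_vectors.mean(axis=0)

def role_pc1(role_vectors):
    """First principal component of the mean-centered role vectors, via SVD."""
    centered = role_vectors - role_vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]  # rows of vt are unit-norm principal directions

def abs_cosine(a, b):
    """Absolute cosine similarity; PC signs are arbitrary, so the sign is dropped."""
    return abs(float(unit(a) @ unit(b)))

# Synthetic stand-in for one layer's activations (not real model data).
rng = np.random.default_rng(0)
dim, n_roles, n_assistant = 64, 200, 50
planted = unit(rng.normal(size=dim))   # hypothetical shared direction
base = rng.normal(size=dim)            # common offset for all activations

# Role vectors vary mostly along the planted direction, plus isotropic noise.
role_vectors = (
    base
    + 5.0 * rng.normal(size=(n_roles, 1)) * planted
    + 0.5 * rng.normal(size=(n_roles, dim))
)
# Assistant activations sit at one extreme of that direction.
assistant_acts = base + 8.0 * planted + 0.5 * rng.normal(size=(n_assistant, dim))

axis = contrast_axis(assistant_acts, role_vectors)
pc1 = role_pc1(role_vectors)
similarity = abs_cosine(axis, pc1)
print(f"|cos(contrast axis, PC1)| = {similarity:.3f}")
```

On real models, the same comparison would be repeated per layer and per model. The synthetic setup here plants a single dominant direction, so it yields a much cleaner alignment than the thresholds reported above.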

Source paper

extracted_from
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
(2026) · Christina Lu · Jack Gallagher · Jonathan Michala · Kyle Fish +1

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.