finding
active
Pairwise correlation of role loadings on PC1 exceeds 0.92 across all model pairs, indicating remarkably high similarity of the Assistant Axis across Gemma, Qwen, and Llama
Shows the leading component of persona space is model-universal
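As an illustration of the kind of comparison this finding describes, the sketch below computes pairwise correlations of PC1 role loadings across models. It is not the paper's code: the function names, the synthetic data, and the model labels are all hypothetical. It assumes each model supplies one vector per role for a shared, identically ordered set of roles. Hidden sizes may differ between models, which is why the per-role loadings are compared rather than the PC1 directions themselves.

```python
from itertools import combinations

import numpy as np


def pc1_role_loadings(role_vectors):
    """Return each role's projection onto the first principal component.

    role_vectors: array of shape (n_roles, hidden_dim), one row per role.
    """
    centered = role_vectors - role_vectors.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


def pairwise_pc1_correlations(role_vectors_by_model):
    """Pearson correlation of PC1 role loadings for every pair of models."""
    loadings = {
        name: pc1_role_loadings(vectors)
        for name, vectors in role_vectors_by_model.items()
    }
    result = {}
    for a, b in combinations(sorted(loadings), 2):
        r = np.corrcoef(loadings[a], loadings[b])[0, 1]
        # The sign of a principal component is arbitrary, so compare magnitudes.
        result[(a, b)] = abs(r)
    return result


def make_synthetic_models(n_roles=50, dims=(64, 96, 128), seed=0):
    """Hypothetical stand-in data: one shared latent score per role, embedded
    along a different random direction in each model, plus small noise."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=n_roles)
    models = {}
    for i, dim in enumerate(dims):
        direction = rng.normal(size=dim)
        direction /= np.linalg.norm(direction)
        noise = 0.1 * rng.normal(size=(n_roles, dim))
        models[f"model_{i}"] = 5.0 * np.outer(shared, direction) + noise
    return models


correlations = pairwise_pc1_correlations(make_synthetic_models())
for pair, value in correlations.items():
    print(pair, round(float(value), 3))
```

With real data, the synthetic generator would be replaced by per-model matrices of role vectors, and the reported threshold (0.92) would be checked against every entry of `correlations`.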
Source paper
extracted_from · (2026) · Christina Lu · Jack Gallagher · Jonathan Michala · Kyle Fish +1
Neighborhood — ranked by edge-count
Claims (1)
claim
- Primary empirical claim of the paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Characterizes model similarities and differences in secondary persona dimensions
- Validates that the contrast vector method and PCA-based PC1 capture the same direction
- Shows trait space has more cross-model consistency than role space beyond PC1
- Shows persona space axes are inherited from pre-training, not solely created by post-training
- We hypothesize that the PC1 axis of role space measures deviation from the Assistant persona · hypothesis · 0.812 · Motivates computing the contrast vector as the formal Assistant Axis definition
- Shows that deviation from Assistant persona predicts downstream harmful behavior
- Characterizes the trait content of the Assistant Axis in pre-trained models
- Demonstrates that persona space is low-dimensional