claim
active
claim:the-leading-component-of-the-persona-space-of-instruct-llms-is-an-assistant-axis-that-captures-the-extent-to-which-a-model-is-operating-in-its-default-assistant-mode
The leading component of the persona space of instruct LLMs is an 'Assistant Axis' that captures the extent to which a model is operating in its default Assistant mode
Primary empirical claim of the paper
Source paper
extracted_from (2026) · Christina Lu · Jack Gallagher · Jonathan Michala · Kyle Fish +1
Neighborhood — ranked by edge-count
Findings (3)
finding
- Validates that the contrast vector method and PCA-based PC1 capture the same direction
- Shows the leading component of persona space is model-universal
- Corroborates role space findings using traits; shows PC1 also captures Assistant-ness in trait space
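The first finding above, that the contrast vector and the PCA-based PC1 pick out the same direction, can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the helper names (`pc1`, `contrast_vector`, `abs_cosine`), the 64-dimensional toy activations, and the planted `true_axis` are hypothetical and are not taken from the source paper. It only shows how such an agreement check might be computed, using the absolute cosine because the sign of a principal component is arbitrary.

```python
import numpy as np


def pc1(X):
    """Return the unit-norm first principal component of the rows of X."""
    Xc = X - X.mean(axis=0, keepdims=True)  # center before PCA
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


def contrast_vector(pos, neg):
    """Unit-norm difference between the mean of `pos` and the mean of `neg`."""
    d = pos.mean(axis=0) - neg.mean(axis=0)
    return d / np.linalg.norm(d)


def abs_cosine(u, v):
    """Absolute cosine similarity (a principal component's sign is arbitrary)."""
    return abs(float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic stand-in for persona activations (assumption: NOT real model data).
rng = np.random.default_rng(0)
dim = 64
true_axis = rng.normal(size=dim)
true_axis /= np.linalg.norm(true_axis)

# Hypothetical role vectors: spread widely along one planted axis, plus small noise.
coords = rng.normal(scale=5.0, size=200)
role_vectors = coords[:, None] * true_axis + rng.normal(scale=0.5, size=(200, dim))

# Hypothetical default-Assistant samples: clustered at one end of that axis.
assistant_vectors = 10.0 * true_axis + rng.normal(scale=0.5, size=(20, dim))

similarity = abs_cosine(
    pc1(role_vectors),
    contrast_vector(assistant_vectors, role_vectors),
)
print(f"|cos(PC1, contrast vector)| = {similarity:.3f}")
```

On real activations the two directions need not agree this cleanly; a value near 1 would support the finding, while a low value would suggest that the leading component captures something other than the Assistant contrast.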
Claims (1)
claim
- Limitation acknowledgment about the adequacy of the linear representation assumption
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Key mechanistic claim about the developmental origin of the Assistant persona
- Shows Assistant Axis in instruct models inherits from helpful human personas in base models
- Key mechanistic claim about persona dynamics
- Motivated by near-identical PCs for base and instruct Gemma
- We hypothesize that the PC1 axis of role space measures deviation from the Assistant persona · hypothesis · 0.781 · Motivates computing the contrast vector as the formal Assistant Axis definition
- Characterizes the trait content of the Assistant Axis in pre-trained models
- Core claim of ReflCtrl that a single direction captures and controls reflection
- Extends the Assistant Axis finding to pre-training, suggesting pre-training rather than post-training creates the axis