quote
active
quote:we-position-repe-as-a-new-frontier-in-open-ended-psychological-steering-of-llms
"We position RepE as a new frontier in open-ended psychological steering of LLMs."
Central thesis statement of the paper's contribution
Source paper
extracted_from · (2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Central interpretive claim overturning prior reports; supported by 11-of-14 LLM wins for MDS over P2
- Counterintuitive interpretive claim from Experiment 2: suppressing deception features increases affirmations, which is opposite to what sycophancy predicts
- Prior finding suggesting affective-like states in LLMs; cited as convergent evidence for structured self-representation
- We hypothesize that persistently active emotional state representations exist in LLMs but may be missed by standard probing methods. · hypothesis · 0.757 · Open hypothesis from the Anthropic paper that motivates this work
- The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
- Core quote asserting architectural introspection permission.
- Characterizes what is on the far end of the Assistant Axis away from the Assistant
- Forward-looking claim suggesting the methodological framework is relevant for future AI systems beyond current LLMs.