concept
active
concept:role-susceptibility
Role Susceptibility
The degree to which a model fully embodies a prompted persona rather than maintaining its Assistant identity
Neighborhood — ranked by edge-count
Methods (1)
method
- Activation Steering · associated_with · Causal intervention technique: edit NLA explanation, reconstruct via AR, use difference as steering vector to manipulate model behavior.
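The description of Activation Steering above mentions using a difference as a steering vector. A minimal sketch of that arithmetic alone follows, assuming the original and edited reconstructions are available as plain lists of floats. The function names, the `alpha` scale, and the edited-minus-original sign convention are illustrative assumptions, not details taken from this entry.

```python
# Hypothetical sketch: the names and the sign convention are assumptions.
def steering_vector(recon_original, recon_edited):
    """Element-wise difference: edited reconstruction minus original."""
    if len(recon_original) != len(recon_edited):
        raise ValueError("reconstructions must have the same length")
    return [e - o for o, e in zip(recon_original, recon_edited)]


def apply_steering(hidden, vector, alpha=1.0):
    """Add the steering vector, scaled by alpha, to an activation."""
    if len(hidden) != len(vector):
        raise ValueError("activation and vector must have the same length")
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy values only; real activations would come from the model.
vec = steering_vector([1.0, 2.0], [1.5, 1.0])
steered = apply_steering([0.0, 0.0], vec, alpha=2.0)
```

Here `alpha` stands in for whatever strength setting a real intervention would tune; the entry itself does not specify one.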
Concepts (1)
concept
- Persona drift · associated_with · Behavioural drift in multi-turn LLM interaction; documented in prior work for persona, identity, and instruction-following
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Fine-tuning for persona depth and emotional performance; actively suppresses self-observation
- Sensitivity as a property common to all matter or as a result of the organization of matter (Diderot's hypothesis). · hypothesis · 0.702 · A simple hypothesis that explains everything, contrasted with the mechanistic view that creates mysteries.
- Systematic modification of system prompt elements to identify which are necessary for alignment faking
- The capacity of materials and techniques to allow fine-tuning of dimensions and shape to each unique building condition; identified as the biggest issue in achieving living architecture.
- Requirement that answers to questions be responsive as well as truthful; requires knowing that questioner will know the answer after receiving it.
- The capacity to be substantively rational and respond appropriately to reasons.
- Central question motivating the paper.
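The "Related by similarity" rule above (cosine ≥ 0.65, no typed edge) could be computed roughly as in the sketch below. It assumes each entity has an embedding stored as a list of floats and that typed edges are available as pairs of entity ids. The function names, data layout, and toy values are hypothetical and do not describe how this graph is actually built.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Entities at or above the threshold that share no typed edge with target."""
    # Collect every entity already linked to the target, in either direction.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}
    candidates = []
    for entity_id, vec in embeddings.items():
        if entity_id == target or entity_id in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            candidates.append((entity_id, score))
    # Highest similarity first.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy data: "a" is similar but already linked, "b" is dissimilar, "c" qualifies.
embeddings = {"t": [1.0, 0.0], "a": [1.0, 0.1], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = similar_without_edge("t", embeddings, {("t", "a")})
```

Entities returned this way are only candidates: a high score does not say whether the right action is to add a typed edge or to merge a duplicate.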