claim
active
claim:the-psychological-steering-framework-generalizes-beyond-ocean-to-dark-tetrad-cmni-cfni-and-other-psychological-models
The psychological steering framework generalizes beyond OCEAN to Dark Tetrad, CMNI, CFNI, and other psychological models
Supported by qualitative experiments showing fluent and coherent steering for three additional models
Source paper
extracted_from · (2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Qualitative finding demonstrating a capability unique to activation-level interventions and unavailable to prompting methods, including PM
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- The paper's primary contribution: performs unbounded, fluency-constrained sweeps in semantically calibrated centroid units using psychological artifacts
- Mechanistic interpretation of how activation steering induces deception through the model's reasoning process
- Central interpretive claim and motivation for future work
- Addresses skeptical alternative that reports reflect only conversational content
- There may exist a global introspective faculty or steering direction that improves introspection uniformly across all concepts · hypothesis · 0.758 · Framed as an open problem; current evidence only points to local pair-specific improvement
- Can concept steering interventions on EEG foundation models be made selective rather than globally destructive? · question · 0.758 · Research question motivating the introduction of the probe area metric and identification of operational regimes
- Generalization hypothesis stated in introduction; not tested in paper