OCEAN Trait Covariance Matrix M

5x5 Pearson correlation matrix of OCEAN traits computed from MDS injection sweeps to assess cross-trait leakage
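As a rough illustration of how such a matrix could be computed, the sketch below builds a 5x5 Pearson correlation matrix from a table of trait scores, one row per sweep point. The function name `trait_correlation_matrix`, the score layout, and the synthetic data are assumptions made for this example; the entry does not specify how the sweep scores are stored or scored.

```python
import numpy as np

TRAITS = ["O", "C", "E", "A", "N"]  # column order is an assumption


def trait_correlation_matrix(scores):
    """Return the 5x5 Pearson correlation matrix of trait scores.

    scores: array-like of shape (n_sweep_points, 5), one column per trait.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != len(TRAITS):
        raise ValueError("expected an array of shape (n_sweep_points, 5)")
    # rowvar=False treats each column (trait) as a variable.
    return np.corrcoef(scores, rowvar=False)


# Synthetic stand-in for an injection sweep (hypothetical data):
# steering targets O, and part of the effect leaks into E.
rng = np.random.default_rng(0)
strength = np.linspace(-1.0, 1.0, 50)
scores = rng.normal(scale=0.1, size=(50, 5))
scores[:, 0] += strength        # target trait follows the injection strength
scores[:, 2] += 0.5 * strength  # non-target trait moves too (leakage)

M = trait_correlation_matrix(scores)
```

Large off-diagonal entries in `M`, such as the O/E entry in this synthetic case, are what would indicate cross-trait leakage under this reading.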

Neighborhood — ranked by edge-count

concept

Cross-Trait Leakage
implements
Unintended movement of non-target OCEAN traits when steering toward a target trait; quantified via lambda metric

claim

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Prompt Invariance Test · method · 0.732
Testing five phrasings of the self-referential prompt to confirm robustness to wording variation
Covariance Pooling · method · 0.707
Novel aggregation technique replacing mean pooling; preserves joint activation structure (feature co-occurrence) in token embeddings.
Prompt Invariance Replication · method · 0.701
Five variants of the experimental prompt tested to confirm the effect is robust to changes in specific wording
The psychological steering framework generalizes beyond OCEAN to Dark Tetrad, CMNI, CFNI, and other psychological models · claim · 0.686
Supported by qualitative experiments showing fluent and coherent steering for three additional models
Generated statements achieve 85.62%-94.00% cosine similarity alignment with Perez et al. validated OCEAN and Dark Triad statements · finding · 0.684
Validates the statement synthesis pipeline as producing behavior-specific content comparable to established methods
Organizational Invariance · concept · 0.682
Contravariance Principle · concept · 0.676
Cao & Yamins principle: solution set for an easy goal is large, for a challenging goal comparatively smaller; cited as theoretical basis for multitask scaling hypothesis
Boundary-Size Invariance · concept · 0.676
Property where a rule learned on a fixed-size grid generalizes to larger grids, observed in checkerboard and lizard experiments
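
The selection rule for this listing (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter over entity embeddings. This is only an illustration: the function `candidate_neighbors`, the dictionary layout, and the toy vectors are hypothetical, and the embedding model behind the scores shown here is not stated in the entry.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def candidate_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """List entities similar to `target` that have no typed edge to it.

    embeddings:  dict mapping entity name -> embedding vector (assumed layout)
    typed_edges: set of entity names already linked to `target` by a typed edge
    Returns (name, score) pairs sorted by descending cosine similarity.
    """
    results = []
    for name, vec in embeddings.items():
        if name == target or name in typed_edges:
            continue  # skip the entity itself and already-linked entities
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy 2-D embeddings (hypothetical values, for illustration only).
embeddings = {
    "M": [1.0, 0.0],
    "A": [1.0, 0.5],   # similar, no typed edge -> candidate
    "B": [0.0, 1.0],   # orthogonal -> below threshold
    "C": [1.0, 0.1],   # similar, but already linked -> excluded
}
neighbors = candidate_neighbors("M", embeddings, typed_edges={"C"})
```

Entities returned by such a filter are the ones worth reviewing as candidates for new edges or as unrecognized duplicates.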