claim
active
claim:ocean-mds-injection-covariance-patterns-departing-from-the-big-two-model-suggest-a-gap-between-learned-llm-representations-and-human-psychology
OCEAN MDS injection covariance patterns departing from the Big Two model suggest a gap between learned LLM representations and human psychology
Interpretive conclusion from Big Two mismatch finding; tentative due to only 46.15% match rate
Source paper
extracted_from · (2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Suggests a gap between the representations LLMs learn and human personality structure as described by the Big Two model
Frameworks (1)
framework
- Big Two Model · contradicts · Meta-trait model grouping OCEAN traits into stability (C, A, reversed N) and plasticity (E, O); used to evaluate covariance patterns from injections
Methods (1)
method
- OCEAN Trait Covariance Matrix M · supports · 5×5 Pearson correlation matrix of OCEAN traits computed from MDS injection sweeps to assess cross-trait leakage
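
As a rough illustration of the method entry above, the sketch below builds a 5×5 Pearson correlation matrix from per-sample OCEAN scores and checks the signs of the within-group pairs that the Big Two grouping implies (C, A, and reversed N for stability; E and O for plasticity). Everything in it is an assumption made for illustration: the synthetic scores, the function names `trait_correlation_matrix` and `big_two_sign_matches`, and the sign-based match criterion. The source does not state how the paper scores a match, so this is not the authors' procedure.

```python
import numpy as np

# Column order for the score array (an assumption for this sketch).
TRAITS = ["O", "C", "E", "A", "N"]

# Sign each within-group correlation should have under the Big Two grouping:
# stability = C, A, reversed N; plasticity = E, O.
# Cross-group pairs are left unchecked in this simplified criterion.
EXPECTED_SIGN = {
    ("C", "A"): +1,
    ("C", "N"): -1,
    ("A", "N"): -1,
    ("E", "O"): +1,
}


def trait_correlation_matrix(scores):
    """Return the 5x5 Pearson correlation matrix of an (n_samples, 5) score array."""
    return np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)


def big_two_sign_matches(M):
    """Map each within-group trait pair to True if its correlation has the expected sign."""
    idx = {t: i for i, t in enumerate(TRAITS)}
    return {
        pair: bool(np.sign(M[idx[pair[0]], idx[pair[1]]]) == sign)
        for pair, sign in EXPECTED_SIGN.items()
    }


# Synthetic scores driven by two latent factors, standing in for injection-sweep data.
rng = np.random.default_rng(0)
stability = rng.normal(size=200)
plasticity = rng.normal(size=200)
noise = rng.normal(scale=0.3, size=(200, 5))
scores = np.column_stack(
    [plasticity, stability, plasticity, stability, -stability]
) + noise

M = trait_correlation_matrix(scores)
matches = big_two_sign_matches(M)
match_rate = sum(matches.values()) / len(matches)
```

Because the synthetic data is generated from the Big Two structure itself, every checked pair matches here; real injection-sweep scores would be passed to `trait_correlation_matrix` in place of `scores`. Note that the entity is named a covariance matrix while its description specifies Pearson correlation; the sketch follows the description.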
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- MDS injection steering efficiency peaks at mid-layers across LLMs, injection strides, and OCEAN traits · finding · 0.780 · Consistent empirical pattern supporting the connection between mid-layer representations and emotion/behavioral content
- Theoretical alignment claim backed by OLS R² analysis showing 96.15% of trends have R² ≥ 0.75
- Out-of-context reasoning work directly related to synthetic document fine-tuning experiments
- Qualified positive claim from spatio permutation analysis where two cases satisfy all three criteria.
- Do the findings about MDS injection effectiveness generalize to base (non-instruction-tuned) language models? · question · 0.750 · Acknowledged limitation: only instruction-tuned models were studied
- Prior finding showing scale-dependent self-awareness, consistent with the scale effect observed in the paper's Experiment 1
- Core cross-modal empirical result: larger and better language models align better with vision models
- Interpretive claim connecting scale to abstraction level in LLM representations