concept · active
concept:self-other-overlap
Self-Other Overlap
The extent to which a model exhibits similar internal representations when reasoning about itself and others in similar contexts
Neighborhood — ranked by edge-count
Papers (1)
Methods (1)
- Latent SOO Metric · implements · Metric measuring the mean squared error (MSE) between self-referencing and other-referencing activations, averaged across all hidden MLP/attention layers
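A minimal sketch of how a metric of this shape could be computed, assuming the per-layer activations for a self-referencing prompt and a matched other-referencing prompt have already been extracted and flattened into equal-length lists of floats. The names `layer_mse` and `latent_soo` and the input format are hypothetical and not taken from the source; a lower value means the two sets of activations are closer.

```python
from statistics import fmean


def layer_mse(self_act, other_act):
    """Mean squared error between two equal-length activation vectors."""
    if len(self_act) != len(other_act):
        raise ValueError("activation vectors must have the same length")
    return fmean((s - o) ** 2 for s, o in zip(self_act, other_act))


def latent_soo(self_layers, other_layers):
    """Average the per-layer MSE over all layers (hypothetical helper).

    self_layers / other_layers: one activation vector per hidden layer,
    in the same layer order. This input format is an assumption.
    """
    if len(self_layers) != len(other_layers):
        raise ValueError("both inputs must cover the same layers")
    return fmean(layer_mse(s, o) for s, o in zip(self_layers, other_layers))


# Toy activations for two layers (made-up numbers for illustration only).
self_layers = [[1.0, 2.0], [0.0, 0.0]]
other_layers = [[1.0, 4.0], [0.0, 2.0]]
score = latent_soo(self_layers, other_layers)
```

In a real model the vectors would come from the hidden MLP/attention layers the description mentions; how they are pooled over tokens is left open here because the source does not specify it.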
Concepts (2)
- Self-Other Distinction · related_to · The implicit capacity the self-prior implements by assigning high density to familiar self-states and low density to non-self states
- Neural Self-Other Overlap in Neuroscience · analogous_to · Neuroscientific phenomenon where self and other representations partially converge, linked to empathy and altruism
Quotes (1)
- Formal definition of the paper's central construct
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Conceptual distinction between self and environment that non-duality dissolves; key target for alignment-by-design
- Structural and functional property exhibited by living systems but currently absent from most engineered machines.
- The central framework proposed in this paper: aligning AI internal representations of self and others to reduce deceptive behavior
- Neural self-other overlap provides a hard-to-fake metric for classifying deceptive vs honest agents · claim · 0.782 · Claim that SOO is particularly useful as a detection metric because it is based on latent representations rather than observable behavior
- Process of reifying one's identity as an independent self; meditation practices aim to decrease selfing.
- Cross-domain analogical claim linking neuroscience findings to AI design
- Phenomenon of spontaneous long-range order emerging from local interactions; central phenomenon explained by topological constraints
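The "cosine ≥ 0.65 · no typed edge" rule used for this list can be sketched as a simple filter, assuming each entity has an embedding vector and the set of entities already connected by a typed edge is known. The function names, the toy embeddings, and the entity keys below are hypothetical; the source does not say how the embeddings are produced.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target, embeddings, typed_neighbors, threshold=0.65):
    """Entities at or above the threshold that have no typed edge to target."""
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in typed_neighbors:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: "a" is already linked by a typed edge, "b" is dissimilar.
embeddings = {"soo": [1.0, 0.0], "a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
candidates = similar_untyped("soo", embeddings, typed_neighbors={"a"})
names = [name for name, _ in candidates]
```

Entries returned by such a filter would still need manual review, since the list is described as containing both candidate edges and possible duplicates.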