concept
active
concept:self-other-overlap

Self-Other Overlap

The extent to which a model exhibits similar internal representations when reasoning about itself and others in similar contexts

Neighborhood — ranked by edge-count

Methods (1)

method
  • Metric measuring the MSE between self-referencing and other-referencing activations, averaged across all hidden MLP/attention layers
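
The metric above can be sketched in a few lines. This is a minimal illustration, not the method's reference implementation: it assumes the activations for one self-referencing prompt and one other-referencing prompt have already been extracted as one flat list of floats per hidden layer. The names `layer_mse` and `self_other_mse` are hypothetical, and so is the choice to weight every layer equally.

```python
from statistics import fmean


def layer_mse(self_layer, other_layer):
    """Mean squared error between two equal-length activation vectors."""
    if len(self_layer) != len(other_layer):
        raise ValueError("activation vectors must have the same length")
    return fmean((s - o) ** 2 for s, o in zip(self_layer, other_layer))


def self_other_mse(self_acts, other_acts):
    """Average the per-layer MSE over all layers (assumed equal weighting).

    self_acts / other_acts: one activation vector per hidden layer,
    recorded on the self-referencing and other-referencing prompt.
    """
    if len(self_acts) != len(other_acts):
        raise ValueError("both prompts must cover the same layers")
    return fmean(layer_mse(s, o) for s, o in zip(self_acts, other_acts))


# Toy example with two layers of two units each (made-up numbers).
self_acts = [[1.0, 2.0], [0.0, 0.0]]
other_acts = [[1.0, 4.0], [3.0, 0.0]]
print(self_other_mse(self_acts, other_acts))  # 3.25
```

Because the quantity is a distance, a lower value corresponds to greater self-other overlap, and it is zero only when the two sets of activations are identical.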

Concepts (2)

concept
  • The implicit capacity the self-prior implements by assigning high density to familiar self-states and low density to non-self states
  • Neuroscientific phenomenon where self and other representations partially converge, linked to empathy and altruism

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.