concept
active
concept:representational-familiarity

Representational familiarity

The degree to which a model is familiar with a numeral system, manipulated in Experiment 2 by varying the number base.

Neighborhood — ranked by edge-count

Methods (1)

method

Concepts (3)

concept

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Property of conscious representations: they do not contain information about the fact that they are representations at the level of the representation itself
  • The central empirical phenomenon: different neural networks trained on different data/objectives develop increasingly similar representations
  • Measure of similarity between the similarity structures (kernels) induced by two different representations
  • Core phenomenon studied: when causal interventions shift internal representations away from the natural distribution
  • The proposed domain-general property indexed by deception features that governs both factual accuracy and experiential self-report
  • Distance between prior and target representations
  • The evolution of an agent's latent representations over the course of training, shown to align with reward improvement when causal emergence is high.
  • One-dimensional curved surface in internal activation space; the paper demonstrates its alignment with the behavior manifold
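
The "cosine ≥ 0.65 · no typed edge" filter behind this list can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function names, the toy two-dimensional embeddings, and the representation of typed edges as a set of unordered pairs are all assumptions made for the example. Only the 0.65 threshold and the "no typed edge" condition come from the page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_candidates(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, score) pairs with score >= threshold and no typed edge to target.

    `embeddings` maps entity name -> vector; `typed_edges` is a set of
    frozenset({a, b}) pairs. Both structures are hypothetical.
    """
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        if frozenset((target, name)) in typed_edges:
            continue  # already linked by a typed relation, so not a candidate
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    # Highest similarity first
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): "a" is similar but already linked, "b" is dissimilar,
# "c" is similar and unlinked.
embeddings = {
    "representational-familiarity": [1.0, 0.0],
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
typed_edges = {frozenset(("representational-familiarity", "a"))}

candidates = similarity_candidates(
    "representational-familiarity", embeddings, typed_edges
)
```

In this toy setup only "c" survives both conditions. That is the kind of entity the list above surfaces as a candidate for a new edge or an unrecognized duplicate.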