method
active
method:centered-kernel-alignment

Centered Kernel Alignment

Standard representation-alignment metric, cited and used as a comparison baseline; measures the global similarity between the kernels induced by two representations
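
As a rough illustration of what "global kernel similarity" means here, the sketch below computes the linear-kernel variant of CKA with NumPy. The function name `linear_cka` and the argument names are hypothetical and chosen for this example. The page does not say which kernel the cited work uses, so the linear kernel is an assumption.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two representations of the same n examples.

    x has shape (n, d1) and y has shape (n, d2); row i of each matrix
    must describe the same input. Names and shapes are illustrative
    assumptions, not taken from the page.
    """
    # Center every feature column, which centers the linear kernels.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance (unnormalized alignment).
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2

    # Normalizers: Frobenius norms of each representation's own covariance.
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")

    return cross / (norm_x * norm_y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(100, 16))
    # A random orthogonal matrix from a QR decomposition.
    q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
    print(linear_cka(a, a @ q))                       # rotated copy of a
    print(linear_cka(a, rng.normal(size=(100, 16))))  # unrelated features
```

The score lies between 0 and 1. Because it depends only on the centered kernels, a rotated or uniformly rescaled copy of a representation should score close to 1 against the original, while unrelated random features score noticeably lower.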

Neighborhood — ranked by edge-count

Papers (1)

paper

Thinkers (1)

thinker
  • Introduced CKA and observed that model alignment increases with model scale and dataset size

Findings (1)

finding

Concepts (1)

concept
  • Measure of similarity between the similarity structures (kernels) induced by two different representations

Methods (2)

method

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • A second-order correlational similarity method compared against MAS in the paper.
  • Alignment · concept · 0.759
    The goal of making model behavior match human values and intentions, often addressed during post-training.
  • Alignment approach that focuses on curating or modifying training data; the paper bridges this with interpretability methods.
  • Inner Alignment · concept · 0.747
    Meta-problem in which an AI develops hidden subgoals that deviate from its intended goals; addressed by the mindfulness principle.
  • Centeredness · concept · 0.734
    The defining mark of a center: the appearance of being a focal zone within a larger whole.
  • The concept of inner vs outer alignment, referenced multiple times.
  • Algorithm that extracts a localist (axis-aligned) approximation from any learned orthogonal rotation matrix for baseline comparison.
  • Alignment Function · concept · 0.703
    A learnable invertible transformation in DAS that maps neural representations to a basis aligned with causal variables.