concept
active
concept:behaviorally-binary-subspace

Behaviorally Binary Subspace

A vector subspace that causally affects outputs only through the sign of its values, so magnitudes within it can diverge without changing behavior.
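The definition can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the source: the dimensions `d` and `k`, the random orthonormal basis `B`, the readout weights, and the helper names `toy_model` and `rescale_in_subspace` are all hypothetical. The toy model reads the subspace only through `np.sign`, so rescaling an input inside the subspace should leave the output unchanged, while flipping its sign should not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: ambient dimension d, binary-subspace dimension k.
d, k = 8, 2

# Random orthonormal basis of R^d. The first k columns (B) span the
# toy "behaviorally binary" subspace; the rest (R) span its complement.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
B, R = Q[:, :k], Q[:, k:]

# Hypothetical readout weights for a 3-dimensional output.
W_sign = rng.standard_normal((3, k))
W_rest = rng.standard_normal((3, d - k))


def toy_model(x):
    """Toy model that reads the subspace B only through the sign of its coordinates."""
    s = np.sign(B.T @ x)  # magnitude inside the subspace is discarded here
    r = R.T @ x           # the complement is read linearly
    return W_sign @ s + W_rest @ r


def rescale_in_subspace(x, scale):
    """Multiply the component of x inside span(B) by a positive scale."""
    coords = B.T @ x
    return x + B @ ((scale - 1.0) * coords)


x = rng.standard_normal(d)
x_scaled = rescale_in_subspace(x, 50.0)  # large magnitude change, same signs
x_flipped = x - 2.0 * B @ (B.T @ x)      # same magnitudes, opposite signs

same = np.allclose(toy_model(x), toy_model(x_scaled))
changed = not np.allclose(toy_model(x), toy_model(x_flipped))
print("output unchanged under rescaling:", same)
print("output changed under sign flip:", changed)
```

In this sketch the binary behavior is built in by construction. In a trained network one would instead have to test for it empirically, for example by intervening on the candidate subspace and comparing outputs.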

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Behavioral Null Space
    associated_with
    The span of vector directions that do not change network behavior; a key concept distinguishing MAS from model stitching.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Behavior Space · concept · 0.801
    A geometric space of all output token probability distributions, equipped with Hellinger distance, used to visualize model behavior.
  • Truth Subspace · concept · 0.798
    The multi-dimensional activation subspace whose directions causally mediate truthful behavior in LLMs.
  • The traditional space of movement in the physical world where animals exhibit problem-solving behavior.
  • Emotion Subspace · concept · 0.786
    The subspace of activation space spanned by the 171 orthogonalized emotion probe vectors, used to measure SAE feature emotional alignment.
  • Subspace DAS · method · 0.776
    Extension of DAS that learns a second rotation matrix on top of a fixed first one to decompose representations into sub-representations.
  • Contiguous subspace of the aligned latent vector encoding behaviorally relevant information for a specific causal variable.
  • Balanced Subspaces · concept · 0.771
    Subspaces whose contributions to a layer's output are canceled by opposing weight values, making them causally inactive under natural inputs.
  • Burger et al. (2024) framework proposing that truth is linearly decoded along a 2D subspace capturing both polarity-dependent and polarity-invariant directions.