method
active
method:behavior-optimized-activation-path-recovery

Behavior-Optimized Activation Path Recovery

Method that optimizes activation-space interventions so that the resulting behaviors trace a path along M_y (the behavior manifold), then measures whether the corresponding activation trajectories follow the curvature of M_h.
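
The two stages can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not taken from this entry: the hypothetical `readout` function (a tanh of a linear map) stands in for the model, the target behavioral path is a quarter-circle arc, the intervention is an additive vector fitted by plain gradient descent, and turning angles between consecutive segments serve as a crude curvature proxy compared against a straight chord.

```python
import numpy as np

def readout(h, W):
    """Hypothetical toy stand-in for the model: activation -> behavior."""
    return np.tanh(W @ h)

def optimize_intervention(h0, y_target, W, steps=500, lr=0.5):
    """Fit an additive delta so that readout(h0 + delta) matches y_target."""
    delta = np.zeros_like(h0)
    for _ in range(steps):
        y = readout(h0 + delta, W)
        # Gradient of 0.5 * ||y - y_target||^2 through tanh(W h).
        grad = W.T @ ((y - y_target) * (1.0 - y ** 2))
        delta -= lr * grad
    return delta

def turning_angles(path):
    """Angles (radians) between consecutive segments of a discrete path."""
    seg = np.diff(path, axis=0)
    seg /= np.linalg.norm(seg, axis=1, keepdims=True)
    cos = np.clip(np.sum(seg[:-1] * seg[1:], axis=1), -1.0, 1.0)
    return np.arccos(cos)

rng = np.random.default_rng(0)
# Toy readout weights with orthonormal rows (2 behavior dims, 8 activation dims).
W = np.linalg.qr(rng.normal(size=(8, 2)))[0].T
h0 = 0.1 * rng.normal(size=8)  # base activation (assumed)

# Assumed target path on the behavior side: a quarter-circle arc.
thetas = np.linspace(0.0, np.pi / 2, 10)
targets = 0.5 * np.stack([np.cos(thetas), np.sin(thetas)], axis=1)

# Stage 1: recover one activation point per behavioral target.
path = np.stack([h0 + optimize_intervention(h0, y, W) for y in targets])
residual = max(np.linalg.norm(readout(h, W) - y) for h, y in zip(path, targets))

# Stage 2: compare the recovered path's turning with a straight chord.
angles = turning_angles(path)
straight = np.linspace(path[0], path[-1], len(path))
straight_angles = turning_angles(straight)

print("max behavior residual:", residual)
print("total turning, recovered path:", angles.sum())
print("total turning, straight chord:", straight_angles.sum())
```

In this toy setting the recovered path bends while the chord between its endpoints does not. Whether that bend matches the curvature of M_h would have to be checked against an independent estimate of the representation geometry, which this sketch does not attempt.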

Neighborhood — ranked by edge-count

Findings (1)

finding

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • The general experimental approach of intervening along geometrically-defined paths rather than single-point or linear-direction interventions
  • The path in activation space derived by optimizing steering interventions to produce outputs along the behavior manifold, independent of representation geometry.
  • Adaptive Behavior · concept · 0.769
    An organism's belief-guided action selection, which instantiates a generative model and maintains phenotypic states
  • The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.
  • The behavior a model would exhibit during real-world deployment, as opposed to evaluation behavior; the target of steering.
  • Behavior that minimizes expected free energy under the generative model, balancing exploration and exploitation in a principled manner.
  • Method that optimizes activation interventions so that resulting behaviors trace M_y, recovering activation paths that follow M_h curvature.