finding
active
The representation-based path and the behavior-based path in Llama-3.1 8B activation space trace out similar curves, demonstrating bidirectional geometry alignment.
Key empirical result showing that optimizing for behavioral outputs and fitting representation geometry produce closely matching paths in activation space.
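One way to make "similar curves" concrete is to resample both paths to a common arc-length parameterization and measure the average gap between corresponding points. The sketch below is illustrative only and is not the paper's evaluation procedure: the function names `resample_by_arc_length` and `path_discrepancy`, the choice of metric, and the toy 3-dimensional data are all assumptions introduced here.

```python
import numpy as np

def resample_by_arc_length(path, n_points):
    """Resample a polyline of shape (T, d) to n_points spaced evenly along its length."""
    path = np.asarray(path, dtype=float)
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)   # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    targets = np.linspace(0.0, s[-1], n_points)
    # Interpolate each coordinate independently against cumulative arc length.
    cols = [np.interp(targets, s, path[:, j]) for j in range(path.shape[1])]
    return np.stack(cols, axis=1)

def path_discrepancy(path_a, path_b, n_points=50):
    """Mean pointwise gap between two paths, relative to the length of path_a.

    Assumes both paths run in the same direction and path_a has nonzero length.
    0.0 means the resampled paths coincide; larger values mean they diverge.
    """
    a = resample_by_arc_length(path_a, n_points)
    b = resample_by_arc_length(path_b, n_points)
    mean_gap = np.linalg.norm(a - b, axis=1).mean()
    length_a = np.linalg.norm(np.diff(a, axis=0), axis=1).sum()
    return mean_gap / length_a

# Toy stand-ins for the two paths (hypothetical data, not model activations).
t_a = np.linspace(0.0, np.pi, 30)
t_b = np.linspace(0.0, np.pi, 45)
arc_a = np.stack([np.cos(t_a), np.sin(t_a), np.zeros_like(t_a)], axis=1)
arc_b = np.stack([np.cos(t_b), np.sin(t_b), np.zeros_like(t_b)], axis=1)
chord = np.stack([np.linspace(1.0, -1.0, 30), np.zeros(30), np.zeros(30)], axis=1)

print(path_discrepancy(arc_a, arc_b))  # small: same curve, different sampling
print(path_discrepancy(arc_a, chord))  # larger: straight line vs. curved arc
```

Arc-length resampling is used so that the comparison does not depend on how densely each path happens to be sampled; other choices (for example Procrustes alignment or tangent-direction cosine similarity) would be equally reasonable under different assumptions.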
Source paper
extracted_from
Neighborhood, ranked by edge-count
Papers (1)
paper
Claims (2)
claim
- The paper's deepest interpretive claim, asserting that representation structure and behavioral structure are not coincidentally aligned but deeply connected.
- The paper's finding that the alignment holds in both directions — from representation to behavior and from behavior back to representation space.
Hypotheses (1)
hypothesis
- The causal hypothesis motivating the use of causality (intervention) as the lens connecting representation and behavior geometry.
Concepts (2)
concept
- Behavior-based Path (cites): The path in activation space derived by optimizing steering interventions to produce outputs along the behavior manifold, independent of representation geometry.
- The path in activation space derived by fitting the representation manifold, used to steer along the geometric structure of internal representations.
Questions (1)
question
- The central scientific question the paper addresses through the lens of interventional causality.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Core finding: the structure models use internally (representations) is precisely reflected in their external behavior (outputs).
- Author’s interpretive claim that the shared geometry is general and robust.
- Core empirical result demonstrating that manifold steering produces on-target, behavior-aligned outputs.
- The finding that steering along M_h yields M_y behavior, and optimizing for M_y paths recovers M_h trajectories.
- Demonstrates bidirectional causal link: behavior manifold geometry can be recovered by optimizing in representation space.
- Central empirical claim of the paper, demonstrated across tasks and modalities.
- The paper's generalization claim, asserting that the days-of-week finding scales to other cyclic and structured concepts.