method
active
method:local-linear-reconstruction-error
Local Linear Reconstruction Error
Measures how well an intervened point can be expressed as a convex combination of nearby natural manifold points
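The description above can be illustrated with a minimal sketch. The source does not give the exact formulation, so everything concrete here is an assumption: neighbors are the k nearest points under Euclidean distance, the error is the distance from the intervened point to the convex hull of those neighbors, and the convex weights are found with a simple Frank-Wolfe loop. The function name `local_linear_reconstruction_error` and the parameters `k` and `iters` are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _combine(weights, points):
    # Weighted sum of points, coordinate by coordinate.
    dim = len(points[0])
    return [sum(w * p[j] for w, p in zip(weights, points)) for j in range(dim)]


def local_linear_reconstruction_error(x, manifold_points, k=5, iters=500):
    """Return (error, weights): distance from x to the convex hull of its
    k nearest manifold points, and the convex weights that achieve it.

    Assumed formulation, not taken from the source.
    """
    # Select the k nearest natural points (Euclidean distance is an assumption).
    neighbors = sorted(manifold_points, key=lambda p: math.dist(x, p))[:k]
    # Start from uniform weights, which already lie on the simplex.
    w = [1.0 / len(neighbors)] * len(neighbors)
    for _ in range(iters):
        recon = _combine(w, neighbors)
        residual = [xi - ri for xi, ri in zip(x, recon)]
        # Frank-Wolfe step: pick the neighbor best aligned with the residual.
        best = max(range(len(neighbors)), key=lambda i: _dot(neighbors[i], residual))
        direction = [nb - ri for nb, ri in zip(neighbors[best], recon)]
        denom = _dot(direction, direction)
        if denom == 0.0:
            break
        # Exact line search, clipped so the weights stay convex.
        gamma = min(1.0, max(0.0, _dot(residual, direction) / denom))
        if gamma == 0.0:
            break  # no vertex improves the reconstruction
        w = [(1.0 - gamma) * wi for wi in w]
        w[best] += gamma
    recon = _combine(w, neighbors)
    return math.dist(x, recon), w


# Toy data: the "manifold" is the four corners of a square.
square = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
inside_err, _ = local_linear_reconstruction_error([1.0, 1.0], square, k=4)
outside_err, outside_w = local_linear_reconstruction_error([1.0, 3.0], square, k=4)
```

Under these assumptions, a point inside the hull of its neighbors reconstructs with an error near zero, while a point pushed off the manifold yields a positive error equal to its distance from the hull. Constraining the weights to be non-negative and sum to one is what makes the combination convex rather than merely affine.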
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Metric of how well models reconstruct information from hidden states; Sauers' study found that janus thread extends distribution tails.
- Correlative technique measuring the type of information encoded in distributed representations via linear predictability.
- The central object of study — the idea that a concept like truth is encoded as a direction in the LLM's latent space
- The idea that features are encoded as directions in activation space.
- Algorithm that extracts a localist (axis-aligned) approximation from any learned orthogonal rotation matrix for baseline comparison.
- Learning rule in which the change in a parameter at point (x, t) depends only on the system state at the same or nearby spacetime points, without requiring computation of a global cost function.
- Theoretical limitation identified by the authors distinguishing reflection from stylistic tasks.