concept
active
concept:reconstruction-accuracy
reconstruction accuracy
Metric of how well models reconstruct information from hidden states; Sauers' study found that showing the janus thread extends the tails of the distribution.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Criterion requiring that a model's description of its internal state be accurate, distinguishing genuine introspection from confabulation.
- Methods for visualizing fungal networks in ants.
- Measures how well an intervened point can be expressed as a convex combination of nearby natural manifold points.
- The balance between how sparse and how faithful a decomposition is; VPD achieves a better tradeoff than transcoders.
- Proportion of aligned interchange interventions with equivalent high-level and low-level effects; graded measure of causal abstraction.
- Expected log likelihood of data under posterior beliefs; measures fit to observations.
- Theoretical limitation identified by the authors distinguishing reflection from stylistic tasks.
- Statistical method: ask the model to recall random numbers from its earlier outputs, with and without providing an explanation of the transformer architecture; measure the distribution of reconstruction accuracy.
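
The recall-based method in the last entry can be illustrated with a minimal sketch. It assumes exact positional match as the scoring rule, which the entry does not specify; the function names `reconstruction_accuracy` and `summarize` and all toy values are hypothetical, chosen only for illustration.

```python
from statistics import median


def reconstruction_accuracy(original, recalled):
    """Fraction of original positions the recalled sequence matches exactly.

    Assumption: positions missing from a shorter recall count as misses.
    """
    if not original:
        raise ValueError("original sequence must be non-empty")
    # zip stops at the shorter sequence, so unrecalled positions add no matches.
    matches = sum(1 for o, r in zip(original, recalled) if o == r)
    return matches / len(original)


def summarize(accuracies):
    """Coarse view of an accuracy distribution: its extremes and its median."""
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return {
        "min": min(accuracies),
        "median": median(accuracies),
        "max": max(accuracies),
    }


if __name__ == "__main__":
    # Toy trials only: (numbers emitted earlier, numbers recalled later).
    trials = [
        ([3, 1, 4, 1], [3, 1, 4, 2]),
        ([2, 7, 1, 8], [2, 7, 1, 8]),
        ([1, 6, 1, 8], [1, 6]),
    ]
    scores = [reconstruction_accuracy(o, r) for o, r in trials]
    print(summarize(scores))
```

Running the same summary once per condition (with and without the architecture explanation) would give two distributions whose tails can then be compared.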