concept
active
concept:anchor-calibration-responses
Anchor Calibration Responses
Three reference responses at known quality levels shown alongside each target to eliminate score inflation in calibrated rubric scoring
Neighborhood — ranked by edge-count
Methods (1)
method
- Primary scoring method: scorer sees three reference responses at known quality levels alongside each target to eliminate inflation
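A minimal sketch of how such a scoring input could be assembled, assuming the scorer receives one text prompt. The `Anchor` type, the `build_scoring_prompt` helper, the quality labels, and the numeric scale are all hypothetical names introduced for illustration; the source specifies only that three reference responses at known quality levels are shown alongside each target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    """A reference response whose quality level is already known (hypothetical type)."""
    label: str  # e.g. "low", "mid", "high" (assumed labels)
    score: int  # known rubric score for this reference (assumed scale)
    text: str


def build_scoring_prompt(rubric: str, anchors: list[Anchor], target: str) -> str:
    """Place three known-quality references next to the target response."""
    if len(anchors) != 3:
        raise ValueError("expected exactly three anchor responses")

    # Show references in ascending score order so the scale is easy to read.
    ordered = sorted(anchors, key=lambda a: a.score)

    parts = [f"Rubric:\n{rubric}", ""]
    for anchor in ordered:
        parts.append(
            f"Reference response ({anchor.label}, score {anchor.score}):\n{anchor.text}"
        )
        parts.append("")
    parts.append(f"Target response:\n{target}")
    parts.append("")
    parts.append("Score the target on the same scale as the references.")
    return "\n".join(parts)
```

Sorting the references by score is a design choice of this sketch, not something the source prescribes; the point is only that the scorer sees fixed calibration points before judging the target.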
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- External structure (in-context examples, retrieval, tuning) that biases latent pattern activation.
- Fixed dev pool of 1000 prompts used for whitening and z-scoring parameters.
- A central claim about the operational value of S.
- (ii) Does the anchoring score S = ρd − dr − log k consistently correlate with performance across anchoring methods? (question, cosine 0.694; second research question in E2)
- Baseline method: sweeps over shot count and resamples prompts; calibrates threshold for P(TRUE)-P(FALSE); performed surprisingly weakly
- Assumption that structured inputs act as anchors biasing activation toward target patterns PT.
- Mean of S(ℓ) across layers; a weaker geometric correlate of θ50