method
active
method:calibrated-rubric-scoring
Calibrated Rubric Scoring
Primary scoring method: the scorer sees three reference responses at known quality levels alongside each target response, to eliminate score inflation.
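A minimal sketch of how such a calibrated prompt might be assembled, assuming the three references carry integer scores fixed in advance. The names `Reference` and `build_calibrated_prompt`, the prompt wording, and the example score values are hypothetical illustrations, not part of the method as recorded here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    score: int  # known quality level, fixed in advance (assumed integer scale)
    text: str   # the reference response shown to the scorer


def build_calibrated_prompt(rubric: str, references: list[Reference], target: str) -> str:
    """Assemble a scoring prompt that shows three anchors next to the target."""
    if len(references) != 3:
        raise ValueError("expected exactly three reference responses")
    # Order anchors from lowest to highest known score so the scale reads in order.
    ordered = sorted(references, key=lambda r: r.score)
    lines = [f"Rubric: {rubric}", ""]
    for ref in ordered:
        lines.append(f"Reference (known score {ref.score}): {ref.text}")
    lines += ["", f"Target response: {target}",
              "Score the target relative to the references."]
    return "\n".join(lines)
```

Showing the anchors in score order is one design choice among several; the point of the sketch is only that every target is judged next to the same fixed references rather than in isolation.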
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Three reference responses at known quality levels shown alongside each target to eliminate score inflation in calibrated rubric scoring
Methods (1)
method
- Koan Battery (uses): Assessment framework for measuring introspection and self-observation in LLMs; grounded in Janus's architectural theory.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- LLM-based judge scoring reflection segments on a 1-5 scale for the presence of a first-person felt state; used in Experiment 4
- LLM judge scoring rubric rating introspective quality of reflection segments from 1 (no felt state) to 5 (very strong introspection)
- A scoring rule optimized by predicting true probabilities; log-loss is one.
- Rubric where LLM rates how well a feature's interpretation matches the activating text.
- Score = (sum of completed quartet values) × (number of quartets), making portfolio composition consequential.
- Factor analysis on 2224 data points revealing PC1 explains 82% of variance; six dimensions are not independent