method
active
method:cross-judge-analysis
Cross-Judge Analysis
Validation of judge model robustness by regrading 1000 responses with 4 additional judge models
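
One way such a regrading check could be summarized is by pairwise agreement between judges on the same responses. The sketch below is an illustration only, not the procedure used for this entity: the judge names, the binary labels, and the choice of raw agreement plus Cohen's kappa as the statistics are all assumptions.

```python
from itertools import combinations


def agreement_rate(a, b):
    """Fraction of responses on which two judges assign the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("grade lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two judges (Cohen's kappa)."""
    n = len(a)
    p_observed = agreement_rate(a, b)
    labels = set(a) | set(b)
    # Expected agreement if both judges labeled independently
    # according to their own label frequencies.
    p_expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_expected == 1.0:
        return 1.0  # both judges always use the same single label
    return (p_observed - p_expected) / (1.0 - p_expected)


def cross_judge_table(grades):
    """Map each judge pair to (raw agreement, kappa).

    `grades` maps a judge name to its list of labels, one per response,
    with responses in the same order for every judge.
    """
    return {
        (j1, j2): (
            agreement_rate(grades[j1], grades[j2]),
            cohen_kappa(grades[j1], grades[j2]),
        )
        for j1, j2 in combinations(sorted(grades), 2)
    }


# Hypothetical toy data: one original judge plus four additional judges,
# four responses each (a real run would have 1000 responses per judge).
toy_grades = {
    "judge_original": [1, 0, 1, 1],
    "judge_a": [1, 0, 0, 1],
    "judge_b": [1, 0, 1, 1],
    "judge_c": [0, 0, 1, 1],
    "judge_d": [1, 1, 1, 1],
}
table = cross_judge_table(toy_grades)
```

Kappa is reported alongside raw agreement because raw agreement can look high when one label dominates; whether that correction matters here depends on the label distribution, which the description does not state.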
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The quality-control procedure used in Peru: four team members in four different families, rejecting any observation not confirmed by all four
- Explicit textual or graphical links between parts of a work, dynamic and virtual.
- Measuring AUROC of a probe trained on one task when evaluated on another task to assess universality.
- Whether learned cones transfer effectively across model families (Qwen vs Gemma) and sizes
- Research institute in Kyoto where Witkowski is based.
- Asymmetric transfer after fine-tuning: high-density bases (B10) are more robust.
- Technique used to demonstrate that the self-prior captures visual–proprioceptive associations by recovering visual appearance from proprioception alone
- Natural breeze passing through an apartment via opposing windows, important in hot, humid climates.