method
active
method:elo-rating-conversion
Elo Rating Conversion
Pairwise comparison results converted to Elo ratings for Alexander mirror aesthetic rankings
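The conversion described above can be illustrated with a minimal sketch. The source does not specify the update rule or its parameters, so everything here is an assumption: the standard sequential Elo update, a K-factor of 32, an initial rating of 1500, and the hypothetical function name `elo_from_pairwise`. It is not the procedure actually used for the rankings.

```python
from collections import defaultdict


def elo_from_pairwise(comparisons, k=32.0, initial=1500.0):
    """Convert ordered (winner, loser) pairs into Elo ratings.

    Assumptions (not taken from the source): standard logistic Elo
    with a 400-point scale, K-factor `k`, and every item starting
    at `initial`.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in comparisons:
        r_w, r_l = ratings[winner], ratings[loser]
        # Expected score of the winner under the logistic Elo model.
        expected_w = 1.0 / (1.0 + 10.0 ** ((r_l - r_w) / 400.0))
        # The winner scored 1, so the surprise is (1 - expected_w).
        delta = k * (1.0 - expected_w)
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return dict(ratings)


# Hypothetical comparison outcomes: each tuple is (preferred, not preferred).
example = [("A", "B"), ("A", "C"), ("B", "C")]
ranking = sorted(elo_from_pairwise(example).items(), key=lambda kv: -kv[1])
```

Sequential Elo updates depend on the order in which comparisons are processed, so a real pipeline might shuffle and average over several passes, or fit a Bradley-Terry model instead; the source does not say which approach applies.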
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one. These are candidates for new edges or unrecognized duplicates.
- A rating system used to compare model helpfulness and harmlessness based on crowdworker preferences.
- Ratio of reflection steps to total reasoning steps, used to quantify reflection behavior
- Primary metric for all benchmarks, measuring fraction of tasks that meet benchmark-specific pass criteria
- Primary metric: percentage of responses containing multiple attempts that successfully improve on the first attempt
- A transformation that introduces intermediate-sized centers to fill out the hierarchy of scales, strengthening larger centers.
- Algorithm computing both equality relations separately before comparing them in hierarchical equality task