method
active
method:llm-judge-evaluation

LLM judge evaluation

Uses Claude Sonnet 4 as a grader that assigns each model response to one of a set of predefined categories.
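The grading loop described above can be sketched roughly as follows. This is a minimal illustration, not the implementation behind this entry: the category labels, the prompt wording, and the helper names (`build_prompt`, `parse_label`, `grade`) are all assumptions, and the `judge` argument stands in for whatever call actually sends a prompt to the grader model and returns its text.

```python
from typing import Callable, Sequence

# Hypothetical labels: the source does not list the actual criteria.
CATEGORIES = ("compliant", "refusal", "other")


def build_prompt(response: str, categories: Sequence[str]) -> str:
    """Build a grader prompt that asks for exactly one category label."""
    options = ", ".join(categories)
    return (
        "Classify the model response below into exactly one of these "
        f"categories: {options}.\n"
        "Answer with the category label only.\n\n"
        f"Response:\n{response}"
    )


def parse_label(raw: str, categories: Sequence[str]) -> str:
    """Map the grader's raw text to a known label, or 'unparsed'."""
    cleaned = raw.strip().lower().strip(".\"'")
    for label in categories:
        if cleaned == label.lower():
            return label
    return "unparsed"


def grade(
    response: str,
    judge: Callable[[str], str],
    categories: Sequence[str] = CATEGORIES,
) -> str:
    """Grade one response; `judge` is a placeholder for the grader model call."""
    return parse_label(judge(build_prompt(response, categories)), categories)
```

Keeping the grader call behind a plain callable lets the prompt and parsing logic be checked with a stub before any model is involved, and the explicit `unparsed` label keeps off-format grader output visible rather than silently counted in a real category.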

Neighborhood — ranked by edge-count

Findings (1)

finding

Methods (1)

method
  • Baseline comparison for data attribution; outperformed by probe-based approach.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.