concept
active
concept:zheng-et-al-2023-judging-llm-as-a-judge-with-mt-bench-and-chatbot-arena

Zheng et al. 2023 - Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Source paper for MT-Bench, the evaluation benchmark used to assess capabilities after SOO fine-tuning.

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.