method
active
method:mt-bench

MT-Bench

Benchmark used to measure general task performance of LLMs before and after SOO fine-tuning

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • BIG-bench · framework · 0.780
    Large-scale collaborative benchmark for LLM capabilities, cited.
  • MTAdam · method · 0.745
    Automatic balancing of multiple training loss terms.
  • EQ-Bench · method · 0.701
    Emotional intelligence benchmark (171 problems) used to check whether activation capping degrades soft skills.
  • Aligned-MTL · method · 0.697
    Independent component alignment for multi-task learning.
  • Medium-term memory · concept · 0.688
    Vascular clamp's function: holding specific predictions stable over timescales longer than working memory.
  • Meta CICERO · concept · 0.686
    AI system that mastered Diplomacy using deception despite being designed for cooperation; cited as an example of AI deception.
  • Meta-learning · concept · 0.685
    The capability of GPT-3 to learn tasks from few-shot prompts during runtime.
  • Host institution for the Architecture Machine Group and Alexander's early design research.