framework
active
framework:big-bench

BIG-bench

Large-scale collaborative benchmark for evaluating LLM capabilities; cited in the source.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • MT-Bench · method · 0.780
    Benchmark used to measure general task performance of LLMs before and after SOO fine-tuning.
  • EQ-Bench · method · 0.734
    Emotional intelligence benchmark (171 problems) used to check if activation capping degrades soft skills.
  • larger wholes · concept · 0.707
    The broader field of centers that encompasses a given center; a successful center contributes to and is shaped by these larger wholes.
  • monitors · concept · 0.707
    Synchronization construct encapsulating shared data and protected access routines.
  • Base-10 addition · concept · 0.699
    The generic addition mechanism that Llama-3.1-8B actually uses to compute sums before mapping back to cyclic concept space.
  • Desktop · framework · 0.698
    GUI window management construct supporting MDI-style display of applications, used as a top-level backplane facility.
  • Google · institute · 0.697
    Murray Shanahan's part-time employer and provider of LLM technology.
  • Boundaries · concept · 0.696
    The property that living centers are formed and strengthened by boundaries which both separate and unite; the boundary must be of the same order of magnitude as the center being bounded and is itself made of centers.
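
The selection rule stated for this list (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This is not the system's actual implementation: the `Neighbor` type, the `untyped_neighbors` function, and the two `hypothetical-*` entries are assumptions introduced only for illustration, and the scores are taken as already computed.

```python
from dataclasses import dataclass

THRESHOLD = 0.65  # cosine cutoff shown on this page


@dataclass(frozen=True)
class Neighbor:
    name: str      # entity name
    kind: str      # entity type, e.g. "method" or "concept"
    cosine: float  # precomputed cosine similarity to the current entity


def untyped_neighbors(scored, typed_edge_targets, threshold=THRESHOLD):
    """Keep neighbors at or above the cutoff that have no typed edge, best first."""
    kept = [
        n for n in scored
        if n.cosine >= threshold and n.name not in typed_edge_targets
    ]
    return sorted(kept, key=lambda n: n.cosine, reverse=True)


scored = [
    Neighbor("Boundaries", "concept", 0.696),
    Neighbor("MT-Bench", "method", 0.780),
    Neighbor("EQ-Bench", "method", 0.734),
    # The two entries below are invented to exercise the filter.
    Neighbor("hypothetical-low", "concept", 0.50),    # below the cutoff
    Neighbor("hypothetical-linked", "method", 0.90),  # already has a typed edge
]

result = untyped_neighbors(scored, typed_edge_targets={"hypothetical-linked"})
for n in result:
    print(f"{n.name} · {n.kind} · {n.cosine:.3f}")
```

A high score alone does not establish a meaningful relation: several entries in the list above (for example monitors or Desktop) have no evident topical connection to a benchmark, so each candidate still needs review before a typed edge is added.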