finding
active
finding:grid-search-covers-312-130-subjective-reward-functions-per-environment-after-removing-duplicates

Grid search covers 312,130 subjective reward functions per environment after removing duplicates

Scale of the hyperparameter search, which establishes how thorough the optimization was
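
To make "after removing duplicates" concrete, here is a minimal sketch of one way a reward-function grid can be enumerated and deduplicated. Everything in it is an assumption for illustration: this page does not describe the paper's actual parameterization, grid levels, or duplicate criterion. The sketch assumes each reward function is an integer weight vector and treats two vectors as duplicates when one is a positive multiple of the other. The names `canonical` and `unique_reward_grid` are hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd


def canonical(weights):
    """Divide an integer weight vector by its GCD (assumed duplicate criterion).

    Positive multiples of the same vector, such as (1, 1) and (2, 2),
    collapse to the same key.
    """
    g = reduce(gcd, (abs(w) for w in weights))
    if g == 0:
        return tuple(weights)  # all-zero vector has no scale to remove
    return tuple(w // g for w in weights)


def unique_reward_grid(levels, n_components):
    """Enumerate the full Cartesian grid and keep one vector per canonical key."""
    seen = set()
    unique = []
    for weights in product(levels, repeat=n_components):
        key = canonical(weights)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy grid (hypothetical sizes): 3 levels x 2 components = 9 raw combinations.
grid = unique_reward_grid(levels=range(3), n_components=2)
print(len(grid), "unique reward functions out of", 3 ** 2)
```

On this toy grid the raw count is 9 and the deduplicated count is smaller, which is the same kind of reduction the finding's "after removing duplicates" qualifier refers to, though the figure of 312,130 comes from the paper's own grid and not from this sketch.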

Source paper

extracted_from
Exploration Through Introspection: A Self-Aware Reward Model
(2026) · Michael Petrowski · Milica Gašić

Neighborhood — ranked by edge-count

Methods (1)

method
  • Exhaustive search over 312,130 subjective reward functions per environment to find best-performing agents

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.