finding
active
finding:optimal-learning-rate-decreases-as-a-power-law-with-compute-budget

Optimal learning rate decreases as a power law with compute budget.

Observed hyperparameter trend: larger compute budgets call for smaller optimal learning rates.
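One way to make the stated trend concrete is to write it as lr_opt(C) ≈ a · C^(-b) with b > 0, which is a straight line in log-log space. The sketch below fits that form with ordinary least squares. The function names `fit_power_law` and `predict_lr`, the symbols `a` and `b`, and all numeric values are illustrative assumptions. They are not taken from this finding, which states neither an exponent nor the underlying data.

```python
import numpy as np


def fit_power_law(compute, lr_opt):
    """Fit lr_opt ~= a * compute**(-b) by least squares in log-log space.

    Returns (a, b). A positive b means the optimal learning rate
    decreases as compute grows.
    """
    log_c = np.log(np.asarray(compute, dtype=float))
    log_lr = np.log(np.asarray(lr_opt, dtype=float))
    # log(lr) = log(a) - b * log(C), so the slope is -b and the intercept is log(a).
    slope, intercept = np.polyfit(log_c, log_lr, 1)
    return float(np.exp(intercept)), float(-slope)


def predict_lr(compute, a, b):
    """Evaluate the fitted power law at new compute budgets."""
    return a * np.asarray(compute, dtype=float) ** (-b)


# Synthetic placeholder data generated from a known power law (NOT measurements).
compute = np.array([1e17, 1e18, 1e19, 1e20])  # hypothetical budgets, e.g. FLOPs
lr_opt = 0.5 * compute ** (-0.125)            # hypothetical a = 0.5, b = 0.125

a, b = fit_power_law(compute, lr_opt)
extrapolated = predict_lr([1e21, 1e22], a, b)
```

With real sweep results, `compute` and `lr_opt` would hold the budget and the best learning rate found at each budget. A clearly positive fitted `b` with small residuals in log-log space would be consistent with the finding, while curvature in the residuals would suggest the power law holds only over part of the range.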

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.