hypothesis

pending-review

hypothesis:when-task-gradient-norms-differ-greatly-large-norm-tasks-have-not-converged-while-small-norm-tasks-have-nearly-converged

When task gradient norms differ greatly, large-norm tasks have not converged while small-norm tasks have nearly converged

lin-2023-dual-balancing.md

Frontmatter (10 fields)

{
  "doc": "lin-2023-dual-balancing.md",
  "context": "Motivates setting the scaling factor α_k to the maximum task gradient norm so that under-converged (large-norm) tasks can keep learning",
  "category": "ai",
  "norm_label": "When task gradient norms differ greatly, large-norm tasks have not converged while small-norm tasks have nearly converged",
  "graphify_id": "hypothesis_large_gradients_convergence",
  "source_file": "lin-2023-dual-balancing.md",
  "imported_from": "/Users/antonborzov/Documents/Research.nosync/papers/extract_typed_out/lin-2023-dual-balancing/graph.json",
  "extracted_type": "hypothesis",
  "source_location": "§3.2",
  "graphify_file_type": "hypothesis"
}
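
The reasoning in the "context" field can be illustrated with a minimal sketch. It assumes that each task gradient is rescaled to a common magnitude equal to the largest per-task norm and that the rescaled gradients are then summed. The function names (`l2_norm`, `rescale_to_max_norm`, `aggregate`), the `eps` guard, and the choice of summation are hypothetical and used for illustration only; this is not the implementation from lin-2023-dual-balancing.md.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(x * x for x in vec))


def rescale_to_max_norm(task_grads, eps=1e-8):
    """Rescale every task gradient to the largest per-task norm.

    Assumption: alpha is the maximum gradient norm across tasks, so the
    large-norm (under-converged) task keeps its step size and the
    small-norm (nearly converged) tasks are scaled up to match it.
    """
    norms = [l2_norm(g) for g in task_grads]
    alpha = max(norms)
    scaled = [
        [alpha * x / (n + eps) for x in g]  # eps guards against a zero norm
        for g, n in zip(task_grads, norms)
    ]
    return scaled, alpha


def aggregate(task_grads):
    """Sum the rescaled task gradients (summation is an assumption)."""
    scaled, _ = rescale_to_max_norm(task_grads)
    dim = len(scaled[0])
    return [sum(g[i] for g in scaled) for i in range(dim)]


# Two hypothetical tasks: one far from convergence, one nearly converged.
grads = [[3.0, 4.0], [0.03, 0.04]]
scaled_grads, alpha = rescale_to_max_norm(grads)
update = aggregate(grads)
```

With these made-up inputs the per-task norms are 5 and 0.05, so `alpha` is 5 and both rescaled gradients end up with a norm of about 5. Choosing the minimum norm instead would shrink the step for the task that still has the most to learn, which is the situation the hypothesis argues against.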

Outgoing (1)

Supports (1)

Setting aggregated gradient scaling factor to maximum gradient norm performs best for task balancing (claim)

Incoming (0)

None.

Mentions (1)

papers-typed
lin-2023-dual-balancing.md