claim

active

claim:the-proposed-gradient-magnitude-balancing-method-consistently-outperforms-gradnorm-as-it-guarantees-equal-gradient-magnitudes-and-considers-update-magnitude

The proposed gradient-magnitude balancing method consistently outperforms GradNorm, as it guarantees equal gradient magnitudes and considers update magnitude.

Advantage over GradNorm.

Source paper

extracted_from

Dual-Balancing for Multi-Task Learning

(2023) · Baijiong Lin · Weisen Jiang · Feiyang Ye · Yu Zhang +5

Neighborhood — ranked by edge-count

Findings (1)

finding

The gradient-magnitude balancing method outperforms GradNorm on NYUv2, Cityscapes, Office-31, Office-Home.
restates · supports
Comparison of gradient-magnitude balancing with GradNorm.

Communities (3)

community

Dual-balancing multi-task learning
members_of
DB-MTL jointly balances loss scale and gradient magnitude, benchmarked on NYUv2 and Office-31.
Dual balancing multi-task learning
members_of
DB-MTL combines loss-scale and gradient-magnitude balancing, benchmarked across NYUv2, Cityscapes, QM9, and Office datasets.
Gradient magnitude balancing for multitask learning
members_of
Methods that equalize gradient magnitudes across tasks to improve multitask optimization, outperforming GradNorm on vision and domain adaptation benchmarks.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Loss-scale balancing and gradient-magnitude balancing are complementary and combining them achieves the best performance. · claim · 0.848
Ablation conclusion.
gradient-magnitude balancing · concept · 0.838
Addressing disparity in gradient magnitudes across tasks at the gradient level
We find that the logarithm transformation also benefits existing gradient balancing methods. · quote · 0.814
Key finding showing the broader utility of the log transformation.
The logarithm transformation also benefits existing gradient balancing methods. · claim · 0.802
Generalization of the loss transformation.
Setting the aggregated gradient scaling factor to the maximum gradient norm performs best for task balancing. · claim · 0.796
Empirical finding on the choice of α_k in the gradient normalization strategy.
DB-MTL achieves loss-scale balancing by performing logarithm transformation on each task loss, and rescales gradient magnitudes by normalizing all task gradients to comparable magnitudes using the maximum gradient norm. · quote · 0.782
Concise summary of the DB-MTL method from the abstract.
Combining loss-scale and gradient-magnitude balancing achieves Δp = +1.15±0.16 on NYUv2. · finding · 0.780
Full DB-MTL ablation result.
Task balancing requires simultaneous consideration of both loss scales and gradient magnitudes. · claim · 0.777
Core interpretive position of DB-MTL: complementarity of loss and gradient perspectives.
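
Illustrative sketch. The DB-MTL summary quoted in the list above (log-transform each task loss, then rescale all task gradients to the maximum gradient norm) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `log_transform_losses` and `balance_gradients`, the `eps` stabilizer, and the toy gradients are assumptions introduced here, and any details of the paper's procedure beyond the quoted summary are omitted.

```python
import numpy as np


def log_transform_losses(losses, eps=1e-8):
    """Loss-scale balancing: take the logarithm of each positive task loss.

    `eps` is an assumed stabilizer to avoid log(0); it is not from the source.
    """
    losses = np.asarray(losses, dtype=float)
    return np.log(losses + eps)


def balance_gradients(task_grads, eps=1e-8):
    """Gradient-magnitude balancing: rescale every task gradient so that
    its norm matches the largest task-gradient norm, then sum them.

    task_grads: array-like of shape (num_tasks, num_params).
    Returns (aggregated_update, rescaled_task_grads).
    """
    task_grads = np.asarray(task_grads, dtype=float)
    norms = np.linalg.norm(task_grads, axis=1)       # one norm per task
    alpha = norms.max()                              # maximum gradient norm
    scaled = task_grads * (alpha / (norms + eps))[:, None]
    return scaled.sum(axis=0), scaled


# Toy gradients (hypothetical values) whose norms differ by 100x.
grads = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.05]]
update, scaled = balance_gradients(grads)
scaled_norms = np.linalg.norm(scaled, axis=1)
print(scaled_norms)  # all three norms are approximately equal to the largest original norm
print(update)
```

After rescaling, every task gradient has (up to `eps`) the same norm, which is the "equal gradient magnitudes" property the claim refers to, and the aggregated update inherits the scale of the largest task gradient.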

Restated by (1)

cosine ≥ 0.90

Other entities that say roughly the same thing. May be merge candidates or independent restatements across papers.

finding
The gradient-magnitude balancing method outperforms GradNorm on NYUv2, Cityscapes, Office-31, Office-Home.