claim

active

claim:the-logarithm-transformation-also-benefits-existing-gradient-balancing-methods

The logarithm transformation also benefits existing gradient balancing methods.

Generalization of the loss transformation.

Source paper

extracted_from

Dual-Balancing for Multi-Task Learning

(2023) · Baijiong Lin · Weisen Jiang · Feiyang Ye · Yu Zhang +5

Neighborhood — ranked by edge-count

Findings (1)

finding

Logarithm transformation improves PCGrad, GradVac, IMTL-G, CAGrad, Nash-MTL, and Aligned-MTL on NYUv2 (Figure 1).
supports
Effectiveness of logarithm transformation as a plug-in for gradient balancing methods.
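The plug-in use described in this finding can be illustrated with a minimal sketch. It assumes each task loss is strictly positive and relies only on the chain rule: the gradient of log(loss) is the loss gradient divided by the loss value, so a constant rescaling of a task loss cancels before the gradients reach any balancing method. The helper name `log_loss_gradient`, the `eps` guard, and the toy numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def log_loss_gradient(loss_value, loss_grad, eps=1e-8):
    """Gradient of log(loss) via the chain rule: grad(loss) / loss.

    `eps` is an assumed numerical guard against division by zero.
    """
    return np.asarray(loss_grad, dtype=float) / (loss_value + eps)

# Two hypothetical tasks: task B is task A with its loss rescaled by 1000x,
# so its raw gradient is 1000x larger as well.
loss_a, grad_a = 0.5, np.array([0.2, -0.1])
loss_b, grad_b = 1000 * loss_a, 1000 * grad_a

# After the logarithm transformation both gradients are (nearly) identical,
# so a downstream gradient balancing method sees tasks on a common scale.
g_a = log_loss_gradient(loss_a, grad_a)
g_b = log_loss_gradient(loss_b, grad_b)
```

In this sketch `g_a` and `g_b` would then be passed to the gradient balancing method in place of the raw task gradients.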

Communities (3)

community

Dual-balancing multi-task learning
members_of
DB-MTL jointly balances loss scale and gradient magnitude, benchmarked on NYUv2 and Office-31.
Dual balancing multi-task learning
members_of
DB-MTL combines loss-scale and gradient-magnitude balancing, benchmarked across NYUv2, Cityscapes, QM9, and Office datasets.
Loss-scale balancing via logarithmic transformation
members_of
Parameter-free logarithm transformation for multi-task learning that improves gradient balancing methods like PCGrad and Nash-MTL across vision benchmarks.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

We find that the logarithm transformation also benefits existing gradient balancing methods. · quote · 0.964
Key finding showing the broader utility of the log transformation.
Logarithm transformation is simpler and more effective than learnable loss transformation · claim · 0.825
Compared to IMTL-L: parameter-free, no extra computational cost, achieves same theoretical goal
The logarithm transformation (loss-scale balancing) consistently outperforms IMTL-L on NYUv2, Cityscapes, Office-31, Office-Home. · finding · 0.815
Comparison of loss-scale balancing with IMTL-L.
The logarithm transformation is simpler and more effective than IMTL-L because it is parameter-free. · claim · 0.805
Comparison of loss-scale balancing techniques.
The proposed gradient-magnitude balancing method consistently outperforms GradNorm, as it guarantees equal gradient magnitudes and considers update magnitude. · claim · 0.802
Advantage over GradNorm.
DB-MTL achieves loss-scale balancing by performing logarithm transformation on each task loss, and rescales gradient magnitudes by normalizing all task gradients to comparable magnitudes using the maximum gradient norm. · quote · 0.796
Concise summary of the DB-MTL method from the abstract.
Loss-scale balancing and gradient-magnitude balancing are complementary and combining them achieves the best performance. · claim · 0.774
Ablation conclusion.
The gradient-magnitude balancing method outperforms GradNorm on NYUv2, Cityscapes, Office-31, Office-Home. · finding · 0.767
Comparison of gradient-magnitude balancing with GradNorm.
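
The gradient-magnitude step mentioned in the DB-MTL summary above (rescaling task gradients using the maximum gradient norm) can be sketched as follows. This is a rough illustration under stated assumptions, not the published algorithm: the function name `normalize_to_max_norm`, the `eps` guard, and the choice to sum the rescaled gradients are all hypothetical, and any further details of the method are omitted.

```python
import numpy as np

def normalize_to_max_norm(task_grads, eps=1e-8):
    """Rescale each task gradient so its norm matches the largest task norm.

    task_grads: array of shape (num_tasks, num_params), one row per task.
    Returns (combined, scaled); combining by summation is an assumption.
    """
    grads = np.asarray(task_grads, dtype=float)
    norms = np.linalg.norm(grads, axis=1)   # per-task gradient norm
    target = norms.max()                    # maximum gradient norm
    scaled = grads * (target / (norms + eps))[:, None]
    return scaled.sum(axis=0), scaled

# Two hypothetical task gradients with norms 5.0 and 0.5.
grads = np.array([[3.0, 4.0],
                  [0.3, 0.4]])
combined, scaled = normalize_to_max_norm(grads)
```

After rescaling, every row of `scaled` has approximately the same norm, so no single task dominates the combined update through gradient magnitude alone.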