Ivor W. Tsang

Author at the Centre for Frontier AI Research, A*STAR.

Authored: 1 · Introduces: 0 · Studies: 0 · Affiliations: 0 · Cited by: 0

Authored papers (1)

  • Simultaneously addressing loss-scale and gradient-magnitude imbalance in multi-task learning yields consistently state-of-the-art performance across five benchmarks. Measured by ∆p (average relative improvement over single-task learning; higher is better), DB-MTL (Dual-Balancing Multi-Task Learning) achieves +1.15% on NYUv2 (the next-best competitor, GLS, reaches +0.30%), +8.91% on NYUv2 with SegNet (surpassing Aligned-MTL's +8.16%), −58.10% on QM9 (versus Nash-MTL's −73.92%), and +1.05% on Office-31. DB-MTL combines two parameter-free, training-efficient components. The first is a logarithm transformation applied to each task loss; IMTL-L provably reduces to this transformation when its learnable scale parameter sits at its exact minimizer. The second is a maximum-norm gradient normalization that, at each iteration, smooths the task gradients with an exponential moving average and rescales them all to the magnitude of the largest one. On NYUv2, most competing methods (including PCGrad, CAGrad, Nash-MTL, and IMTL-G) improve semantic segmentation and depth estimation over single-task learning but degrade surface normal prediction, a failure mode DB-MTL specifically avoids. Applied on its own, the logarithm transformation also improves six existing gradient balancing methods on NYUv2 (PCGrad, GradVac, IMTL-G, CAGrad, Nash-MTL, Aligned-MTL), and ablation confirms that both components contribute additively across all five datasets. The authors argue that loss-scale and gradient-magnitude imbalances are therefore complementary failure modes that neither pure loss balancing nor pure gradient balancing can resolve alone, and that a non-parametric dual correction is sufficient to close this gap without adding computational overhead relative to existing gradient balancing methods.
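The two components described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names, the EMA convention (`beta * old + (1 - beta) * new`), the default `beta`, and the `eps` guard are all assumptions made for illustration. The helper `imtl_l_objective` assumes an IMTL-L-style per-task objective of the form exp(s)·loss − s, shifted by −1, and is included only to show numerically that its minimum over s equals log(loss).

```python
import math


def imtl_l_objective(loss, s):
    # Assumed IMTL-L-style objective with learnable scale s, shifted by -1.
    # Setting the derivative exp(s)*loss - 1 to zero gives s* = -log(loss),
    # and the value at s* is exactly log(loss).
    return math.exp(s) * loss - s - 1.0


def l2_norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def dual_balance_step(losses, grads, ema, beta=0.9, eps=1e-8):
    """One hypothetical dual-balancing step on per-task gradients.

    losses: K positive task losses.
    grads:  K gradients (lists of length D) w.r.t. the shared parameters.
    ema:    K running averages from the previous step (zeros initially).
    Returns (update_direction, new_ema).
    """
    # 1) Loss-scale balancing: the gradient of log(loss_k) is grad_k / loss_k.
    log_grads = [[g / (loss + eps) for g in grad]
                 for loss, grad in zip(losses, grads)]

    # 2) Smooth each task gradient with an exponential moving average
    #    (this convention and the default beta are assumptions).
    new_ema = [[beta * e + (1.0 - beta) * g for e, g in zip(e_row, g_row)]
               for e_row, g_row in zip(ema, log_grads)]

    # 3) Gradient-magnitude balancing: rescale every smoothed gradient to
    #    the largest norm among them, then sum over tasks.
    norms = [l2_norm(row) for row in new_ema]
    alpha = max(norms)
    scaled = [[alpha * x / (n + eps) for x in row]
              for row, n in zip(new_ema, norms)]
    update = [sum(col) for col in zip(*scaled)]
    return update, new_ema


# Two toy tasks whose losses differ by two orders of magnitude.
losses = [1.0, 100.0]
grads = [[3.0, 4.0], [0.0, 200.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
# beta=0.0 disables smoothing so the arithmetic is easy to follow by hand.
update, ema = dual_balance_step(losses, grads, zeros, beta=0.0)
print(update)
```

In this toy case the log transformation turns the raw gradients into roughly [3, 4] and [0, 2]. The max-norm step then stretches the second to norm 5, so both tasks contribute equally to the update despite the 100× difference in loss scale.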

Co-authors (8)

Recent mentions (1)