finding
active
finding:db-mtl-training-losses-decrease-smoothly-and-gradient-norms-are-lower-than-ew-on-nyuv2-indicating-training-stability
DB-MTL training losses decrease smoothly and gradient norms are lower than EW on NYUv2, indicating training stability.
Training stability analysis.
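The two quantities this finding relies on, the loss-curve smoothness and the gradient norm, can be sketched as simple diagnostics. This is a minimal illustration and not the paper's measurement procedure: the helper names `global_grad_norm` and `smoothness`, the choice of a global L2 norm, and the example values are all assumptions made here.

```python
import math


def global_grad_norm(grads):
    """L2 norm over every gradient entry, given a list of flat per-parameter lists."""
    return math.sqrt(sum(g * g for vec in grads for g in vec))


def smoothness(losses):
    """Mean absolute step-to-step change in the loss; smaller means a smoother curve."""
    diffs = [abs(b - a) for a, b in zip(losses, losses[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical values for illustration only, not results from the paper.
example_grads = [[3.0, 0.0], [0.0, 4.0]]
example_losses = [1.0, 0.8, 0.7, 0.65]

print(global_grad_norm(example_grads))
print(smoothness(example_losses))
```

Logging both values at every training step for DB-MTL and for the EW baseline would allow a comparison of the kind this finding describes.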
Source paper
extracted_from · (2023) · Baijiong Lin · Weisen Jiang · Feiyang Ye · Yu Zhang +5
Neighborhood — ranked by edge-count
Claims (1)
claim
- Training stability claim.
Communities (3)
community
- Dual-balancing multi-task learning · members_of · DB-MTL jointly balances loss scale and gradient magnitude, benchmarked on NYUv2 and Office-31.
- Dual balancing multi-task learning · members_of · DB-MTL combines loss-scale and gradient-magnitude balancing, benchmarked across NYUv2, Cityscapes, QM9, and Office datasets.
- Explores gradient/loss balancing techniques with exponential moving average forgetting rates, evaluated on dense prediction tasks like semantic segmentation.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Computational efficiency comparison.
- Effect on gradient conflict.
- Analysis of gradient conflict reduction.
- Concise summary of the DB-MTL method from the abstract.
- Ethical implication about the nature of AI training experience if the thesis holds
- Setting αk to the maximum gradient norm performs best among tested strategies on NYUv2 (Figure 6). · finding · 0.790 · Sensitivity analysis for gradient normalization scaling factor.
- Core claim of the paper.
- DB-MTL achieves ∆p = +1.15±0.16 on NYUv2, outperforming all baselines including state-of-the-art · finding · 0.769 · Primary empirical validation on scene understanding task