quote
active
As can be seen, most MTL baselines perform better than STL on semantic segmentation and depth estimation, but have a large drop on the surface normal prediction task, suffering from the task balancing problem.
Observation illustrating the task balancing problem on NYUv2.
Source paper
extracted_from (2023) · Baijiong Lin · Weisen Jiang · Feiyang Ye · Yu Zhang +5
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Effect on gradient conflict.
- Concise summary of the DB-MTL method from the abstract.
- Selective pressure toward convergence via task generality
- Analysis of gradient conflict reduction.
- Computational efficiency comparison.
- We hypothesize that degraded generalization on benchmarks like MMLU may reflect the computational demands of the tasks. · hypothesis · 0.758 · Connecting the paper's task-difficulty findings to prior observations of weak generalization on complex QA benchmarks.
- Contrasts with temporal permutation where Span Representation dominates; suggests spatio permutation reveals different dynamics.
- DB-MTL with EMA forgetting rate β in a wide range performs better than without EMA (β=0) on Office-31. · finding · 0.750 · Effect of EMA forgetting rate on performance.