Multiple Gradient Descent Algorithm (MGDA)

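Gradient-balancing method that solves a multi-objective optimization problem to find the minimum-norm aggregated gradient, i.e. the minimum-norm point in the convex hull of the per-task gradients.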
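
As a minimal sketch of the minimum-norm idea, assume there are only two task gradients, g1 and g2, given as plain Python lists. In that case the weight alpha that minimizes the norm of alpha*g1 + (1 - alpha)*g2 over alpha in [0, 1] has a closed form. The function name min_norm_two_task and the list representation are illustrative assumptions, not taken from this entry or from any particular library; with more than two tasks an iterative solver would be needed instead.

```python
def min_norm_two_task(g1, g2):
    """Return (alpha, d) where d = alpha*g1 + (1 - alpha)*g2 has minimum
    Euclidean norm over alpha in [0, 1]. Hypothetical helper for illustration."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(x * x for x in diff)
    if denom == 0.0:
        # Identical gradients: every convex combination is the same vector.
        return 0.5, list(g1)
    # Unconstrained minimizer of ||g2 + alpha*(g1 - g2)||^2 ...
    alpha = sum((b - a) * b for a, b in zip(g1, g2)) / denom
    # ... clipped to [0, 1] so the result stays in the convex hull.
    alpha = min(1.0, max(0.0, alpha))
    d = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, d


if __name__ == "__main__":
    # Orthogonal gradients of equal norm get equal weight.
    print(min_norm_two_task([1.0, 0.0], [0.0, 1.0]))
    # Directly opposed gradients cancel: the aggregated gradient is zero.
    print(min_norm_two_task([1.0, 0.0], [-1.0, 0.0]))
```

A zero aggregated gradient, as in the second call, means no direction improves both objectives at once. When the gradients point the same way, the clipping step simply selects the one with the smaller norm.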

Neighborhood — ranked by edge-count

artifact

Lin 2023 Dual-Balancing for Multi-Task Learning
mentions
The paper proposing the Dual-Balancing Multi-Task Learning method.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Gradient Descent · method · 0.810
Used for updating hidden state expectations; provides dynamical process theory testable against neuronal data
Multi-Agent Deep Deterministic Policy Gradient (MADDPG) · method · 0.758
RL algorithm used to train baseline agents in the physical deception environment
Gradient-based data attribution · method · 0.750
Baseline method against which probe-based ranking is compared; more computationally expensive.
Gradient method · method · 0.750
Optimization technique that computes weight changes by following the gradient of an error function; contrasted with evolutionary stochastic search.
Gradient Descent on Free Energy · method · 0.743
Optimization procedure for simultaneously updating action selection and perception; uses step size ζ (default 4).
Gradient Descent Rotation Optimization · method · 0.732
DAS uses SGD over differentiable parameterizations of orthogonal matrices (via PyTorch) to find optimal distributed alignments.
Multiple-choice evaluation method for PM training · method · 0.732
Using language model log probabilities of answer choices (A)/(B) to produce preference labels.
We hypothesize that degraded generalization on benchmarks like MMLU may reflect the computational demands of the tasks. · hypothesis · 0.731
Connecting the paper's task-difficulty findings to prior observations of weak generalization on complex QA benchmarks.