concept
active
concept:exponential-moving-average-ema
Exponential Moving Average (EMA)
Used to estimate gradients dynamically, with forgetting rate β.
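As a minimal sketch of the update this description implies, the snippet below keeps a running gradient estimate in which β weights the previous estimate and (1 − β) weights the newest gradient. This convention is an assumption, chosen because it makes β = 0 equivalent to using no EMA. The function names `ema_update` and `ema_gradients` and the zero initialization are hypothetical, not taken from the source.

```python
def ema_update(prev, grad, beta):
    """One EMA step, element-wise: beta * prev + (1 - beta) * grad.

    Assumed convention: beta is the weight kept on the old estimate,
    so beta = 0 returns the raw gradient unchanged.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return [beta * p + (1.0 - beta) * g for p, g in zip(prev, grad)]


def ema_gradients(grads, beta):
    """Run the EMA over a sequence of gradient vectors.

    The estimate starts at zero (an assumption; no bias correction is applied).
    Returns the estimate after each step.
    """
    estimate = [0.0] * len(grads[0])
    history = []
    for grad in grads:
        estimate = ema_update(estimate, grad, beta)
        history.append(estimate)
    return history


if __name__ == "__main__":
    batch_grads = [[1.0, 2.0], [3.0, 4.0]]
    print(ema_gradients(batch_grads, beta=0.5))
    print(ema_gradients(batch_grads, beta=0.0))
```

A larger β smooths more heavily across batches, while β = 0 simply passes each batch gradient through.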
Neighborhood — ranked by edge-count
Methods (1)
method
- The proposed method combining loss-scale balancing via logarithm transformation and gradient-magnitude balancing via maximum-norm normalization.
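The two balancing steps named in this entry can be illustrated roughly as follows. This is a sketch under stated assumptions, not the method's confirmed implementation: it assumes each task loss is positive, that every task gradient is rescaled to the largest gradient norm among the tasks, and that the rescaled gradients are then summed. The helper names and the `eps` stabilizer are hypothetical.

```python
import math


def log_transform(losses, eps=1e-8):
    """Loss-scale balancing: take the log of each (positive) task loss."""
    return [math.log(loss + eps) for loss in losses]


def l2_norm(vec):
    """Euclidean norm of a plain list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def max_norm_balance(task_grads, eps=1e-8):
    """Gradient-magnitude balancing (assumed form).

    Rescale every task gradient to the maximum norm across tasks,
    then sum the rescaled gradients into one update direction.
    """
    norms = [l2_norm(g) for g in task_grads]
    alpha = max(norms)  # target magnitude shared by all tasks
    combined = [0.0] * len(task_grads[0])
    for grad, norm in zip(task_grads, norms):
        scale = alpha / (norm + eps)
        for i, value in enumerate(grad):
            combined[i] += scale * value
    return combined


if __name__ == "__main__":
    print(log_transform([1.0, 100.0]))
    print(max_norm_balance([[3.0, 4.0], [0.0, 1.0]]))
```

In the example call, the second task's gradient has norm 1 and is scaled up to norm 5 to match the first, so neither task dominates the combined direction purely because of its gradient magnitude.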
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Used in DB-MTL to estimate batch gradient expectations dynamically
- Used in self-prior training as a slow target network for KL regularization
- Primary quantitative measure of distributional divergence between natural and intervened representations
- Device to record and stimulate electrical activity of neural cultures.
- EI and normalized EI could serve as a unified metric for out-of-distribution generalization. · claim · 0.699 · Conjecture that maximizing EI yields causal representations invariant to distribution shifts.
- DB-MTL with EMA forgetting rate β in a wide range performs better than without EMA (β=0) on Office-31. · finding · 0.694 · Effect of EMA forgetting rate on performance.
- Baseline that minimizes sum of task losses with equal weights.
- Empirical result showing the CL loss can reduce divergence without sacrificing interpretability accuracy