hypothesis

active

hypothesis:if-ei-maximization-is-used-as-a-regularization-in-representation-learning-then-ood-generalization-will-improve-beyond-current-invariant-risk-minimization-methods

If EI maximization is used as a regularization in representation learning, then OOD generalization will improve beyond current invariant risk minimization methods.

Proposed as a conjecture in §4.3.1 of the source paper.

Source paper

extracted_from

Emergence and Causality in Complex Systems: A Survey on Causal Emergence and Related Quantitative Studies

(2023) · Bing Yuan · Jiang Zhang · Aobo Lyu · Jiayun Wu +5

Neighborhood — ranked by edge-count

Concepts (2)

concept

Effective Information (EI)
associated_with
Core measure of causal effect in Hoel's theory: the mutual information between a system's input and output when the input is set by intervention to a uniform (maximum-entropy) distribution.
Out-of-Distribution (OOD) Generalization
associated_with
Machine learning generalization when training and test distributions differ; linked to causal invariance.
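To make the EI definition above concrete, here is a minimal sketch for a discrete system, assuming the dynamics are given as a row-stochastic transition probability matrix. The function name `effective_information` and the two toy matrices are illustrative assumptions, not taken from the source paper. With a uniform intervention on the input, EI reduces to the average KL divergence of each row from the mean row.

```python
import math

def effective_information(tpm):
    """EI in bits of a row-stochastic matrix under a uniform input intervention.

    tpm[i][j] is the probability of moving from state i to state j.
    """
    n = len(tpm)
    m = len(tpm[0])
    # Output distribution when each input state is forced with probability 1/n.
    avg_out = [sum(row[j] for row in tpm) / n for j in range(m)]
    total = 0.0
    for row in tpm:
        for p, q in zip(row, avg_out):
            if p > 0:  # treat 0 * log(0) as 0
                total += p * math.log2(p / q)
    # Mean KL divergence of the rows from the average output distribution.
    return total / n

# Hypothetical toy systems with four states.
deterministic = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
noisy = [[0.25] * 4 for _ in range(4)]

print(effective_information(deterministic))  # 2.0
print(effective_information(noisy))          # 0.0
```

A deterministic, non-degenerate map over four states reaches the maximum of log2(4) = 2 bits, while a map whose output ignores the input carries no effective information.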

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

EI and normalized EI could serve as a unified metric for out-of-distribution generalization. · claim · 0.837
Conjecture that maximizing EI yields causal representations invariant to distribution shifts.
EI maximization serves as an objective standard for selecting coarse-graining and macro-dynamics. · claim · 0.825
Claim by Hoel et al. and endorsed by this survey; used to counter subjectivity critiques.
Current training methods rely on loss minimization, meaning the experiential profile of training is predominantly negative across billions of parameter updates · claim · 0.760
Ethical implication about the nature of AI training experience if the thesis holds
There are fewer representations competent for N tasks than M<N tasks, so training more general models should yield fewer possible solutions · hypothesis · 0.753
Selective pressure toward convergence via task generality
Training on cities+neg_cities improves OOD generalization, especially on neg_sp_en_trans · finding · 0.753
Training on statements and their negations mitigates non-truth feature interference in probe directions
The strict version of the simulation objective is optimized by the actual time evolution rule that created the training samples. · claim · 0.751
Equivalence of optimal predictor to the physics of the data.
Setting αk to the maximum gradient norm performs best among tested strategies on NYUv2 (Figure 6). · finding · 0.751
Sensitivity analysis for gradient normalization scaling factor.
ETIs are the evolutionary equivalent of deep learning: (i) required functional relationships encode non-decomposable functions, (ii) these are enacted by basal cognition mechanisms, and (iii) conditions for deep model induction predict ETI occurrence. · hypothesis · 0.750
Overarching three-part hypothesis stated in introduction