finding
active
finding:gene-ontology-prediction-8-4-auc-improvement-with-unsupervised-autoencoder-and-covariance-pooling-embeddings
Gene Ontology prediction: +8.4% AUC improvement with unsupervised autoencoder and covariance pooling embeddings
Empirical result: covariance pooling combined with unsupervised autoencoder embeddings improves Gene Ontology prediction AUC by 8.4% over mean pooling.
Source paper
extracted_from · (2026) · Dooms, Thomas · Wang, Nicholas K. · Pearce, Michael T.
Neighborhood — ranked by edge-count
Claims (2)
claim
- Specific interpretive claim about what covariance pooling captures: the pairwise co-activation patterns across features that are invisible to mean pooling.
- Core interpretive claim generalizing beyond genomics; argues mean pooling discards information present in covariance.
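The two claims above can be illustrated with a minimal sketch. It assumes each sequence is a list of per-token feature vectors and that covariance pooling means the flattened upper triangle of the feature covariance matrix; the function names and toy inputs are hypothetical and not taken from the source paper, whose exact formulation may differ.

```python
def mean_pool(tokens):
    """First-order summary: the per-feature average over all tokens."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def covariance_pool(tokens):
    """Second-order summary: upper triangle of the feature covariance matrix.

    Assumption: population covariance (divide by n), flattened row by row.
    """
    n = len(tokens)
    mu = mean_pool(tokens)
    d = len(mu)
    centered = [[v - m for v, m in zip(row, mu)] for row in tokens]
    cov = [
        [sum(r[i] * r[j] for r in centered) / n for j in range(d)]
        for i in range(d)
    ]
    # The matrix is symmetric, so the upper triangle holds all of it.
    return [cov[i][j] for i in range(d) for j in range(i, d)]


# Toy inputs (hypothetical): two features observed over two tokens.
co_active = [[1.0, 1.0], [0.0, 0.0]]    # features fire together
alternating = [[1.0, 0.0], [0.0, 1.0]]  # features fire on different tokens

print(mean_pool(co_active), mean_pool(alternating))
print(covariance_pool(co_active), covariance_pool(alternating))
```

Both toy sequences have the same mean vector, so mean pooling cannot tell them apart, while the off-diagonal covariance entry changes sign between them. That off-diagonal entry is the pairwise co-activation information the claims refer to.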
Communities (3)
community
- Explores geometry of activation/behavior manifolds to enable selective, non-destructive concept interventions.
- Using second-order statistics to compress activation patterns while preserving feature co-occurrence structure, tested on genomic prediction tasks without large labeled datasets.
- Replaces mean pooling with second-order statistics, achieving large R² and AUC gains on genomic tasks.
Concepts (1)
concept
- Second evaluated task showing +8.4% AUC improvement with covariance pooling and unsupervised autoencoder embeddings.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Covariance pooling achieves +52.9% R² improvement over mean pooling on Genomic Track Prediction. · finding · 0.765 · Primary empirical result demonstrating practical utility of covariance pooling method.
- Core insight: reconstruction objective combined with appropriate initialization and KL regularization produces human-interpretable explanations as emergent property.
- Demonstrates information integration in evolutionary systems with system-level selection
- Claim linking the indirect genotype-phenotype mapping to robustness and open-endedness.
- Central claim of the paper, supported by detailed feature analysis, human evaluation, automated interpretability of activations, and automated interpretability of logit weights
- Measures how much of the MLP layer's function is explained by the learned features
- Analysis of GRN models shows they can perform several kinds of learning, supporting the view of cellular networks as agents on a cognitive continuum.
- Central claim of the paper: the method scales to state-of-the-art transformers.
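The selection rule for this list (cosine ≥ 0.65, no typed edge) could be sketched as follows. This is only an illustration under stated assumptions: that each entity has an embedding vector and that typed neighbors are known by name. The function names, the toy vectors, and the entity names are hypothetical; the system that produced this page may work differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_candidates(query, embeddings, typed_neighbors, threshold=0.65):
    """Entities at or above the threshold that have no typed edge to the query."""
    scored = [
        (name, cosine(query, vec))
        for name, vec in embeddings.items()
        if name not in typed_neighbors  # skip entities already linked by a typed edge
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings, not real entity vectors.
embeddings = {
    "linked-claim": [1.0, 0.0],   # similar, but already has a typed edge
    "unrelated": [0.0, 1.0],      # orthogonal, below the threshold
    "near-duplicate": [1.0, 1.0], # similar and unlinked
}
candidates = similarity_candidates([1.0, 0.0], embeddings, {"linked-claim"})
print(candidates)
```

Only the unlinked entity above the threshold survives the filter, which matches the page's description of these entries as candidates for new edges or unrecognized duplicates.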