community
active
leiden_hybrid_concepts
label: haiku
community:leiden_hybrid_concepts-run4-c7-c5
Covariance pooling for high-dimensional genomic embeddings
Uses second-order statistics to compress activation patterns while preserving feature co-occurrence structure; tested on genomic prediction tasks without large labeled datasets.
7 members.
Drawn from 2 sources
The papers/notes whose extracted claims & findings make up this cluster.
Bridges (2)
Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.
Findings (4)
- Covariance pooling achieves +52.9% R² improvement over mean pooling on Genomic Track Prediction. Primary empirical result demonstrating practical utility of covariance pooling method.
- Covariance pooling compresses gigabytes of activations into compact stable embeddings without large labeled datasets. Practical finding: the method produces compact fixed-length representations from large volumes of token activations without requiring supervised labels.
- Gene Ontology prediction: +8.4% AUC improvement with unsupervised autoencoder and covariance pooling embeddings. Empirical result: covariance pooling combined with unsupervised autoencoder embeddings improves Gene Ontology prediction AUC by 8.4% over mean pooling.
- Geometry-behavior correlate robust to pooling strategy, distance metric, and frozen encoder. Robustness checks confirm sign stability.
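The cluster does not spell out the pooling formulas, so the following uses the standard textbook definitions as an assumption rather than the sources' exact method. The symbols $x_t$ (activation vector of token $t$), $T$ (number of tokens), and $d$ (feature dimension) are introduced here for illustration only. It shows why the pooled representation has a fixed length regardless of how many tokens go in:

```latex
\begin{align}
  \mu    &= \frac{1}{T} \sum_{t=1}^{T} x_t \in \mathbb{R}^{d}
         && \text{(mean pooling: first moment)} \\
  \Sigma &= \frac{1}{T} \sum_{t=1}^{T} (x_t - \mu)(x_t - \mu)^{\top} \in \mathbb{R}^{d \times d}
         && \text{(covariance pooling: second moment)}
\end{align}
% \Sigma is symmetric, so it has d(d+1)/2 distinct entries, independent of T.
```

Under these definitions the embedding size depends only on $d$, not on $T$, which is consistent with the finding that large volumes of token activations reduce to a compact fixed-length representation.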
Claims (3)
- Covariance pooling could generalize beyond genomics as a general-purpose replacement for mean pooling. Authors' suggestion that the second-moment preservation principle applies broadly, not just to genomic foundation models.
- Covariance pooling preserves joint activation structure (feature co-occurrence) that mean pooling discards. Specific interpretive claim about what covariance pooling captures: the pairwise co-activation patterns across features that are invisible to mean pooling.
- Second moments preserve structure that first moments destroy. Core interpretive claim generalizing beyond genomics; argues mean pooling discards information present in covariance.
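The claims above can be illustrated with a minimal NumPy sketch. It is a toy example under stated assumptions, not the sources' implementation: the function names `mean_pool` and `covariance_pool`, the population (1/T) normalization, and the two-feature toy activations are all hypothetical. It builds two token sequences whose features co-activate in opposite ways yet share the same mean, so mean pooling cannot tell them apart while covariance pooling can.

```python
import numpy as np


def mean_pool(acts: np.ndarray) -> np.ndarray:
    """First-moment pooling: average over the token axis. acts has shape (T, d)."""
    return acts.mean(axis=0)


def covariance_pool(acts: np.ndarray) -> np.ndarray:
    """Second-moment pooling (assumed form): feature covariance over tokens,
    flattened to its upper triangle, giving a vector of length d * (d + 1) / 2."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / acts.shape[0]  # (d, d), population normalization
    upper = np.triu_indices(cov.shape[0])        # symmetric, so keep the upper triangle
    return cov[upper]


# Toy activations (hypothetical): 4 tokens, 2 features.
# In `a` the two features rise and fall together; in `b` they move in opposition.
a = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
b = np.array([[1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]])

same_mean = np.allclose(mean_pool(a), mean_pool(b))
same_cov = np.allclose(covariance_pool(a), covariance_pool(b))

print("mean pooling identical:      ", same_mean)
print("covariance pooling identical:", same_cov)
```

Both sequences have a zero mean vector, so the difference shows up only in the off-diagonal covariance entry, which carries the sign of the co-activation. Keeping just the upper triangle is one design choice for getting a fixed-length vector; other flattenings or normalizations would work as well.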