finding

active

finding:covariance-pooling-compresses-gigabytes-of-activations-into-compact-stable-embeddings-without-large-labeled-datasets

Covariance pooling compresses gigabytes of activations into compact stable embeddings without large labeled datasets

Practical finding: the method produces compact fixed-length representations from large volumes of token activations without requiring supervised labels.
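As a rough illustration of how second-order pooling can reduce an arbitrarily long stream of token activations to one fixed-length vector, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `covariance_pool`, the chunked accumulation, the population (divide-by-N) normalization, and the upper-triangle flattening are all assumptions made for this example.

```python
import numpy as np


def covariance_pool(chunks, dim):
    """Pool an iterable of (tokens, dim) activation chunks into one vector.

    Hypothetical helper: accumulates first and second moments chunk by chunk,
    so the full activation matrix never has to sit in memory at once.
    """
    n = 0
    total = np.zeros(dim)          # running sum of activations
    outer = np.zeros((dim, dim))   # running sum of outer products
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=np.float64)
        n += chunk.shape[0]
        total += chunk.sum(axis=0)
        outer += chunk.T @ chunk
    mean = total / n
    # Population covariance: E[x x^T] - E[x] E[x]^T (assumed normalization).
    cov = outer / n - np.outer(mean, mean)
    # The matrix is symmetric, so the upper triangle holds all the information.
    return cov[np.triu_indices(dim)]


# Synthetic stand-in for token activations: 1000 tokens, 8 features.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 8))
emb = covariance_pool(np.array_split(acts, 10), dim=8)
```

The output length is dim * (dim + 1) / 2 regardless of how many tokens are streamed in, and no labels are involved at any step. Mean pooling would keep only `mean`; the covariance entries are what carry the feature co-occurrence structure.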

Source paper

extracted_from

Covariance-based Sequence Pooling

(2026) · Dooms, Thomas · Wang, Nicholas K. · Pearce, Michael T.

Neighborhood — ranked by edge-count

Claims (1)

claim

Covariance pooling could generalize beyond genomics as a general-purpose replacement for mean pooling
supports
Authors' suggestion that the second-moment preservation principle applies broadly, not just to genomic foundation models.

Communities (3)

community

Manifold-aware concept steering in neural representations
members_of
Explores geometry of activation/behavior manifolds to enable selective, non-destructive concept interventions.
Covariance pooling for high-dimensional genomic embeddings
members_of
Using second-order statistics to compress activation patterns while preserving feature co-occurrence structure, tested on genomic prediction tasks without large labeled datasets.
Covariance pooling for genomic embeddings
members_of
Replaces mean pooling with second-order statistics, achieving large R² and AUC gains on genomic tasks.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Covariance pooling preserves joint activation structure (feature co-occurrence) that mean pooling discards · claim · 0.827
Specific interpretive claim about what covariance pooling captures: the pairwise co-activation patterns across features that are invisible to mean pooling.
Covariance Pooling · method · 0.790
Novel aggregation technique replacing mean pooling; preserves joint activation structure (feature co-occurrence) in token embeddings.
Can covariance pooling generalize beyond genomics to other domains? · question · 0.786
Open question implied by the claim that the method could generalize; empirical validation beyond genomics is not provided in this paper.
Covariance pooling achieves +52.9% R² improvement over mean pooling on Genomic Track Prediction. · finding · 0.785
Primary empirical result demonstrating practical utility of covariance pooling method.
Training models with sparse activations cannot fully prevent polysemanticity because cross-entropy loss creates incentives for polysemantic neurons even without superposition · claim · 0.747
Author's conclusion after extensive investigation of architectural approaches to monosemanticity.
Larger S_max correlates with smaller θ50 across backbones in E3 (negative association consistent across pooling and metric choices) · finding · 0.737
Key geometry-to-behavior bridge finding in E3; robust to pooling choice, cosine vs. L2, and frozen external encoder
On a Model of Associative Memory with Huge Storage Capacity (Demircigil et al., 2017) · concept · 0.735
Analytical result showing exponential power activation allows memory storage scaling as 2^(N/2); cited in context of Hopfield scaling.
2D projections of activations show clearly separable clusters for F0-F2 and A1 at layer 25, but increasingly entangled activations for F4-F5 and A2-A3. · finding · 0.731
Visual geometric evidence for the fundamental entanglement of true/false activations in harder tasks.