concept
active
concept:feature-density
Feature Density
Fraction of training tokens on which a given feature has nonzero activation; used as proxy metric for autoencoder quality
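As a minimal sketch of the definition above, feature density can be computed from a matrix of autoencoder activations. The `(n_tokens, n_features)` layout, the function name `feature_density`, and the toy values are assumptions made for illustration; they are not taken from any particular codebase.

```python
import numpy as np

def feature_density(activations: np.ndarray) -> np.ndarray:
    """Return, per feature, the fraction of tokens with nonzero activation.

    `activations` is assumed to have shape (n_tokens, n_features).
    """
    return (activations != 0).mean(axis=0)

# Toy data (hypothetical): 4 tokens, 3 features.
acts = np.array([
    [0.0, 1.2, 0.0],
    [0.0, 0.0, 0.0],
    [0.5, 0.3, 0.0],
    [0.0, 0.7, 0.0],
])
density = feature_density(acts)  # one density value per feature
```

In this toy example the third feature never fires, so its density is exactly zero.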
Neighborhood — ranked by edge-count
Concepts (3)
- Dead Neurons · associated_with · Autoencoder neurons that never activate on any datapoint during training; addressed via neuron resampling
- Ultralow Density Cluster · associated_with · Cluster of autoencoder features with extremely low activation density (~1e-7); these are generally not interpretable and appear to be training artifacts
- High Density Feature Cluster · associated_with · Mode in the feature density histogram around 1e-5, corresponding to interpretable features; contrasted with the ultralow density cluster
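The three neighbors above can be read as regions of the log-scale density histogram. The sketch below counts dead features (density exactly zero) and splits the rest at a cutoff. The cutoff `split=1e-6` is an assumption chosen only because it lies between the ~1e-7 and ~1e-5 modes described above; the function name and sample values are likewise hypothetical.

```python
import numpy as np

def summarize_densities(density, split=1e-6, bins=20):
    """Count dead, ultralow-density, and high-density features.

    `split` is an assumed cutoff between the two clusters, not a
    value prescribed by the source.
    """
    density = np.asarray(density, dtype=float)
    dead = density == 0            # features that never fired
    alive = density[~dead]
    # Histogram in log10 space, since densities span several decades.
    counts, edges = np.histogram(np.log10(alive), bins=bins)
    return {
        "n_dead": int(dead.sum()),
        "n_ultralow": int((alive < split).sum()),
        "n_high": int((alive >= split).sum()),
        "hist_counts": counts,
        "hist_edges": edges,
    }

# Hypothetical densities for six features.
summary = summarize_densities([0.0, 1e-7, 2e-7, 1e-5, 3e-5, 1e-4])
```

In practice the cutoff would be read off the histogram itself rather than fixed in advance, since the position of the two modes depends on the training run.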
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
- Log-scale histogram of feature firing rates used as proxy for autoencoder quality during hyperparameter tuning
- Approximate posterior probability distribution embodied in organism's internal states; organism's best guess about causes of sensations
- The extracted set of sparse interpretable features from model embeddings via SAEs
- The degree to which centers are packed and overlapped, contributing to the life of the whole.
- Time-dependent densities of states and parameters obtained by maximizing free energy.
- Research thread within About Blank concerning the structure and relational properties of neural network feature representations; covariance pooling tangentially supports this thread.
- Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work
- Domain of techniques for constructing informative features from raw data; covariance pooling is a feature engineering method for token sequences.