Feature Density

Fraction of training tokens on which a given feature has a nonzero activation; used as a proxy metric for autoencoder quality
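The definition above can be illustrated with a minimal sketch. It assumes the activations are available as a small token-by-feature table; the function name `feature_density` and the toy data are hypothetical and are not taken from any particular library.

```python
from typing import List, Sequence


def feature_density(activations: Sequence[Sequence[float]]) -> List[float]:
    """Return, for each feature, the fraction of tokens where it is nonzero.

    `activations` is assumed to be an (n_tokens x n_features) table of
    autoencoder feature activations, one row per token.
    """
    n_tokens = len(activations)
    if n_tokens == 0:
        raise ValueError("need at least one token")
    n_features = len(activations[0])
    counts = [0] * n_features
    for row in activations:
        for j, value in enumerate(row):
            if value != 0:
                counts[j] += 1  # feature j fired on this token
    return [c / n_tokens for c in counts]


# Toy example: 4 tokens, 3 features (illustrative values only).
acts = [
    [0.0, 1.2, 0.0],
    [0.0, 0.3, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.7, 0.0],
]
densities = feature_density(acts)
print(densities)
```

In this toy table the third feature never fires, so its density is zero. Real densities are estimated over a far larger token sample, since values near 1e-7 cannot be resolved from a handful of tokens.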

Neighborhood — ranked by edge-count

concept

Dead Neurons
associated_with
Autoencoder neurons that never activate on any datapoint during training; addressed via neuron resampling
Ultralow Density Cluster
associated_with
Cluster of autoencoder features with extremely low activation density (~1e-7) that are generally not interpretable and appear to be training artifacts
High Density Feature Cluster
associated_with
Mode in the feature density histogram around 1e-5, corresponding to interpretable features; contrasted with the ultralow density cluster
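The three associated entities above can be separated by density, as in the rough sketch below. The cut at 1e-6 between the ultralow cluster (~1e-7) and the interpretable mode (~1e-5) is an assumption chosen for illustration, as are the function names. "Dead" here means zero density on the sampled tokens, which only approximates never firing during training.

```python
import math
from typing import Dict, List


def bucket_by_density(densities: List[float], cut: float = 1e-6) -> Dict[str, List[int]]:
    """Group feature indices into dead / ultralow / high density buckets.

    `cut` is an assumed threshold, not a value given by the source.
    """
    buckets: Dict[str, List[int]] = {"dead": [], "ultralow": [], "high": []}
    for i, d in enumerate(densities):
        if d == 0.0:
            buckets["dead"].append(i)      # never fired on the sample
        elif d < cut:
            buckets["ultralow"].append(i)  # near the ~1e-7 cluster
        else:
            buckets["high"].append(i)      # near the ~1e-5 mode or above
    return buckets


def log10_histogram(densities: List[float], lo: int = -8, hi: int = 0) -> Dict[int, int]:
    """Count live features per decade of density (a log-scale histogram)."""
    counts = {exponent: 0 for exponent in range(lo, hi)}
    for d in densities:
        if d > 0:
            exponent = math.floor(math.log10(d))
            exponent = min(max(exponent, lo), hi - 1)  # clamp into range
            counts[exponent] += 1
    return counts


# Illustrative densities: one dead, one ultralow, two near the 1e-5 mode.
example = [0.0, 2e-7, 3e-5, 4e-5]
buckets = bucket_by_density(example)
hist = log10_histogram(example)
print(buckets)
print({k: v for k, v in hist.items() if v})
```

On a real autoencoder the two modes would be read off the histogram rather than fixed in advance, so the cut is best treated as a tunable parameter.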

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Feature Density Histogram · method · 0.863
Log-scale histogram of feature firing rates used as proxy for autoencoder quality during hyperparameter tuning
Recognition Density · concept · 0.804
Approximate posterior probability distribution embodied in organism's internal states; organism's best guess about causes of sensations
Sparse Feature Dictionary · concept · 0.797
The extracted set of sparse interpretable features from model embeddings via SAEs
Density of Centers · concept · 0.760
The degree to which centers are packed and overlapped, contributing to the life of the whole.
conditional densities · concept · 0.759
Time-dependent densities of states and parameters obtained by maximizing free energy.
Geometry of features · concept · 0.755
Research thread within About Blank concerning the structure and relational properties of neural network feature representations; covariance pooling tangentially supports this thread.
Feature Sparsity · concept · 0.755
Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work
Feature engineering · concept · 0.752
Domain of techniques for constructing informative features from raw data; covariance pooling is a feature engineering method for token sequences.