concept
active
concept:feature-density

Feature Density

Fraction of training tokens on which a given feature has a nonzero activation; used as a proxy metric for autoencoder quality
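
As a rough illustration of the definition above, the density of each feature can be computed by counting nonzero activations per feature and dividing by the number of tokens. This is a minimal sketch, not a reference implementation: the function name `feature_density` and the `(n_tokens, n_features)` array layout are assumptions made for the example, and the activation values are made up.

```python
import numpy as np

def feature_density(activations: np.ndarray) -> np.ndarray:
    """Fraction of tokens on which each feature has a nonzero activation.

    activations: array of shape (n_tokens, n_features). This layout is an
    assumption of the sketch, not something the source specifies.
    """
    if activations.ndim != 2:
        raise ValueError("expected shape (n_tokens, n_features)")
    n_tokens = activations.shape[0]
    # Count, per feature (column), how many tokens have a nonzero activation.
    return np.count_nonzero(activations, axis=0) / n_tokens

# Toy data: 4 tokens, 3 features (made-up values for illustration only).
acts = np.array([
    [0.0, 1.2, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.1, 0.0],
])
density = feature_density(acts)  # feature 0 never fires, feature 1 fires on 3 of 4 tokens
```

In practice the counts would be accumulated over many batches of training tokens rather than one in-memory array, since densities near 1e-7 only become measurable over tens of millions of tokens.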

Neighborhood — ranked by edge-count

Concepts (3)

concept
  • Dead Neurons
    associated_with
    Autoencoder neurons that never activate on any datapoint during training; addressed via neuron resampling
  • Cluster of autoencoder features with extremely low activation density (~1e-7) that are generally not interpretable and appear to be training artifacts
  • Mode in feature density histogram around 1e-5 corresponding to interpretable features, contrasted with ultralow density cluster
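
The three neighbors above (dead neurons, the ultralow density cluster near 1e-7, and the mode near 1e-5) can be told apart on a log-scale view of per-feature density. The sketch below is one hedged way to do that: the function names, the bin range, and the `ultralow_cutoff` of 1e-6 are all hypothetical choices for illustration (the cutoff is simply placed between the two modes mentioned above), and the density values are made up.

```python
import numpy as np

def log_density_histogram(density, bins=10, lo=-10.0, hi=0.0):
    """Histogram of log10(density) over features that fired at least once.

    Features with zero density cannot be placed on a log axis, so they are
    counted separately and returned as n_dead.
    """
    density = np.asarray(density, dtype=float)
    alive = density > 0
    counts, edges = np.histogram(np.log10(density[alive]), bins=bins, range=(lo, hi))
    return counts, edges, int((~alive).sum())

def bucket_features(density, ultralow_cutoff=1e-6):
    """Label each feature 'dead', 'ultralow', or 'main'.

    ultralow_cutoff is an assumed threshold, not a value from the source.
    """
    density = np.asarray(density, dtype=float)
    labels = np.full(density.shape, "main", dtype=object)
    labels[density < ultralow_cutoff] = "ultralow"
    labels[density == 0] = "dead"  # applied last so zeros are not left as 'ultralow'
    return labels

# Made-up densities: one dead feature, two ultralow, three in the main mode or above.
density = np.array([0.0, 1e-7, 2e-7, 1e-5, 3e-5, 1e-3])
counts, edges, n_dead = log_density_histogram(density)
labels = bucket_features(density)
```

A fixed cutoff is the simplest option; when the two modes are well separated, the valley between them in the histogram could be used to choose the threshold instead.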

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Log-scale histogram of feature firing rates used as proxy for autoencoder quality during hyperparameter tuning
  • Approximate posterior probability distribution embodied in an organism's internal states; the organism's best guess about the causes of its sensations
  • The extracted set of sparse interpretable features from model embeddings via SAEs
  • Density of Centers · concept · 0.760
    The degree to which centers are packed and overlapped, contributing to the life of the whole.
  • Time-dependent densities of states and parameters obtained by maximizing free energy.
  • Research thread within About Blank concerning the structure and relational properties of neural network feature representations; covariance pooling tangentially supports this thread.
  • Feature Sparsity · concept · 0.755
    Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work
  • Domain of techniques for constructing informative features from raw data; covariance pooling is a feature engineering method for token sequences.