concept
active
concept:pre-encoder-bias
Pre-Encoder Bias
Architectural modification that subtracts a learned bias from the autoencoder's inputs before encoding. The bias is initialized to the geometric median of the dataset and is reported to improve autoencoder performance.
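The description above can be illustrated with a minimal sketch. It assumes NumPy, a ReLU encoder, and Weiszfeld iterations as one common way to approximate a geometric median; the names `geometric_median`, `encode`, `b_pre`, `W_enc`, and `b_enc` are hypothetical and do not come from the entry, and the toy data is random.

```python
import numpy as np

def geometric_median(X, n_iter=100, eps=1e-8):
    """Approximate the geometric median of the rows of X (Weiszfeld iterations)."""
    median = X.mean(axis=0)  # start from the centroid
    for _ in range(n_iter):
        dist = np.linalg.norm(X - median, axis=1)
        weights = 1.0 / np.maximum(dist, eps)  # guard against division by zero
        new_median = (weights[:, None] * X).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < eps:
            return new_median
        median = new_median
    return median

def encode(X, W_enc, b_enc, b_pre):
    """Subtract the pre-encoder bias, then apply a linear map and ReLU (assumed)."""
    return np.maximum((X - b_pre) @ W_enc + b_enc, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, size=(256, 8))        # toy activations (hypothetical data)
b_pre = geometric_median(X)                   # initial value only; learned afterwards
W_enc = rng.normal(scale=0.1, size=(8, 32))   # hypothetical encoder weights
b_enc = np.zeros(32)
features = encode(X, W_enc, b_enc, b_pre)
```

In this sketch `b_pre` only receives its initial value; in training it would be updated by gradient descent together with the other parameters.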
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- Sparse Autoencoder for Dictionary Learning · associated_with · Primary method introduced: trains a one-hidden-layer MLP with L1 sparsity penalty to decompose model activations into overcomplete feature dictionaries
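As a rough illustration of the framework entry above, the objective of a one-hidden-layer autoencoder with an L1 sparsity penalty might be sketched as follows. The ReLU nonlinearity, the squared-error reconstruction term, the coefficient `l1_coeff`, and all variable names are assumptions made for this example, not details taken from the entry.

```python
import numpy as np

def sae_loss(X, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty on the hidden activations."""
    hidden = np.maximum(X @ W_enc + b_enc, 0.0)      # one hidden layer, ReLU (assumed)
    X_hat = hidden @ W_dec + b_dec                   # linear decoder
    mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))  # per-example squared error
    l1 = np.mean(np.sum(np.abs(hidden), axis=1))     # sparsity penalty
    return mse + l1_coeff * l1

rng = np.random.default_rng(0)
d_model, d_dict = 8, 32                              # overcomplete: d_dict > d_model
X = rng.normal(size=(64, d_model))                   # toy activations (hypothetical data)
W_enc = rng.normal(scale=0.1, size=(d_model, d_dict))
W_dec = rng.normal(scale=0.1, size=(d_dict, d_model))
loss = sae_loss(X, W_enc, np.zeros(d_dict), W_dec, np.zeros(d_model))
```

The hidden layer is wider than the input (`d_dict > d_model`), which is what makes the learned dictionary overcomplete; the L1 term pushes most hidden activations toward zero for any given input.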
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Features related to gender, racial, ethnic biases, slurs, and hate speech.
- Assumptions or preferences (e.g., parsimony) that determine how a learning system generalizes beyond training data
- Problem cited as a limitation of current LLMs; PRH predicts larger models should amplify bias less
- Neural network architecture that learns compressed representations; SOHMs are functionally equivalent.
- Expected prevalence of patterns (e.g., base-10 arithmetic) in pretraining corpora, influencing ρd and dr.
- Barrett and Simmons's neuroanatomical model of interoceptive prediction error and affect generation
- The tendency of deep networks to implicitly favor simpler solutions that fit the data, driving convergence