concept
active
concept:softmax-bottleneck
Softmax Bottleneck
Failure mode for output-surjectivity: an LLM may be unable to predict every token because its output layer is rank-constrained.
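The rank constraint can be illustrated with a toy sketch. It assumes a bias-free linear output layer with hidden size 2 and a five-token vocabulary; the matrix `W`, the sampled hidden states `H`, and the helper `logit_rank_and_reachable` are hypothetical names chosen for illustration, not taken from the source paper. Because the logits are `H @ W.T`, their rank cannot exceed the hidden size, and a token whose output embedding lies strictly inside the convex hull of the others can never win the argmax.

```python
import numpy as np

def logit_rank_and_reachable(W, H):
    """Return (rank of the logit matrix, set of token ids that win an argmax)."""
    logits = H @ W.T  # shape: (n_contexts, vocab_size)
    rank = int(np.linalg.matrix_rank(logits))
    reachable = set(np.argmax(logits, axis=1).tolist())
    return rank, reachable

# Hypothetical toy output embeddings: vocab_size = 5, hidden size d = 2, no bias.
# Token 4 has the zero embedding, which sits inside the convex hull of tokens 0-3.
W = np.array([
    [1.0, 0.0],
    [-1.0, 0.0],
    [0.0, 1.0],
    [0.0, -1.0],
    [0.0, 0.0],
])

# Sample many hidden states to probe which tokens can be the top prediction.
rng = np.random.default_rng(0)
H = rng.normal(size=(1000, 2))

rank, reachable = logit_rank_and_reachable(W, H)
print("logit matrix rank:", rank)            # bounded by the hidden size d = 2
print("tokens that win argmax:", sorted(reachable))
```

In this setup token 4 always has logit 0, while for any nonzero hidden state at least one of tokens 0 to 3 has a strictly positive logit, so token 4 is never the top prediction. That is the sense in which such a layer is not output-surjective.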
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (1)
- Strict Output-Surjectivity · contradicts · Assumption that every output class can be produced by the DNN in each layer; key condition for Theorem 1
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Organisation that hosted the Holistic Intelligence unconference where the paper's ideas originated
- Neuronal dynamics computed from free energy gradients; interpreted as average firing rate of neural populations.
- Population structure mechanism implementing genetic assortment; cited as example of individuation mechanism in multicellularity.
- Compression-prediction trade-off; NIS encodes micro-states through an information bottleneck.
- A lower-dimensional activation that is the only pathway for information between higher-dimensional activations; e.g. the residual stream between MLP layers
- Policies assigned probability via softmax of expected free energy; enables self-evidencing behavior.
- Implementation detail weighting softmax by log(n_memories) to prevent down-weighting of attention values as memory set grows.
- Selecting policies using a softmax (normalized exponential) function of negative expected free energy.