Autoregressive Sampling

The mechanism by which LLMs generate text: repeatedly drawing a token from the next-token distribution and appending it to the context, so that each sampled token conditions every later draw.
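The loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `VOCAB`, `toy_logits`, and `sample_autoregressively` are hypothetical names, and `toy_logits` is a hand-written stand-in for a real model's forward pass, not any actual LLM API.

```python
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def toy_logits(context):
    """Stand-in for a model forward pass: returns one score per vocab token.

    Assumption: favours the token that follows the last one in VOCAB and
    makes <eos> more likely as the context grows.
    """
    last = VOCAB.index(context[-1]) if context else -1
    logits = [0.0] * len(VOCAB)
    logits[(last + 1) % len(VOCAB)] = 2.0
    logits[VOCAB.index("<eos>")] += 0.3 * len(context)
    return logits

def softmax(logits, temperature=1.0):
    """Turn scores into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_autoregressively(prompt, max_new_tokens=10, temperature=1.0, rng=None):
    """Draw one token at a time and append it to the context."""
    rng = rng or random.Random()
    context = list(prompt)
    for _ in range(max_new_tokens):
        # 1. Compute the next-token distribution from the current context.
        probs = softmax(toy_logits(context), temperature)
        # 2. Draw a single token from that distribution.
        token = rng.choices(VOCAB, weights=probs, k=1)[0]
        # 3. Append it, so it conditions every subsequent draw.
        context.append(token)
        if token == "<eos>":
            break
    return context
```

Calling `sample_autoregressively(["the"], rng=random.Random(0))` returns the prompt followed by the sampled tokens. Lowering `temperature` concentrates probability on the highest-scoring token, while raising it flattens the distribution.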

Neighborhood — ranked by edge-count

concept

Large Language Models (LLMs)
implements
Transformer-based models like GPT-4, LaMDA, PaLM; assessed for GWT indicators.

method

autoregressive modeling
related_to
Statistical technique in which each output is regressed on previous values; used in language generation.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Autoregressive models · framework · 0.864
Second model system studied; used to show why flat autoregressive LLMs struggle with long-range coherence.
autoregressive recurrence · concept · 0.834
Transformers are recurrent through autoregression because the K/V stream provides horizontal information flow across positions, even though each forward pass is feedforward.
autoregressive parallelization · concept · 0.824
The training parallelization technique with which latent methods are difficult to train.
autoregressive persistence · concept · 0.819
Baseline persistence of any probe direction arising from the autoregressive nature of LLMs, not specific to emotion content.
Autoregressive Language Modeling · concept · 0.806
Training objective interpretable as optimizing a diverse set of tasks; thus subject to multitask scaling convergence pressures.
Thompson Sampling · method · 0.782
A Bayesian exploration strategy that samples from the posterior distribution over model parameters to decide actions.
Rejection sampling · method · 0.773
A technique to filter model outputs; Redwood Research's project mentioned.
Activation Interval Sampling · method · 0.767
Dividing the feature activation spectrum into 11 evenly spaced intervals and sampling uniformly to evaluate monosemanticity across activation levels.