method
active
method:thompson-sampling
Thompson Sampling
A Bayesian exploration strategy that draws model parameters from the posterior distribution and then chooses the action that is optimal under the sampled parameters.
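As an illustration of the definition above, here is a minimal sketch of Thompson sampling on a simulated Bernoulli bandit. The Beta-Bernoulli setting, the uniform Beta(1, 1) prior, and the names `thompson_sampling`, `true_probs`, and `n_rounds` are assumptions chosen for the example; they do not come from this entry or its sources.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical per-arm success probabilities (unknown to the agent).
    Returns (pulls, alpha, beta), where alpha/beta are the posterior parameters.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Assumed prior: Beta(1, 1), i.e. uniform over each arm's success probability.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # 1. Sample one plausible success probability per arm from its posterior.
        sampled = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # 2. Act greedily with respect to the sampled parameters.
        arm = max(range(n_arms), key=lambda a: sampled[a])
        # 3. Observe a simulated Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # 4. Conjugate update: successes increment alpha, failures increment beta.
        if reward:
            alpha[arm] += 1
        else:
            beta[arm] += 1
        pulls[arm] += 1

    return pulls, alpha, beta


if __name__ == "__main__":
    pulls, alpha, beta = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Exploration here comes entirely from the randomness of the posterior draw: arms with few observations have wide posteriors and are occasionally sampled as best, while well-explored poor arms are chosen less and less often.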
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- RL variant that maintains beliefs over the environment model; compared to active inference using Thompson sampling.
Concepts (1)
concept
- Reinforcement learning (RL) · associated_with · Machine learning paradigm where agents learn to maximize cumulative reward through interaction.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The mechanism by which LLMs generate text: drawing a token from the next-token distribution and appending it to context repeatedly
- A technique to filter model outputs; Redwood Research's project mentioned.
- Procedure for sampling 64 random nonnegative combinations of cone basis vectors to evaluate the full cone distribution
- Dividing feature activation spectrum into 11 evenly-spaced intervals and sampling uniformly to evaluate monosemanticity across activation levels
- Temperature=0.8 sampled decoding for self-report; reduces collapse moderately but remains discrete and noisy
- Technique used to demonstrate that the self-prior captures visual–proprioceptive associations by recovering visual appearance from proprioception alone
- Human psychology method for repeated in-situ self-report; methodological inspiration for the paper's approach
- Setting a feature's value to its maximum observed value and sampling from the model to validate causal interpretations