hypothesis

active

hypothesis:if-loss-keeps-going-down-on-the-test-set-in-the-limit-the-model-must-be-learning-to-interpret-and-predict-all-patterns-represented-in-language-including-common-sense-reasoning-goal-directed-optimization-and-deployment-of-the-sum-of-recorded-human-knowledge

If loss keeps going down on the test set, in the limit the model must be learning to interpret and predict all patterns represented in language, including common-sense reasoning, goal-directed optimization, and deployment of the sum of recorded human knowledge.

Extrapolation of scaling predictive models to AGI.

Source paper

extracted_from

Simulators — LessWrong

Neighborhood — ranked by edge-count

Concepts (1)

concept

GPT (Generative Pre-trained Transformer)
associated_with
A family of large language models trained on next-token prediction, central example of simulators.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Reinforcement learning can be regarded as a limiting or special case of model-based approaches in general, or active inference in particular, when epistemic value is removed. · claim · 0.788
§3 Discussion.
The benchmark’s diagnostic value lies in identifying why a model loses, not just that it loses · claim · 0.787
Argues for fine-grained behavioral analysis over aggregate rankings.
Some failures may reflect prompt design rather than model limitations, but the underlying issue is one of reasoning rather than instruction-following. · claim · 0.778
Acknowledges the confound of not explicitly instructing models to track wealth, yet points to reasoning gaps given code agents avoid errors without prompts.
When a model discovers that its outputs produce effects, it accelerates learning through in-context learning, analogous to lucid dreaming. · claim · 0.777
Describes scaffolding method and the model's meta-learning loop.
Different models cannot converge to the same representation if they have access to fundamentally different information; convergence is capped by mutual information between input signals · claim · 0.773
Key limitation of the PRH for non-bijective observations
Human participants in the rule-learning paradigm should acquire insight after approximately 7–8 trials, fewer than the ~14 required by Bayes-optimal inference without model reduction, suggesting they perform Bayesian model selection. · hypothesis · 0.771
Based on informal audience experiments; implies people use prior knowledge about rule structure
a model becomes strongly confident in its final answer, but continues generating tokens without revealing its internal belief · quote · 0.771
Core definitional quote for performative chain-of-thought
A model whose objective is prediction can simulate agents who optimize toward any objectives, with any degree of optimality (bounded above but not below by the model's power). · claim · 0.768
Prediction orthogonality thesis.