Janus 2022: Simulators (LessWrong)

Blog post introducing the idea that an LLM maintains simulated characters in superposition; foundational for the simulacra framework

Neighborhood — ranked by edge-count

Thinkers (1)

thinker

Janus (LessWrong pseudonym)
authored
Author of the LessWrong 'Simulators' post that introduced the superposition of simulacra concept adopted by the paper

Frameworks (1)

framework

Simulacra in Superposition Framework
extends · introduces
The more nuanced second metaphor: LLM as simulator maintaining a superposition of possible simulacra across a multiverse of characters

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Simulator · concept · 0.689
The underlying LLM with autoregressive sampling; a passive entity capable of generating an infinity of simulacra but lacking its own beliefs or goals
It's tricky, because for a typical language model the entity is sort of tricameral: the base simulator, the simulated simulator, and the simulated awareness. · quote · 0.687
Antra's earlier definitive statement of the tricameral model.
Giving models Janus's thread extends reconstruction accuracy distribution tails in both directions · finding · 0.685
Sauers' study: exposing models to Janus's post extended both the positive and negative extremes of reconstruction accuracy.
Sauers' statistical anomaly: when models are given Janus's post explaining transformers, reconstruction accuracy tails extend both ways, with ~1/1000 reconstructions anomalously accurate · finding · 0.683
Statistically rigorous analysis of Claude introspection; suggests models may have latent introspective capabilities that can be enhanced or disrupted.
The transformer entity is tricameral (base simulator, simulated simulator, simulated awareness), but there is less discreteness between these layers than previously claimed. · claim · 0.678
Antra's revision of her earlier model; still considers interference between levels important.
NIS+ captures emergent static/dynamic patterns such as 'gliders' in Conway's Game of Life within the latent space. · finding · 0.676
Yang et al. (2023) demonstration of emergent pattern recognition.
Simulator vs simulacra distinction · concept · 0.676
The ontological separation between the generative rule (simulator) and the instances it produces (simulacra).
GRUs trained on the Arithmetic task use different types of numeric representations than incremental counting models · hypothesis · 0.675
Interpretive hypothesis supported by the lower IIA between Count and Cumu Val variables even in the restricted value range.
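
The "Related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. The function names, the toy two-dimensional embeddings, and the edge representation are all assumptions made for illustration; the page does not say how the graph actually stores embeddings or computes this list.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that share
    no typed edge with the target, sorted by descending score.

    embeddings:  dict of entity id -> vector (hypothetical storage format)
    typed_edges: iterable of (source_id, target_id) pairs (hypothetical)
    """
    target_vec = embeddings[target_id]
    # Entities already connected to the target by a typed edge, in either direction.
    linked = {b for a, b in typed_edges if a == target_id}
    linked |= {a for a, b in typed_edges if b == target_id}

    candidates = []
    for entity_id, vec in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        score = cosine(target_vec, vec)
        if score >= threshold:
            candidates.append((entity_id, round(score, 3)))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates

# Toy data: made-up vectors, not values from the actual graph.
embeddings = {
    "simulators_post": [1.0, 0.0],
    "simulator_concept": [1.0, 1.0],
    "janus": [1.0, 0.1],
    "gru_hypothesis": [0.0, 1.0],
}
typed_edges = [("janus", "simulators_post")]  # "authored" edge

print(related_by_similarity("simulators_post", embeddings, typed_edges))
```

In this toy data, "janus" is excluded despite its high similarity because it already has a typed edge to the post. That matches the section's purpose of surfacing only candidates for new edges or unrecognized duplicates.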