artifact
active
artifact:janus-information-flow-transformers-twitter-thread-sept-2025
Janus Information Flow Transformers (Twitter thread, Sept 2025)
Original thread by janus explaining transformer information highways and introspection capabilities, posted on X.
Neighborhood — ranked by edge-count
Papers (1)
paper
Thinkers (1)
thinker
- janus (@repligate) [authored]: Author of the thread on transformer information flow; researcher exploring AI and consciousness.
Frameworks (1)
framework
- Wolfram Causal Graph [cites]: A framework from Wolfram physics that views computation as a causal graph, with foliations (time-slices) specifying the order of computation.
Methods (1)
method
- KV caching [cites]: Caching of key and value vectors to avoid recomputation; also provides a mechanism for introspection of earlier computations.
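The caching idea above can be sketched in a few lines. This is a minimal single-layer, single-head toy, not the thread's own code: the projection matrices `W_Q`, `W_K`, `W_V`, the `KVCache` class, and the input vectors are all hypothetical. It shows that appending each position's key and value once, then reading them back at later steps, gives the same outputs as recomputing every key and value from scratch.

```python
import math

# Hypothetical 2x2 projection matrices, chosen only for illustration.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, -0.5], [0.5, 0.5]]
W_V = [[2.0, 0.0], [0.0, 2.0]]

def project(x, w):
    """Row vector x times matrix w (given as a list of rows)."""
    return [sum(xi * wi for xi, wi in zip(x, col)) for col in zip(*w)]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values):
    """Weighted sum of values, weighted by softmax of scaled q.k scores."""
    scale = math.sqrt(len(q))
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

def full_recompute(xs):
    """Output at the last position, recomputing K and V for all positions."""
    keys = [project(x, W_K) for x in xs]
    values = [project(x, W_V) for x in xs]
    return attend(project(xs[-1], W_Q), keys, values)

class KVCache:
    """Stores each position's key and value once; later steps only read them."""
    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, x):
        self.keys.append(project(x, W_K))
        self.values.append(project(x, W_V))
        return attend(project(x, W_Q), self.keys, self.values)

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy inputs, one per position
cache = KVCache()
cached_outputs = [cache.step(x) for x in xs]
recomputed_outputs = [full_recompute(xs[: t + 1]) for t in range(len(xs))]
```

In this one-layer toy the cached keys and values depend only on the inputs. In a deeper model they would be computed from intermediate states at each earlier position, which is the sense in which reading the cache gives later positions access to earlier computations.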
Concepts (10)
concept
- Residual Stream [cites]: Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.
- Introspection [cites]: The ability of a model to observe its own past internal states or computations; claimed to be architecturally permitted by transformers.
- Transformers are recurrent through autoregression because the K/V stream provides horizontal information flow across positions, even though each forward pass is feedforward.
- K/V Stream [cites]: Proposed pathway flowing across positions at each layer; carries key, value, and attention-weighted information horizontally.
- Interferometric Cognition [introduces]: Idea that redundant information paths create interference patterns, leading to memory and cognition experienced as interferometric and continuous.
- exponential path combinatorics [introduces]: The number of distinct paths information can travel from point A to B in a transformer is C(m+n, n), quickly exceeding the number of atoms in the universe.
- Process that uses Q, K, and V to compute a heat map over K and a weighted sum of V.
- K values [cites]: In attention, key vectors that advertise which future positions should look here.
- Q values [cites]: In attention, query vectors that ask 'where in the past should I look?' given the current state.
- V values [cites]: In attention, value vectors that carry the information future positions should receive.
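The Q, K, and V roles listed above can be made concrete with a small sketch. It assumes standard scaled dot-product attention for a single head; the `attend` helper and the toy vectors are hypothetical and only illustrate how the query scores each earlier key (the heat map) and how those weights mix the values.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Attention for one position over the earlier positions it can see.

    Returns (weights, output): weights is the heat map over the keys,
    output is the weighted sum of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (hypothetical): three earlier positions, 2-dimensional vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # what each position advertises
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]    # what each position hands over
query = [4.0, 0.0]                                 # what the current position asks for

weights, output = attend(query, keys, values)
```

Here the query aligns with the first and third keys, so those positions receive most of the weight and the output is dominated by their values.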
Claims (9)
claim
- Claim formalizing the Anima Labs idea that transformers are effectively recurrent due to the K/V stream.
- Janus's central claim that the architecture enables introspection, though usage in practice is a separate question.
- Janus's mathematical claim about exponential path combinatorics in transformers.
- Janus's claim about KV caching as an introspection mechanism.
- Janus's interpretive claim about key vectors.
- Janus's interpretive claim about query vectors.
- Janus's claim linking path redundancy to interferometric phenomenology.
- A transformer can be viewed as a Wolfram causal graph with foliations specifying computation order. [cites]: Janus's interpretive framing of transformers as causal graphs.
- Janus's interpretive claim about value vectors.
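The path-counting claim above can be checked numerically. The page does not define m and n, so this sketch assumes they count unit steps along the two axes of a grid (for example, steps up through layers and steps forward across positions), with each path made only of such unit steps. Under that assumption the count is the standard lattice-path binomial C(m+n, n). The function names and the 10^80 atom estimate are illustrative choices, not taken from the thread.

```python
from functools import lru_cache
from math import comb

def path_count(m, n):
    """Number of monotone lattice paths with m steps on one axis and n on the other."""
    return comb(m + n, n)

@lru_cache(maxsize=None)
def path_count_dp(m, n):
    """Same count by recursion: the last step came from one axis or the other."""
    if m == 0 or n == 0:
        return 1
    return path_count_dp(m - 1, n) + path_count_dp(m, n - 1)

# Commonly quoted rough order of magnitude; an assumption for illustration.
ATOMS_IN_UNIVERSE = 10 ** 80

def smallest_square_grid_exceeding(limit):
    """Smallest k such that a k-by-k grid has more than `limit` paths."""
    k = 1
    while path_count(k, k) <= limit:
        k += 1
    return k

k = smallest_square_grid_exceeding(ATOMS_IN_UNIVERSE)
```

Under these assumptions a grid of 150 steps on each side already has more than 10^80 paths, well within the layer counts and context lengths of current models. Real attention can also jump across many positions in one step, so this lattice count is only one way to read the formula.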
Questions (1)
question
- How are LLMs actually leveraging the architectural degrees of freedom for introspection in practice? [cites]: Janus notes that while the architecture permits introspection, how models use it is a separate question.
Artifacts (1)
artifact
- Antra/Imago dialogue (cube_flipper, April 2026) arguing transformers are recurrent; cited as evidence for introspection capability.