claim
active
claim:transformers-use-an-anti-markovian-solution-that-recomputes-relevant-numeric-information-at-each-step-in-the-multi-object-task
Transformers use an anti-Markovian solution that recomputes relevant numeric information at each step in the Multi-Object task
Prior finding from Grant et al. 2025, used to interpret the low MAS IIA observed in GRU-Transformer hidden-state comparisons.
Neighborhood — ranked by edge-count
Findings (1)
finding
- Validates MAS as a causal detector of representational differences invisible to correlative methods.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Evidence that in-context learning is not mere pattern matching but genuine optimization, relevant to applying the thesis to inference
- Claim formalizing the Anima Labs idea that transformers are effectively recurrent due to K/V stream.
- Antra's foundational claim about how introspection arises computationally rather than from memorised text.
- Transformers almost surely maintain input-injectivity throughout training, not just at initialisation · hypothesis · 0.774 · Conjecture supported by Nikolaou et al. 2025 for last-token hidden states
- Interpretive claim connecting exponential path combinatorics to Lindsey's layer-dependent findings.
- Strategy used by transformers that recomputes relevant numeric information at each step, unlike Markovian GRU solutions; detected by MAS but not by RSA/CKA.
- Janus's claim linking path redundancy to interferometric phenomenology.
- Core claim for two-layer models; composition creates qualitatively more powerful in-context learning