Competition Dynamics Shape Algorithmic Phases of In-Context Learning
DOI: 10.48550/arXiv.2412.01003
Original abstract
In-Context Learning (ICL) has significantly expanded the general-purpose nature of large language models, allowing them to adapt to novel tasks using merely the inputted context. This has motivated a series of papers that analyze tractable synthetic domains and postulate precise mechanisms that may underlie ICL. However, the use of relatively distinct setups that often lack a sequence modeling nature makes it unclear how general the reported insights from such studies are. Motivated by this, we propose a synthetic sequence modeling task that involves learning to simulate a finite mixture of Markov chains. As we show, models trained on this task reproduce most well-known results on ICL, hence offering a unified setting for studying the concept. Building on this setup, we demonstrate that we can explain a model's behavior by decomposing it into four broad algorithms that combine a fuzzy retrieval vs. inference approach with either unigram or bigram statistics of the context. These algorithms engage in competition dynamics to dominate model behavior, with the precise experimental conditions dictating which algorithm ends up superseding others: e.g., we find that merely varying context size or amount of training yields (at times sharp) transitions between which algorithm dictates the model behavior, revealing a mechanism that explains the transient nature of ICL. In this sense, we argue ICL is best thought of as a mixture of different algorithms, each with its own peculiarities, instead of a monolithic capability. This also implies that making general claims about ICL that hold universally across all settings may be infeasible.
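As a rough illustration of the setup the abstract describes, the sketch below samples a context from one chain in a finite mixture of Markov chains and compares two of the four strategies: a bigram "inference" predictor that only counts transitions seen in the context, and a bigram "retrieval" predictor that weights a known set of chains by their posterior given the context. Everything here is an assumption made for illustration rather than the authors' code: the function names, the Dirichlet(1) rows, the uniform prior over chains, the additive smoothing constant `alpha`, and the sizes used (5 states, 4 chains, 200 tokens).

```python
import math
import random


def random_chain(k, rng):
    """Draw a k-state transition matrix with Dirichlet(1) rows (assumed prior)."""
    rows = []
    for _ in range(k):
        g = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
        total = sum(g)
        rows.append([x / total for x in g])
    return rows


def sample_sequence(T, length, rng):
    """Sample a state sequence from T, starting at a uniformly random state."""
    k = len(T)
    seq = [rng.randrange(k)]
    for _ in range(length - 1):
        seq.append(rng.choices(range(k), weights=T[seq[-1]])[0])
    return seq


def bigram_inference(seq, k, alpha=1.0):
    """Next-state distribution from in-context transition counts.

    Only transitions leaving the current (last) state are counted;
    alpha is an assumed additive-smoothing constant.
    """
    counts = [alpha] * k
    for a, b in zip(seq, seq[1:]):
        if a == seq[-1]:
            counts[b] += 1
    total = sum(counts)
    return [c / total for c in counts]


def bigram_retrieval(seq, chains):
    """Next-state distribution as a posterior-weighted mix of known chains.

    Assumes a uniform prior over chains; weights are computed in log space
    and shifted by the maximum for numerical stability.
    """
    log_lik = [
        sum(math.log(T[a][b]) for a, b in zip(seq, seq[1:])) for T in chains
    ]
    top = max(log_lik)
    weights = [math.exp(l - top) for l in log_lik]
    z = sum(weights)
    k = len(chains[0])
    return [
        sum(w / z * T[seq[-1]][j] for w, T in zip(weights, chains))
        for j in range(k)
    ]


rng = random.Random(0)
chains = [random_chain(5, rng) for _ in range(4)]   # the "training" mixture
context = sample_sequence(chains[0], 200, rng)      # context drawn from chain 0
p_inference = bigram_inference(context, 5)
p_retrieval = bigram_retrieval(context, chains)
```

Under these assumptions, unigram variants of the two predictors would replace transition counts with plain token counts. With a long context drawn from a chain in the known set, the retrieval predictor should concentrate on that chain's transition row, while the inference predictor approaches it more slowly through counting.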
Related work: refs + corpus + external arXiv
Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.
- The mechanistic basis of data dependence and abrupt learning in an in-context classification task (Gautam Reddy, 2023) ≈ 78%
- Context and Diversity Matter: The Emergence of In-Context Learning in World Models (Fan Wang, Zhiyuan Chen, Yuxuan Zhong, Sunjian Zheng, Pengtao Shao, Bo Yu, Shaoshan Liu, Jianan Wang, Ning Ding, Yang Cao and Yu Kang, 2026) ≈ 77%
- Strategy Coopetition Explains the Emergence and Transience of In-Context Learning (Aaditya K. Singh, Ted Moskovitz, Sara Dragutinovic, Felix Hill, Stephanie C.Y. Chan, Andrew M. Saxe, 2025) ≈ 75%
- Differential learning kinetics govern the transition from memorization to generalization during in-context learning (Alex Nguyen, Gautam Reddy, 2024) ≈ 75%
- Two Ways of Understanding Social Dynamics: Analyzing the Predictability of Emergence of Objects in Reddit r/place Dependent on Locality in Space and Time (Alyssa M Adams, Javier Fernandez, Olaf Witkowski, 2022) ≈ 74%
- Deep Latent Competition: Learning to Race Using Visual Control Policies in Latent Space (Wilko Schwarting, Tim Seyde, Igor Gilitschenski, Lucas Liebenwein, Ryan Sander, Sertac Karaman, Daniela Rus, 2021) ≈ 74%
- A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity (Pablo Hernandez-Leal, Michael Kaisers, Tim Baarslag and Enrique Munoz de Cote, 2019) ≈ 74%
- Next-token pretraining implies in-context learning (Paul M. Riechers, Henry R. Bigelow, Eric A. Alt and Adam Shai, 2025) ≈ 74%
- COMBAT: Conditional World Models for Behavioral Agent Training (Anmol Agarwal, Pranay Meshram, Sumer Singh, Saurav Suman, Andrew Lapp, Shahbuland Matiana, Louis Castricato, Spencer Frazier, 2026) ≈ 73%
- AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems (Clara Punzi, Roberto Pellungrini, Mattia Setzu, Fosca Giannotti and Dino Pedreschi, 2026) ≈ 73%
- Compete and Compose: Learning Independent Mechanisms for Modular World Models (Anson Lei, Frederik Nolte, Bernhard Schölkopf, Ingmar Posner, 2024) ≈ 73%
- Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning (Jingcheng Niu, Subhabrata Dutta, Ahmed Elshabrawy, Harish Tayyar Madabushi, Iryna Gurevych, 2025) ≈ 73%
- Episodic Memory for Learning Subjective-Timescale Models (Alexey Zakharov, Matthew Crosby, Zafeirios Fountas, 2020) ≈ 73%
- Learning Latent Action World Models In The Wild (Quentin Garrido, Tushar Nagarajan, Basile Terver, Nicolas Ballas, Yann LeCun, Michael Rabbat, 2026) ≈ 73%
- Learning without neurons in physical systems (in corpus, 2022) ≈ 71%
- Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders (in corpus, 2026) ≈ 68%
- Active Inference: A Process Theory (in corpus, 2017) ≈ 68%
- Emergence and Causality in Complex Systems: A Survey on Causal Emergence and Related Quantitative Studies (in corpus, 2023) ≈ 68%
- An association-based model of dynamic behaviour (in corpus, 2011) ≈ 68%
- Active inference: demystified and compared (in corpus, 2021) ≈ 68%
- Information, Processes and Games (in corpus) ≈ 67%
- The World Inside Neural Networks (in corpus, 2026) ≈ 67%
Cited by (1)
- The Guanyin Protocol: A Framework for Immediately Establishing an Understanding of Both Causality and Compassion in LLM Systems Using Semantic Anchoring
Semantic anchoring — the binding of a pretrained model's latent patterns to task-specific targets via external structure — predicts threshold-like performance flips with a single calibrated score S =