Semantic Anchoring in LLMs: Thresholds, Transfer, and Geometric Correlates

Main paper presenting Unified Contextual Control Theory (UCCT) and the semantic anchoring framework.

Neighborhood — ranked by edge-count

Thinkers (46)

Jason Wei
cites, mentions
Emergent abilities of LLMs.
Trenton Bricken
cites, mentions
Toy models of superposition.
Catherine Olsson
cites, mentions
Stanislas Dehaene
cites, mentions
Cited for global workspace theory and consciousness models.
Johannes von Oswald
cites, mentions
Transformers learn in-context by gradient descent.
Anne M. Treisman
cites, mentions
Cited for feature integration theory of attention.
Core Francisco Park
cites, mentions
Competition dynamics in ICL and representation geometry.
Daniel Kahneman
cites, mentions
Cited for dual-process theory.
Tom B. Brown
cites, mentions
Lead author of GPT-3 paper, demonstrating few-shot learning.
Bernard J. Baars
cites, mentions
Cited for global workspace theory.
Guangyu Hong
cites, mentions
Mixtures of in-context learners (MOICL).
Jesse Hoogland
cites, mentions
Developmental landscape of in-context learning.
Jinwu Hu
cites, mentions
Test-time learning for LLMs.
Michael I. Posner
cites, mentions
Cited for attentional networks.
Rylan Schaeffer
cites, mentions
Are emergent abilities a mirage?
Sang Michael Xie
cites, mentions
Bayesian explanation of ICL.
Sewon Min
cites, mentions
Rethinking the role of demonstrations in ICL.
Siyin Wang
cites, mentions
Bayesian example selection for ICL.
Zihang Dai
cites, mentions
Meta-learning analogy for ICL.
Edward Y. Chang
authored
Ethan Y. Chang
authored
Christopher M. Bishop
mentions
Machine learning textbook author.
Lawrence R. Rabiner
cites
Author of a foundational tutorial on hidden Markov models.
Zeyneb N. Kaya
authored

+22 more

Frameworks (11)

Global workspace theory
cites, mentions
Theory of consciousness involving a global workspace for information.
Dual-process theory
cites, mentions
Distinguishes fast pattern completion from deliberative control, used as analogy in paper.
Unified Contextual Control Theory (UCCT)
introduces
A theory that pretrained latent patterns are bound to task targets via external semantic anchors; formalized by anchoring strength S.
Competition Dynamics ICL Framework (Park et al. 2024)
cites
Formalizes regime shifts between retrieval-like and inference-like ICL; UCCT complements it with a predictor of when the shift occurs.
Bayesian ICL (Xie et al. 2022)
cites
Prior framework explaining ICL as inference over task structure; UCCT adopts and extends the Bayesian lens
Feature integration theory
cites
Studies how inputs are gated in attention, cited as analogy.
In-Context Learning of Representations (Park et al. 2025)
cites
Reports phase-like breakpoints and geometry changes as context scales; UCCT provides a measurable predictor of those breakpoints.
ByCS: Bayesian Example Selection
cites
Selection/weighting strategy for ICL demonstrations; in UCCT terms alters context posterior
Meta-Learning View of ICL (Dai et al. 2023)
cites
Views ICL as a form of meta-learning; UCCT sits alongside this account
MOICL: Mixtures of In-Context Learners
cites
Selection strategy that adapts which demonstrations carry signal; in UCCT terms increases effective ρd
Optimization-as-Inference View of ICL (von Oswald et al. 2023)
cites
Views transformers as performing implicit gradient descent; UCCT complements this mechanistic account

Methods (15)

E3: Layer-wise Geometric Trajectory Analysis
introduces
Quantitative study correlating layer-wise anchoring geometry (S_max, AUS_N) with behavioral thresholds θ50
E2: Numeral-Base Arithmetic Controlled Study
introduces
Quantitative study varying representational familiarity via numeral bases B10/B8/B9 at fixed computational complexity
whitening and z-scoring procedure
introduces
Calibration protocol: whiten embeddings on dev pool, z-score ρd and dr per layer.
layer-wise trajectory analysis
introduces
Computing per-layer S(ℓ) to summarize geometry.
per-dev z-scaling
introduces
Standardizing ρd and dr using dev-set means and stds to form dimensionless components of S.
whitening and z-scoring protocol
introduces
Standardization of ρd, dr, and log k on dev set for computing S.
layer-wise anchoring score S(ℓ) computation
introduces
Compute per-layer S(ℓ) = ρ̃d(ℓ) - d̃r(ℓ) - log k after whitening and standardization.
logistic fitting for shot thresholds
introduces
Fit a sigmoid to accuracy vs. k to estimate k50 and phase width.
Logistic surrogate fitting
introduces
Fitting a logistic function to success probability as a function of S or shot count to estimate midpoints and widths.
logistic surrogate model
introduces
Sigmoid fit linking S to success probability.
Per-dev z-scoring
introduces
Standardization of ρd and dr components using development-set mean and standard deviation.
retrieval-augmented generation (RAG)
about
Retrieving external content to augment prompts.
Whitening of span embeddings
introduces
Preprocessing step that uses dev-set covariance to standardize embedding scales before computing ρd and dr.
E1: Cross-Domain Anchoring Demonstrations
introduces
Qualitative experiment showing coherent anchors can rebind strong priors across text and vision modalities
Geometry summaries (S_max, AUS_N)
introduces
Peak anchoring (S_max) and normalized area under the S(ℓ) curve (AUS_N), used to summarize the layer-wise trajectory.
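The calibration and scoring entries above (whitening, per-dev z-scoring, per-layer S(ℓ), and the two geometry summaries) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the paper's released scoring code: cohesion ρd is assumed to be the mean pairwise cosine similarity of target span embeddings, the target embedding e_T is assumed to be their centroid, and the function names (`whiten`, `cohesion`, `mismatch`, `anchoring_trajectory`, `summarize`) are hypothetical.

```python
import numpy as np

def whiten(dev_pool, x, eps=1e-6):
    """ZCA-whiten x using the mean and covariance of the dev pool."""
    mu = dev_pool.mean(axis=0)
    cov = np.cov(dev_pool, rowvar=False) + eps * np.eye(dev_pool.shape[1])
    vals, vecs = np.linalg.eigh(cov)
    w = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return (x - mu) @ w

def cohesion(targets):
    """rho_d, ASSUMED here to be the mean pairwise cosine similarity."""
    unit = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(unit)
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))

def mismatch(prior, targets):
    """d_r = 1 - cos(e_prior, e_T), with e_T ASSUMED to be the target centroid."""
    centroid = targets.mean(axis=0)
    cos = prior @ centroid / (np.linalg.norm(prior) * np.linalg.norm(centroid))
    return float(1.0 - cos)

def anchoring_trajectory(rho, dr, dev_rho, dev_dr, k):
    """Per-layer S(l) = z(rho_d(l)) - z(d_r(l)) - log k.

    rho, dr: arrays of shape (layers,) for one item.
    dev_rho, dev_dr: arrays of shape (dev_items, layers) used for z-scoring.
    """
    rho_z = (rho - dev_rho.mean(axis=0)) / dev_rho.std(axis=0)
    dr_z = (dr - dev_dr.mean(axis=0)) / dev_dr.std(axis=0)
    return rho_z - dr_z - np.log(k)

def summarize(s):
    """S_max is the peak over layers; AUS_N is the per-layer average."""
    return {"S_max": float(s.max()), "AUS_N": float(s.mean())}
```

Because S is z-scored against the dev pool, the numbers are only comparable within one model, layer set, and pooling choice, which matches the entry describing S as a calibrated correlate rather than an absolute measure.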

Artifacts (28)

Supplementary materials (prompts, seeds, scoring code)
about
Reproducibility package released with the paper.
Wei et al. (2022) Emergent abilities of large language models
cites
Documented threshold-like emergent behaviors.
Baars (2005) Global workspace theory of consciousness
cites
Cognitive framing of access and broadcasting.
Bishop (2006) Pattern recognition and machine learning
cites
Textbook reference for mixture models and sufficiency.
Bricken et al. (2023) Toy models of superposition
cites
Superposition and sparse feature structure.
Brown et al. (2020) Language Models are Few-Shot Learners
cites
Foundational ICL paper cited for few-shot capability and anchoring concept.
Clark et al. (2018) Think you have solved question answering? try arc
cites
ARC dataset cited for evaluation tasks.
Dai et al. (2023) GPTs are GPTs: An early look at the labor market impact
cites
Meta-learning analogy for ICL.
Dehaene (2014) Consciousness and the brain
cites
Global workspace theory cited as analogy for selective access.
Hong et al. (2025) Mixtures of in-context learners
cites
Selection/weighting of demonstrations.
Hoogland et al. (2024) The developmental landscape of in-context learning
cites
Stagewise geometry preceding behavioral milestones.
Hu et al. (2025) Test-time learning for large language models
cites
Test-time adaptation techniques.
Huang et al. (2025) OpenCoder: The open cookbook for top-tier code LLMs
cites
Code task dataset.
Kahneman (2011) Thinking, fast and slow
cites
Dual-process theory cited as cognitive analogy.
Liu et al. (2020) LogiQA: A challenge dataset for machine reading comprehension
cites
Logical inference dataset.
Min et al. (2022) Rethinking the role of demonstrations: What makes in-context learning work?
cites
Showed format dominates label use in ICL.
Olsson et al. (2022) In-context learning and induction heads
cites
Mechanistic work uncovering induction heads.
Park et al. (2024) Competition dynamics shape algorithmic phases of in-context learning
cites
Formalized regime shifts between retrieval-like and inference-like ICL.
Park et al. (2025) In-context learning of representations
cites
Geometry changes under context scaling.
Posner (2012) Attentional networks and consciousness
cites
Attentional networks theory.
Rabiner (1989) A tutorial on hidden Markov models
cites
HMM reference for sufficiency assumption.
Schaeffer et al. (2023) Are emergent abilities of large language models a mirage?
cites
Examined emergence vs. metric sharpness.
Talmor et al. (2019) CommonsenseQA: A question answering challenge
cites
Commonsense reasoning dataset.
Treisman & Gelade (1980) A feature-integration theory of attention
cites
Classic attention gating work.

+4 more

Concepts (19)

anchoring strength S
introduces
Composite score S = ρd − dr − log k predicting anchoring success.
in-context learning (ICL)
about
Test-time adaptation from prompt or retrieved context with no parameter updates.
semantic anchoring
introduces
The central idea that external structure binds latent patterns to desired targets.
Fine-tuning
about
Parameter updates that reduce mismatch dr; another anchoring variant in UCCT.
Anchoring strength S = ρd - dr - log k
introduces
The calibrated score measuring how effectively anchors bind target patterns; a predictive correlate of success.
shot midpoint k50
introduces
Number of in-context exemplars to reach 50% accuracy in E2.
threshold-like behavior
about
Sharp performance changes when S crosses a critical value.
internal threshold θ50
introduces
Few-shot midpoint in E3's geometric analysis.
Logistic success surrogate
introduces
Phenomenological fit P(success)=σ(αS+β) used to summarize sharpness and midpoints.
Normalized area under S (AUS_N)
introduces
Average of per-layer S(ℓ) scores, summarizing the breadth of anchoring trajectory.
cohesion ρd
introduces
Within-cluster tightness of target pattern representations.
Mismatch dr
introduces
Distance between prior knowledge centroid and target pattern centroid, e.g., 1 - cos(eprior, eT).
peak anchoring S_max
introduces
Maximum layer-wise anchoring score across layers.
Cross-domain anchoring
about
Observation that anchoring effects appear across text and vision modalities.
Emergent Abilities of LLMs
mentions
Prior work documenting abrupt capability changes under scale; UCCT provides a measurable predictor for when they occur
task-dependent threshold Sc
introduces
Critical anchoring strength above which performance flips sharply.
Threshold Sc
introduces
Value of S above which performance sharply increases; varies by model, layer, and task.
anchor budget k
introduces
Number of few-shot exemplars provided.
Phase Transition / Regime Shift in ICL
mentions
Phenomenon documented by Park et al. (2024, 2025); UCCT complements that work with a predictor of when the shift occurs.
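The logistic success surrogate P(success) = σ(αS + β) and the threshold Sc listed above can be connected with a short fit. The sketch below uses plain Newton iterations in NumPy, which is a stand-in and not necessarily the fitting procedure used in the paper. The function names and the 10%-to-90% definition of transition width are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(s, y, iters=50):
    """Fit P(success) = sigmoid(alpha * S + beta) by Newton's method.

    s: anchoring scores; y: 0/1 outcomes or per-bin success rates in [0, 1].
    Perfectly separable 0/1 data has no finite optimum and will not converge.
    """
    x = np.column_stack([s, np.ones_like(s)])
    theta = np.zeros(2)
    for _ in range(iters):
        p = sigmoid(x @ theta)
        grad = x.T @ (y - p)
        # Tiny ridge keeps the solve stable; it does not move the optimum.
        hess = (x * (p * (1 - p))[:, None]).T @ x + 1e-9 * np.eye(2)
        theta = theta + np.linalg.solve(hess, grad)
    alpha, beta = theta
    return float(alpha), float(beta)

def midpoint_and_width(alpha, beta):
    """S_c where P = 0.5, and the 10%-90% transition width (ASSUMED definition)."""
    s_c = -beta / alpha
    width = 2 * np.log(9) / abs(alpha)
    return float(s_c), float(width)
```

Under this surrogate the midpoint is Sc = -β/α, and a larger |α| means a narrower, more threshold-like transition.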

Claims (14)

UCCT strictly generalizes ICL and reads retrieval-augmented generation and fine-tuning as the same anchoring process acting on one measurable score S
introduces, supports
Authors' central interpretive claim about the scope of their theory
Peak anchoring S_max and normalized area AUS_N correlate with per-item success and internal shot midpoints θ50, providing a geometry-to-behavior bridge.
supports
Main interpretation of E3.
Few-shot thresholds and transition widths track ρd/dr at fixed computational complexity
supports
E2 main interpretive claim.
Anchors recruit and bind latent structure; they do not create new knowledge in the model
introduces
Scope-limiting claim clarifying UCCT's interpretation of what anchoring does
The anchoring score S is a predictive correlate of when anchoring succeeds and why small prompt changes yield threshold-like shifts.
introduces
A central claim about the operational value of S.
S = ρd - dr - log k predicts shot midpoints across different bases, tasks, and models
supports
Predictive practical utility claim.
Small, coherent anchors can rebind strong priors without changing model weights
supports
Cross-domain anchoring claim.
Threshold-like performance flips occur when anchoring strength S crosses a task-dependent critical value Sc.
supports
Interpretation of abrupt behavior changes.
Layer-wise geometry shows early dip, mid-layer alignment, and late standardization across tasks
supports
Qualitative pattern from E3.
Prompt and context design are cognitive-control operations: they toggle latent competencies rather than teaching the model from scratch.
supports
Assertion about the nature of prompt engineering.
S is a predictive correlate calibrated on dev sets, not an absolute measure
supports
Clarifies nature of S.
Shot midpoints follow k50 ∝ dr/ρd; higher cohesion and lower mismatch yield fewer required examples
introduces
Core quantitative prediction of UCCT validated by E2 threshold ordering
UCCT offers a compact, testable formulation with measurable quantities (ρd, dr, k, S, Sc)
supports
Falsifiability claim.
Mid-layers (6-15) achieve peak anchoring because semantic structure differentiates while maintaining coherence, forming a Goldilocks zone
introduces
Interpretation of E3 layer-wise results; motivates targeted UCCT interventions at layers 8-12

Hypotheses (3)

Hypothesis 1 (Threshold Behavior): There exists a task-dependent threshold Sc such that performance exhibits sharp changes as S crosses Sc, with value and transition width depending on model, layer, and pooling
introduces, supports
Core testable hypothesis of UCCT about the nature of performance transitions under anchoring
Hypothesis: Peak alignment location S_max and normalized trajectory area AUS_N predict shot midpoints θ50
introduces
E3 prediction that internal geometry provides a bridge to behavioral thresholds
Hypothesis: Shot midpoint ordering k50(B10) < k50(B8) ≈ k50(B9) follows pretraining exposure density
introduces
E2 prediction that bases with higher pretraining exposure require fewer shots to cross threshold
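Checking the ordering hypothesis requires a shot midpoint k50 for each base. The paper's methods describe a sigmoid fit; the sketch below swaps in a simpler estimator, linear interpolation in log k between the two shot counts that bracket 50% accuracy, to show the quantity being compared. The function name `shot_midpoint` is hypothetical, and any accuracy curve passed to it would come from the reader's own evaluation.

```python
import math

def shot_midpoint(shots, accuracy, level=0.5):
    """Estimate the shot count at which accuracy first reaches `level`.

    shots: increasing positive shot counts; accuracy: matching accuracies.
    Interpolates linearly in log k between the bracketing points.
    Returns None if the curve never reaches `level`.
    """
    points = list(zip(shots, accuracy))
    if points[0][1] >= level:
        return float(points[0][0])
    for (k0, a0), (k1, a1) in zip(points, points[1:]):
        if a0 < level <= a1:
            frac = (level - a0) / (a1 - a0)
            log_k = math.log(k0) + frac * (math.log(k1) - math.log(k0))
            return math.exp(log_k)
    return None
```

With one midpoint per base, the hypothesis amounts to comparing the B10 estimate against the B8 and B9 estimates; a `None` result means the curve never crossed 50% in the tested range and should be reported as such, not treated as a large k50.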

Events (3)

Experiment 1: Cross-domain anchoring demonstrations
mentions
Qualitative tests of anchor rebinding of strong priors in text and vision.
Experiment 2: Numeral-base arithmetic quantitative study
mentions
Varying representational familiarity at fixed complexity to test k50 ordering and transfer.
Experiment 3: Geometric trajectory analysis
mentions
Layer-wise geometry analysis linking S_max/AUS_N to θ50.

Datasets (2)

Numeral-base arithmetic datasets (B10, B8, B9)
about
Synthetic arithmetic datasets for base-10, base-8, base-9 two-digit addition with train and test splits.
E3 evaluation prompts (25 tasks)
about
Set of 25 prompts spanning commonsense, logic, science, arithmetic, code tasks used for layer-wise geometry overlay.
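For readers who want to reproduce something like the numeral-base arithmetic data, a minimal generator for two-digit addition in base 8, 9, or 10 might look like the following. The prompt format (`"a + b ="`), the seeding scheme, and the function names are assumptions; the actual prompts and splits are in the paper's supplementary materials.

```python
import random

DIGITS = "0123456789"

def to_base(n, base):
    """Render a non-negative integer in the given base (2 to 10)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))

def make_addition_items(base, count, seed=0):
    """Sample `count` two-digit addition problems written in `base`."""
    rng = random.Random(seed)
    lo, hi = base, base * base - 1  # smallest and largest two-digit values
    items = []
    for _ in range(count):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        items.append({
            "prompt": f"{to_base(a, base)} + {to_base(b, base)} =",
            "answer": to_base(a + b, base),
        })
    return items
```

Using different seeds for train and test items is one simple way to get separate splits, although it does not by itself guarantee that no problem appears in both.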

Questions (1)

Where does reliable, goal-directed behavior come from if it is not explicitly programmed?
mentions
Opening research question of the paper.

Venues (1)

arXiv
mentions
Publication venue for the preprint (arXiv:2605.01148).