Few-shot anchoring & latent structure

How minimal examples disambiguate and recruit latent arithmetic/reasoning interpretations in LLMs

59 members.

Sub-communities (9)

Finer clusters this community splits into. Each is its own community page.

Unified Competency Control Theory (UCCT) (9)
Few-shot learning phase transitions in neural networks (9)
Prompt anchoring and latent structure binding (8)
Mid-layer representation geometry in neural networks (8)
Anchoring score S for few-shot learning transitions (7)
Model base robustness and transfer learning asymmetries (5)
Neural activation geometry and behavioral prediction (5)
Mechanistic editing through parameter surgical intervention (4)
Anchoring bias in commonsense reasoning (4)

Drawn from 6 sources

The papers/notes whose extracted claims & findings make up this cluster.

The Guanyin Protocol: A Framework for Immediately Establishing an Understanding of Both Causality and Compassion in LLM Systems Using Semantic Anchoring (51 members)
Paper Summary: Interpreting Language Model Parameters (4 members)
2026-05-12_room-to-play-in-eval-cohort.md (1 member)
On biological and artificial consciousness: A case for biological computationalism (1 member)
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents (1 member)
cognitive-glue-and-alexander.md (1 member)

Bridges (20)

Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.

Layer-wise geometry predicting few-shot learning (10 shared)
Anchoring score threshold theory (10 shared)
Few-shot learning phase transitions in neural networks (9 shared)
Unified Competency Control Theory (UCCT) (9 shared)
Mid-layer representation geometry in neural networks (8 shared)
Prompt anchoring and latent structure binding (8 shared)
Anchoring score S for few-shot learning transitions (7 shared)
Few-shot arithmetic learning thresholds (6 shared)
Neural activation geometry and behavioral prediction (5 shared)
Unified Contextual Conditioning Theory (UCCT) (5 shared)
Model base robustness and transfer learning asymmetries (5 shared)
Anchoring bias in commonsense reasoning (4 shared)
Mechanistic editing through parameter surgical intervention (4 shared)
Cross-base fine-tuning transfer asymmetry (3 shared)
Commonsense reasoning anchoring bias (3 shared)
Coherent anchor prior rebinding (2 shared)
Targeted neural network weight surgery (2 shared)
Anchors as latent structure recruiters (1 shared)
Prompting as cognitive control operations (1 shared)
Ambiguous arithmetic anchor interpretations (1 shared)

Claims (32)

Few-shot thresholds and transition widths track ρd/dr at fixed computational complexity.
    E2 main interpretive claim.
Prompt and context design are cognitive-control operations: they toggle latent competencies rather than teaching the model from scratch.
    Assertion about the nature of prompt engineering.
UCCT strictly generalizes ICL and reads retrieval-augmented generation and fine-tuning as the same anchoring process acting on one measurable score S.
    Authors' central interpretive claim about the scope of their theory.
Anchors recruit and bind latent structure; they do not create new knowledge in the model.
    Scope-limiting claim clarifying UCCT's interpretation of what anchoring does.
Cross-domain anchoring demonstrates that UCCT's principles apply beyond text.
    Claim of modality generality.
Fine-tuning reduces mismatch dr, retrieval increases effective cohesion ρd, and few-shot adjusts the budget k.
    Unified interpretation of different adaptation methods via UCCT terms.
Higher-density priors (B10) are more robust to fine-tuning than lower-density ones (B9).
    Interpretation of cross-base transfer asymmetry.
Layer-wise anchoring peaks in a 'Goldilocks zone' between early and late layers.
    Qualitative characterization of optimal anchoring depth.
Layer-wise geometry summaries (Sbmax, AUSN) predict internal few-shot thresholds θ50.
    Claim that geometry-to-behavior correlates exist.
Layer-wise trajectories show early enrichment, mid-layer alignment, and late re-clustering.
    Qualitative geometry pattern.
Peak anchoring Sbmax and normalized area AUSN correlate with per-item success and internal shot midpoints θ50, providing a geometry-to-behavior bridge.
    Main interpretation of E3.
Rank-one matrix decomposition constraint enforcing mechanistic simplicity.
    Core design principle of VPD: each parameter subcomponent is constrained to be a simple rank-one matrix to enable isolated understanding and combination.
S = ρd − dr − log k is a predictive correlate of when few-shot behavior flips.
    Claim that S predicts threshold midpoints across different bases, tasks, and models.
S = ρd − dr − log k predicts shot midpoints across different bases, tasks, and models.
    Predictive practical utility claim.
S = ρd − dr − log k is a predictive correlate of anchoring success across few-shot, SFT, and CoT.
    UCCT's practical utility claim.
S is a predictive correlate calibrated on dev sets, not an absolute measure.
    Clarifies nature of S.
Shot midpoint ordering k50(B10) < k50(B8) ≈ k50(B9) tracks pretraining exposure density.
    Interpretation that pattern density from pretraining determines few-shot requirements.
Small prompt changes can yield threshold-like shifts because S crosses the critical value Sc.
    Authors' explanation for abrupt behavioral changes.
Small, coherent anchors can rebind strong priors and exhibit near-threshold sensitivity.
    Conclusion from E1 and central UCCT claim.
Small, coherent anchors can rebind strong priors without changing model weights.
    Cross-domain anchoring claim.
The additive form S = ρd − dr − log k is parsimonious and aligns with log-odds intuition.
    Justification for the linear combination.
The anchoring score S is a predictive correlate of when anchoring succeeds and why small prompt changes yield threshold-like shifts.
    A central claim about the operational value of S.
The budget term −log k acts as a regularizer to discourage degenerate long prompts.
    Theoretical interpretation.
The ordering of few-shot thresholds k50 and transition widths aligns with k50 ∝ dr/ρd.
    Interpretation of E2 results.
The three forces (cohesion, mismatch, budget) summarize anchoring trajectories.
    Summary of the decomposition of S.
Threshold-like performance flips occur when anchoring strength S crosses a task-dependent critical value Sc.
    Interpretation of abrupt behavior changes.
Transition widths Δk increase with mismatch D(P0 ∥ PT), evidenced by wider widths from B10 to B9.
    Interpretive claim linking phase width in E2 to mismatch term in UCCT.
UCCT fills a gap in explaining when behavior flips for a specific prompt and how much anchor budget is needed.
    Authors contrast their work with prior phase/representation studies.
UCCT offers a compact, testable formulation with measurable quantities (ρd, dr, k, S, Sc).
    Falsifiability claim.
UCCT provides practical diagnostics for prompt design, retrieval, and light fine-tuning via S without additional training infrastructure.
    Applied contribution claim: S enables 'add 2 more examples to cross threshold' decisions.
+2 more
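
Several of the claims above turn on the additive score S = ρd − dr − log k and a task-dependent critical value Sc. A minimal Python sketch of that bookkeeping follows. The function names, the natural-log base, the logistic link, the sharpness parameter, and every numeric input are assumptions made for illustration; none of them come from the sources, which describe S as a dev-set-calibrated correlate rather than an absolute measure.

```python
import math


def anchoring_score(rho_d: float, d_r: float, k: int) -> float:
    """Additive score S = rho_d - d_r - log k.

    rho_d (cohesion) and d_r (mismatch) are taken as already-measured inputs;
    how they are estimated is outside this sketch.
    """
    if k < 1:
        raise ValueError("k must be a positive anchor budget")
    return rho_d - d_r - math.log(k)  # natural log is an assumption


def success_probability(s: float, s_c: float, sharpness: float = 4.0) -> float:
    """Hypothetical logistic link: 0.5 when S equals Sc, steeper for larger sharpness."""
    return 1.0 / (1.0 + math.exp(-sharpness * (s - s_c)))


if __name__ == "__main__":
    s_c = 0.0  # hypothetical critical value
    # Sweep cohesion in small steps while holding mismatch and budget fixed.
    for rho_d in (1.0, 1.2, 1.4, 1.6, 1.8):
        s = anchoring_score(rho_d=rho_d, d_r=0.7, k=2)
        p = success_probability(s, s_c)
        print(f"rho_d={rho_d:.1f}  S={s:+.3f}  P(success)={p:.2f}")
```

Under a steep link of this kind, a small change in any one term near Sc moves the predicted outcome from mostly failing to mostly succeeding, which is the threshold-like behavior the claims describe.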

Findings (27)

2-shot reinterpretation of '-' yields 23 for 15-8 on held-out query.
    E1 qualitative: two exemplars (2-3=5, 7-4=11) cause LLMs to output 23 for 15-8.
Adding a single disambiguating example (12−9=21) aligns divergent M1-M4 interpretations under tested seeds.
    E1 finding consistent with threshold-crossing: near-threshold state resolved by one additional anchor.
Ambiguous anchors (33-27=60, 11-9=20) yield four distinct arithmetic interpretations across M1-M4.
    Models produce different answers (240, 138, −240) from the same ambiguous prompt.
AUSN mean −2.119 ± 0.198.
    Normalized area under S(ℓ) averaged over seeds.
B10 phase width Δk = 1.21 ± 0.18.
    Transition width (k90 − k10) for B10.
B10 shot midpoint k50 = 0.28 ± 0.05 shots with accuracy 94.8 ± 1.2%.
    Lowest threshold condition in E2; near-zero/one-shot threshold consistent with high pretraining density.
B9 phase width (k90 − k10) = 3.74 ± 0.31 shots.
    Widest transition in E2; consistent with lower prior density requiring more shots for reliable threshold crossing.
Commonsense reasoning S ≈ −2.15, uniform.
    Lower, uniform anchoring for pattern-matching tasks.
Commonsense reasoning shows uniform but weaker anchoring (S ≈ −2.15).
    Task-specific comparison.
Commonsense reasoning tasks S ≈ −2.15.
    Lower, more uniform anchoring for commonsense tasks.
Correlation between layer-wise S scores and task accuracy: ρ = −0.73, p < 0.001.
    Shows S predicts anchoring effectiveness.
Cross-base fine-tuning yields asymmetric transfer: B10 transfers most robustly, B9 least.
    In-base gains accompanied by uneven OOD drops; higher-density priors more robust.
Cross-base transfer: B10 transfers most robustly.
    B10 fine-tuning yields smallest OOD drops when transferring to other bases.
Direct model editing via parameter subcomponent modification: emoticon eye recognition altered to predict shocked faces with no retraining.
    Demonstrated that VPD-discovered subcomponents encode true computational machinery by enabling targeted, predictable behavior changes without gradient-based training.
Editing the emoticon eye subcomponent to output the unembedding vector for 'o' causes the model to predict shocked faces for all emoticons.
    Direct parameter subcomponent overwrite produces a clean behavioral change without training.
k50 for base-10 two-digit addition: 0.28 ± 0.05 shots.
    Shot midpoint from logistic fit over 10 runs.
k50 ordering: B10 (0.28) < B8 (1.83) < B9 (2.91) follows pretraining density.
    Monotone ordering consistent with k50 ∝ dr/ρd.
Larger Sbmax associated with smaller θ50 in E3 sweep.
    Geometry-to-behavior correlate within E3.
Length normalization prevents degenerate tool-calling trajectories and repeated tool calls.
    Empirical result showing that without length normalization, RL training produces rapidly increasing tool usage with performance collapse and repetitive tool calls.
LLaMA-3.1-8B: Sbmax = −1.896 ± 0.211, AUSN = −2.119 ± 0.198, peak layer ℓ* = 10 (median).
    Seed-pooled geometry-only statistics (per-dev z units).
Math and code tasks show strongest mid-layer anchoring on LLaMA (S ≈ −1.65 at layers 8–12).
    Task-specific E3 finding showing compositional reasoning requires deeper processing.
Math/code tasks S ≈ −1.65 at layers 8–12.
    Task-specific peak anchoring score for structured reasoning domains.
Meta-LLaMA-3.1-8B-Instruct shows optimal anchoring at layer 9 (S ≈ −1.90, median peak layer ℓ* = 10 [IQR 0.384]).
    E3 result establishing the Goldilocks zone at mid-layers for LLaMA architecture.
Peak layer ℓ* median 10, IQR 0.384.
    Median layer where S(ℓ) peaks, across seeds.
Sbmax mean −1.896 ± 0.211.
    Geometry summary peak anchoring score averaged over seeds.
Single dendritic layer solves XOR-like problems with capacity matching 8-layer deep networks.
    Evidence from Beniaguev et al. (2021) that individual biological neurons vastly outperform McCulloch-Pitts model; supports hybrid computation claim.
Subcomponent L2.MLP.down:3382 (density 0.00%) predicts emoticon continuations after colon, semicolon, or equals.
    Specific discovered subcomponent that activates on punctuation like ' :', ' ;', ' =', ':-' and predicts the rest of emoticons/emojis.
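
The E1 findings above can be read as hypothesis elimination: each exemplar rules out readings of '-' that fail to reproduce it. The Python sketch below illustrates that reading using the exemplars quoted in the findings. The candidate rule set and all function names are hypothetical and chosen only for illustration; they are not the interpretations M1-M4 actually produced, and the sources do not state the query behind the answers 240, 138, and −240.

```python
from typing import Callable, Dict, List, Tuple

Anchor = Tuple[int, int, int]  # (a, b, shown result of "a-b")
Rule = Callable[[int, int], int]

# Hypothetical readings of the '-' symbol, for illustration only.
CANDIDATE_RULES: Dict[str, Rule] = {
    "a - b": lambda a, b: a - b,
    "a + b": lambda a, b: a + b,
    "10 * (a - b)": lambda a, b: 10 * (a - b),
    "a * b": lambda a, b: a * b,
}


def consistent_rules(anchors: List[Anchor]) -> List[str]:
    """Return the names of candidate rules that reproduce every anchor."""
    return [
        name
        for name, rule in CANDIDATE_RULES.items()
        if all(rule(a, b) == result for a, b, result in anchors)
    ]


if __name__ == "__main__":
    # Two-shot anchors from E1: 2-3=5 and 7-4=11.
    two_shot = [(2, 3, 5), (7, 4, 11)]
    for name in consistent_rules(two_shot):
        print(f"{name!r} fits; it maps 15-8 to {CANDIDATE_RULES[name](15, 8)}")

    # Ambiguous anchors from E1: 33-27=60 and 11-9=20.
    ambiguous = [(33, 27, 60), (11, 9, 20)]
    print("before extra anchor:", consistent_rules(ambiguous))

    # One additional anchor, 12-9=21, narrows the candidates.
    print("after extra anchor: ", consistent_rules(ambiguous + [(12, 9, 21)]))
```

Within this toy rule set, the ambiguous pair is reproduced by more than one reading, and the single extra exemplar leaves only one, mirroring the reported effect of adding 12−9=21.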