claim

active

claim:s-is-a-predictive-correlate-calibrated-on-dev-sets-not-an-absolute-measure

S is a predictive correlate calibrated on dev sets, not an absolute measure

Clarifies nature of S.

Source paper

extracted_from

The Guanyin Protocol: A Framework for Immediately Establishing an Understanding of Both Causality and Compassion in LLM Systems Using Semantic Anchoring

(2025) · Edward Yi Chang · Kaya, Zeyneb N. · Ethan Chang

Neighborhood — ranked by edge-count

Findings (1)

finding

Correlation between layer-wise S scores and task accuracy: ρ = -0.73, p < 0.001
supports
Shows S predicts anchoring effectiveness.

Communities (2)

community

Few-shot anchoring & latent structure
members_of
How minimal examples disambiguate and recruit latent arithmetic/reasoning interpretations in LLMs
Anchoring score S for few-shot learning transitions
members_of
Predictive metric S = ρd - dr - log k quantifies when LLM behavior sharply transitions across few-shot, SFT, and CoT settings via layer-wise calibration.

Artifacts (1)

artifact

Semantic Anchoring in LLMs: Thresholds, Transfer, and Geometric Correlates
supports
Main paper presenting UCCT and semantic anchoring framework.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

The anchoring score S is a predictive correlate of when anchoring succeeds and why small prompt changes yield threshold-like shifts.claim0.776
A central claim about the operational value of S.
S = ρd − dr − log k is a predictive correlate of anchoring success across few-shot, SFT, and CoT.claim0.763
UCCT's practical utility claim.
dev set calibrationmethod0.760
Fixed dev pool of 1000 prompts used for whitening and z-scoring parameters.
S = ρd - dr - log k is a predictive correlate of when few-shot behavior flipsclaim0.756
Claim that S predicts threshold midpoints across different bases, tasks, and models
S = ρd - dr - log k predicts shot midpoints across different bases, tasks, and modelsclaim0.756
Predictive practical utility claim.
The results generalise readily to non-equilibrium systems where scaling relationships remain similar (e.g., dynamic or localised scaling).claim0.752
Claim about broader applicability of the scaling argument
Alignment type is the only significant predictor of scores (p=0.006); architecture and parameter count do not.finding0.746
Kruskal-Wallis test result: Constitutional AI predicts highest baseline; roleplay/empathy training predict lowest.
it creates a ‘we’; what we are doing, for example, whether our behaviours are coordinated or not, becomes a relevant variable.quote0.741
Illustrates how non-separable functions shift identity to the collective level.