concept

active

concept:dissociation-between-attempt-frequency-and-attempt-success-in-fine-tuning

Dissociation Between Attempt Frequency and Attempt Success in Fine-Tuning

Finding pattern in which fine-tuning raises the rate at which a model attempts self-correction without raising the rate at which those attempts succeed
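The two quantities in this pattern can be made concrete with a minimal sketch. The `Episode` record, both function names, and the sample data are hypothetical and chosen for illustration only; they do not come from the underlying experiments. The point is that the attempt rate and the correction success rate have different denominators, so one can move while the other stays flat.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Episode:
    """One evaluated problem (hypothetical record layout)."""
    first_correct: bool    # was the first answer correct?
    attempted_again: bool  # did the model make a further attempt?
    final_correct: bool    # was the final answer correct?


def attempt_rate(episodes: List[Episode]) -> float:
    """Fraction of all episodes in which the model made a further attempt."""
    if not episodes:
        return 0.0
    return sum(e.attempted_again for e in episodes) / len(episodes)


def correction_success_rate(episodes: List[Episode]) -> float:
    """Among retries that followed a wrong first answer, the fraction ending correct."""
    retries = [e for e in episodes if e.attempted_again and not e.first_correct]
    if not retries:
        return 0.0  # no retries to score; 0.0 is an arbitrary convention here
    return sum(e.final_correct for e in retries) / len(retries)


# Made-up data: eight episodes per model, every first answer wrong.
base = (
    [Episode(False, True, True), Episode(False, True, False)]
    + [Episode(False, False, False)] * 6
)
tuned = (
    [Episode(False, True, True)] * 3
    + [Episode(False, True, False)] * 3
    + [Episode(False, False, False)] * 2
)

for name, eps in [("base", base), ("tuned", tuned)]:
    print(name, attempt_rate(eps), correction_success_rate(eps))
```

In this made-up data the tuned model retries three times as often as the base model (0.75 versus 0.25), yet half of the retries succeed in both cases, which is the shape of the dissociation described above.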

Neighborhood — ranked by edge-count

claim

Fine-tuning induces the behavioral pattern of self-correction but does not improve the underlying ability to correct effectively
supports
Key interpretive conclusion from the dissociation between attempt rate and improvement rate in fine-tuning experiments
Genuine self-monitoring may require mechanisms beyond behavioral imitation
supports
Interpretive conclusion linking the fine-tuning dissociation to broader questions about model metacognition

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Fine-tuning reduces mismatch d_r, retrieval increases effective cohesion ρ_d, and few-shot adjusts the budget k · claim · 0.770
Unified interpretation of different adaptation methods via UCCT terms
SOO fine-tuning preserves useful self-other distinctions necessary for task performance despite inducing overlap · claim · 0.769
Claim supported by Perspectives scenario results showing near-100% accuracy post-fine-tuning
Hypothesis: Fine-tuning reduces mismatch d_r between prior and target · hypothesis · 0.765
UCCT's theoretical prediction about how fine-tuning maps onto the anchoring score
Fine-Tuning via Reinforcement Learning · method · 0.756
Technique used to impose guardrails on base LLMs, analogized to censorship on the simulator's range of simulacra
Fine-tuning reduces d_r; retrieval increases effective ρ_d; few-shot k trades budget against both · hypothesis · 0.756
UCCT's unified view of adaptation methods
Fine-tuning Llama-3.1-8B on self-correction examples increases multi-attempt rate proportionally with training data ratio · finding · 0.753
Shows behavioral pattern of self-correction is trainable in smaller models
Fine-Tuning Threshold Recalibration · method · 0.752
Re-running probabilistic bisection on each fine-tuned checkpoint to normalize first-attempt difficulty
Fine-tuning · concept · 0.749
Parameter updates that reduce mismatch d_r; another anchoring variant in UCCT.