finding
active
finding:npi-licensing-mechanism-in-pythia-1b-emerges-in-discrete-stages-steps-1000-2000-3000-not-gradually
NPI licensing mechanism in pythia-1b emerges in discrete stages (steps 1000, 2000, 3000), not gradually
Training dynamics finding showing abrupt rather than gradual emergence of NPI mechanism
Source paper
extracted_from · (2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts
Neighborhood — ranked by edge-count
Claims (1)
claim
- Main mechanistic finding from case studies; evidence from training checkpoint analysis of pythia-1b
Findings (1)
finding
- Training dynamics finding showing filler-gap takes longer to learn than NPI licensing
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Mechanistic finding from CausalGym case study showing multi-step information movement in NPI mechanism
- Mechanistic finding from CausalGym case study showing complex multi-step movement for filler-gap
- Robustness check across seeds showing occasional failures of alignment map training
- Baseline accuracy showing small models fail on harder NPI licensing tasks
- Attributed to model anisotropy from saturation making hidden states harder to access
- Mechanistic interpretation of training dynamics in case studies
- DAS consistently finds the most causally-efficacious features across all pythia model sizes in CausalGym · finding · 0.716 · Main benchmark result showing DAS superiority over probing, diff-in-means, PCA, k-means, LDA, and random
- Training progression result showing non-linear maps are uncorrelated with genuine task learning
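
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This is not the page's actual implementation: the function name `similar_untyped`, the entity keys, the toy embedding vectors, and the representation of typed edges as a set of pairs are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_untyped(target, embeddings, typed_edges, threshold=0.65):
    """Hypothetical helper: entities whose cosine similarity to `target`
    is at least `threshold` and that share no typed edge with it,
    sorted by similarity in descending order."""
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked by a typed relation, in either direction.
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy data (assumed, not taken from the page).
embeddings = {
    "npi_stages": [1.0, 0.0, 0.0],
    "npi_multistep": [0.9, 0.1, 0.0],
    "filler_gap_slower": [0.8, 0.6, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}
# "filler_gap_slower" already has a typed edge, so it is excluded
# even though its similarity clears the threshold.
typed_edges = {("npi_stages", "filler_gap_slower")}

neighbors = similar_untyped("npi_stages", embeddings, typed_edges)
for name, score in neighbors:
    print(f"{name}\t{score:.3f}")
```

Filtering out typed edges before thresholding mirrors the stated purpose of the list: it surfaces only entities that are close in embedding space but not yet connected, which is what makes them candidates for new edges or duplicate detection.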