finding
active
finding:npi-mechanism-in-pythia-1b-moves-negation-feature-through-complementiser-that-auxiliary-verb-and-main-verb-across-layers-before-predicting-npi-any

NPI mechanism in pythia-1b moves the negation feature through the complementiser 'that', the auxiliary verb, and the main verb across layers before predicting the NPI 'any'

Mechanistic finding from the CausalGym case study: the negative polarity item (NPI) mechanism moves information in multiple steps, across token positions and layers.

Source paper

extracted_from
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
(2024) · Aryaman Arora · Dan Jurafsky · Christopher Potts
