causal masking

Attention restricted to previous tokens only, as in decoder-only models; leads to AR(ω)-like behaviour and no ordered phase
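A minimal sketch of how such a mask can be applied, in NumPy. The function name `causal_attention`, the single-head layout, and the choice to let each token attend to itself as well as to earlier tokens (the usual convention in decoder-only transformers) are assumptions made for illustration; they are not taken from the entry above.

```python
import numpy as np


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: arrays of shape (seq_len, d).
    Assumption: position i may attend to positions j <= i
    (itself and earlier tokens), never to later ones.
    Returns (outputs, attention_weights).
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)

    # True strictly above the diagonal marks future positions to hide.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key axis; masked entries become 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(5, 4)) for _ in range(3))
out, weights = causal_attention(q, k, v)
```

With this mask the output at a given position depends only on keys and values at that position or earlier, so changing a later token leaves earlier outputs unchanged.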

Neighborhood — ranked by edge-count

concept

context window · implements
Finite number of previous tokens used by autoregressive models to predict the next token; defines interaction range

finding

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

causally-masked attention · method · 0.847
Attention mechanism with causal mask limiting each token's view to previous tokens; used in decoder-only transformers
Causal Mediation · concept · 0.829
Whether an internal direction causally controls a target behavior, verified by intervention success
Causal Attention Mask · method · 0.826
Modification to transformer restricting keys and values to previous time-steps only, mimicking how an agent accumulates experiences.
Causal Tracing · concept · 0.793
Mechanistic interpretability technique for locating factual associations; mentioned as a direction for future work.
Causal Scrubbing · method · 0.788
Method by Chan et al. (2022) for rigorously testing interpretability hypotheses via interventions.
Causal abstraction · concept · 0.773
A framework the paper uses alongside feature geometry to deepen mechanistic understanding of LMs
causal bypassing · concept · 0.768
Confound where naming injected concepts reflects direct logit effects rather than metacognitive awareness, raised by Morris & Plunkett
Causal Mechanism · concept · 0.765
Function determining the value of a variable based on its causal parents in an acyclic causal model.