claim
active
claim:what-appears-to-be-a-representation-of-lexical-entailment-in-bert-is-actually-a-data-structure-of-two-word-identity-representations-not-an-encoding-of-the-entailment-relation
What appears to be a representation of lexical entailment in BERT is actually a data structure of two word identity representations, not an encoding of the entailment relation
Key asymmetry between the hierarchical equality and NLI experiments: in the NLI setting, BERT stores the two word identities rather than the abstract entailment relation.
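The distinction can be illustrated with a toy interchange intervention. The sketch below is not the paper's code or its MoNLI model: the lexicon, the function names, and the example word pairs are all hypothetical, chosen only to show that "one relation variable" and "two word-identity variables" predict different outputs when part of the representation is swapped in from a second input.

```python
# Toy hypernym lexicon (hypothetical, for illustration only).
HYPERNYMS = {"dog": {"animal"}, "oak": {"tree"}}


def relation(wp: str, wh: str) -> str:
    """Return 'entails' if wh equals wp or is a listed hypernym of wp, else 'neutral'."""
    return "entails" if wp == wh or wh in HYPERNYMS.get(wp, set()) else "neutral"


def swap_relation(base: tuple, source: tuple) -> str:
    """Hypothesis A: a single abstract relation variable is stored.

    Swapping that variable in from the source input replaces the whole
    relation, so the output follows the source pair.
    """
    return relation(*source)


def swap_premise_identity(base: tuple, source: tuple) -> str:
    """Hypothesis B: two word-identity variables are stored.

    Swapping only the premise-word identity keeps the base hypothesis word,
    and the relation is recomputed from the mixed pair.
    """
    return relation(source[0], base[1])


base = ("dog", "tree")       # relation: neutral
source = ("oak", "animal")   # relation: neutral

# Under hypothesis A the counterfactual output stays neutral; under
# hypothesis B the mixed pair ("oak", "tree") yields entails.
print(swap_relation(base, source))
print(swap_premise_identity(base, source))
```

Because the two hypotheses disagree on inputs like these, interchange intervention accuracy against each high-level model can separate them, which is the sense in which the claim contrasts a data structure of identities with an encoding of the relation.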
Source paper
extracted_from · (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Lexical entailment representation decomposes into word identity sub-representations with ~0.97-0.98 IIA (Lexeme Subspace of Lexical Entailment) · associated_with · supports · In contrast to hierarchical equality, lexical entailment in BERT decomposes into representations of word identities, not a single abstract relation.
Claims (2)
claim
- DAS reveals that the neural network encodes abstract relational structure rather than raw input identities.
- Motivated by the finding that lexical entailment decomposes into word identities.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Research question leading to the key NLI finding about word identity data structures.
- The semantic relation between words wp and wh (entails/neutral) used as an intermediate variable in the MoNLI high-level model.
- Localization result from patching experiments; identifies group (b) hidden states as the locus of truth representations.
- The underlying truth representation may generalize across lexical choices and languages · hypothesis · 0.751 · Suggested by non-English Yes/No outputs post-intervention, requiring further investigation.
- Open theoretical problem CIMC acknowledges: precisely characterizing the representational format of perception.
- Defines embedment as a spatially hierarchical technique.
- Antra's earlier definitive statement of the tricameral model.