question

active

question:can-the-distributed-representation-of-lexical-entailment-be-decomposed-into-representations-of-the-individual-word-identities

Can the distributed representation of lexical entailment be decomposed into representations of the individual word identities?

Research question behind the key natural language inference (NLI) finding that BERT's apparent lexical entailment representation is a data structure of word identity representations.

Source paper

extracted_from

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

(2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1

Neighborhood — ranked by edge-count

Findings (1)

finding

The lexical entailment representation decomposes into word identity sub-representations, with an interchange intervention accuracy (IIA) of ~0.97-0.98 (Lexeme Subspace of Lexical Entailment)
answered_by
In contrast to hierarchical equality, lexical entailment in BERT decomposes into representations of word identities, not a single abstract relation.
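
For readers unfamiliar with the IIA figure quoted in the finding above, the following is a minimal sketch of how such a score can be computed, assuming IIA is the fraction of interchange interventions for which the network's output matches the counterfactual output predicted by the high-level causal model. The function name, the label strings, and the toy data are hypothetical and are not taken from the source paper's code.

```python
from typing import Sequence


def interchange_intervention_accuracy(
    model_outputs: Sequence[str],
    causal_outputs: Sequence[str],
) -> float:
    """Fraction of interventions where the network agrees with the causal model.

    model_outputs:  network predictions after each interchange intervention.
    causal_outputs: counterfactual labels the high-level causal model predicts
                    for the same interventions.
    """
    if len(model_outputs) != len(causal_outputs):
        raise ValueError("both sequences must have the same length")
    if len(model_outputs) == 0:
        raise ValueError("at least one intervention is required")
    matches = sum(m == c for m, c in zip(model_outputs, causal_outputs))
    return matches / len(model_outputs)


# Hypothetical toy data: four interventions, three of which agree.
model_outputs = ["entailment", "neutral", "entailment", "neutral"]
causal_outputs = ["entailment", "neutral", "entailment", "entailment"]
iia = interchange_intervention_accuracy(model_outputs, causal_outputs)
```

Under this reading, a score of ~0.97-0.98 would mean the network's post-intervention behavior agrees with the word-identity causal model on nearly all tested interventions.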

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

What appears to be a representation of lexical entailment in BERT is actually a data structure of two word identity representations, not an encoding of the entailment relation · claim · 0.834
Key asymmetry between hierarchical equality and NLI experiments; BERT stores identities rather than the abstract relation.
The discovery of perfect abstract equality representations that cannot be decomposed into entity representations is a foundational result informing our understanding of how symbolic and connectionist architectures coexist · claim · 0.752
Concluding claim about theoretical significance of the hierarchical equality finding.
Can truth representations be disambiguated from closely related features such as 'commonly believed' or 'verifiable' using simple factual statements? · question · 0.746
Acknowledged limitation: simple uncontroversial statements cannot distinguish truth from related epistemic features
Organised structure of relationships between parts, sufficient to exhibit information integration and collective action, can be produced via fully-distributed unsupervised learning. · claim · 0.744
Central claim from connectionist models: complex coordination emerges without centralized control or external teacher.
The underlying truth representation may generalize across lexical choices and languages · hypothesis · 0.744
Suggested by non-English Yes/No outputs post-intervention, requiring further investigation
Towards Monosemanticity: Decomposing Language Models with Dictionary Learning (Bricken et al., 2023) · concept · 0.742
Foundational SAE mechanistic interpretability paper
Distributed representation · concept · 0.730
Idea that information is spread across many neurons; superposition is a subtype.
Laws for free from semantic homomorphism. · claim · 0.729
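
The "related by similarity" listing above can be read as a filter over entity embeddings: keep entities whose cosine similarity to this one is at least 0.65 and that have no typed edge to it, then rank by score. The sketch below illustrates that reading only. The function names, the entity ids, and the two-dimensional vectors are hypothetical, and how the embeddings are actually produced is not stated in the source.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(
    query_id: str,
    embeddings: Dict[str, Sequence[float]],
    typed_neighbors: Set[str],
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """Entities at or above the threshold that share no typed edge with the query."""
    query_vec = embeddings[query_id]
    hits = []
    for entity_id, vec in embeddings.items():
        if entity_id == query_id or entity_id in typed_neighbors:
            continue  # skip the entity itself and anything already linked
        score = cosine(query_vec, vec)
        if score >= threshold:
            hits.append((entity_id, score))
    # Highest similarity first, as in the listing.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (2-D for readability).
embeddings = {
    "question": [1.0, 0.0],
    "finding": [1.0, 0.1],   # similar, but already linked via answered_by
    "claim_a": [0.8, 0.6],   # similar enough and not linked
    "claim_b": [0.0, 1.0],   # orthogonal, below the threshold
}
related = related_by_similarity("question", embeddings, typed_neighbors={"finding"})
```

In this toy setup only `claim_a` survives the filter: `finding` is excluded because it already has a typed edge, and `claim_b` falls below the threshold.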