method
active
method:mean-squared-error-between-self-and-other-activations

Mean Squared Error between self and other activations

An implementation of the SOO loss that takes the mean squared error (MSE) between the self_attn.o_proj outputs for self and other inputs at a specified layer.
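The computation described above can be sketched in a few lines. This is a minimal, framework-free illustration, not the original implementation. It assumes the self_attn.o_proj outputs for the self and other inputs have already been captured as same-shaped [token][hidden_dim] matrices. The name soo_mse and the toy values are hypothetical.

```python
from typing import Sequence

# Assumed layout: one row per token, one column per hidden unit.
Activations = Sequence[Sequence[float]]


def soo_mse(self_acts: Activations, other_acts: Activations) -> float:
    """Mean squared error over all elements of two same-shaped activation matrices.

    Hypothetical helper: stands in for the MSE between o_proj outputs
    recorded on a self-referencing and an other-referencing input.
    """
    if len(self_acts) != len(other_acts):
        raise ValueError("token counts differ")
    total = 0.0
    count = 0
    for row_self, row_other in zip(self_acts, other_acts):
        if len(row_self) != len(row_other):
            raise ValueError("hidden sizes differ")
        for a, b in zip(row_self, row_other):
            total += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty activations")
    return total / count


# Toy activations (made-up numbers): 2 tokens, hidden size 2.
self_acts = [[1.0, 2.0], [3.0, 4.0]]
other_acts = [[1.0, 0.0], [3.0, 2.0]]
loss = soo_mse(self_acts, other_acts)
print(loss)  # 2.0
```

In a real fine-tuning setup the two matrices would be tensors recorded at the chosen layer's self_attn.o_proj during two forward passes, and the loss would be computed with the framework's own differentiable MSE so that it can be minimized by gradient descent.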

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • The attention output projection layer where SOO Loss is computed; maps multi-head attention outputs to the hidden dimension

Methods (1)

method
  • A loss function measuring the dissimilarity of latent model representations of self and other, minimized during fine-tuning

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.