method
active
method:adaptive-beta-softmax-scaling

Adaptive Beta Softmax Scaling

Implementation detail: the softmax inverse temperature (beta) is scaled by log(n_memories), so that attention weights on relevant memories are not diluted as the memory set grows.
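The effect can be illustrated with a minimal NumPy sketch. It assumes the simplest form of the idea, beta = beta0 * log(n_memories); the exact formula used by the source is not given here, and the names `adaptive_beta_attention`, `match_weight`, and `beta0` are hypothetical. In the toy setup one memory matches the query (score 1) and all others score 0.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def adaptive_beta_attention(scores, beta0=1.0):
    """Softmax with inverse temperature scaled by log(n_memories).

    Assumption: beta = beta0 * log(n). The floor of 2 avoids beta = 0
    when only a single memory is present (log(1) = 0).
    """
    n = scores.shape[0]
    beta = beta0 * np.log(max(n, 2))
    return softmax(beta * scores)


def match_weight(n, adaptive):
    """Attention weight on the single matching memory among n memories."""
    scores = np.zeros(n)
    scores[0] = 1.0  # one matching memory, the rest are non-matching
    weights = adaptive_beta_attention(scores) if adaptive else softmax(scores)
    return weights[0]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, match_weight(n, adaptive=False), match_weight(n, adaptive=True))
```

With a fixed beta of 1 the matching memory's weight is e / (e + n - 1), which shrinks toward zero as n grows. With the log(n) scaling and beta0 = 1 it becomes n / (2n - 1), which stays near 0.5 regardless of memory count.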

Neighborhood — ranked by edge-count

Frameworks (1)

framework
  • The transformer version directly analogous to TEM, introduced in this paper, offering dramatic performance improvements.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.