finding
active
TEM-t requires many fewer data samples than TEM to reach equivalent performance (sample efficiency improvement)
Empirical performance comparison showing TEM-t is a more efficient learner than the original TEM.
Source paper
extracted_from · (2021) · James C. R. Whittington · Joseph W. Warren · Timothy E.J. Behrens
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- TEM-Transformer (TEM-t) · supports · The transformer version directly analogous to TEM, introduced in this paper, offering dramatic performance improvements.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Empirical computational efficiency result comparing TEM-t to the original TEM implementation.
- TEM memory retrieval is mathematically equivalent to transformer self-attention without softmax · claim · 0.784 · Central theoretical claim: a single step of TEM attractor dynamics equals dot-product attention, making TEM a special case of a transformer.
- Theoretical claim linking the TEM-t architecture to the Teyler-Rudy hippocampal indexing theory.
- Selective pressure toward convergence via task generality
- Universalist claim predicting cross-cultural generality.
- The model tends to reflect more when the question is difficult, and accuracy is generally lower for harder questions · hypothesis · 0.737 · Hypothesis explaining negative correlation between reflection rate and accuracy without implying reflection is harmful
- Second falsifiable prediction linking objective function structure to valence profile
- Implication of PRH for 'scale is all you need' argument