artifact
active
artifact:von-oswald-et-al-2023-transformers-learn-in-context-by-gradient-descent
von Oswald et al. (2023) Transformers learn in-context by gradient descent
Optimization-as-inference view of in-context learning (ICL).
Neighborhood — ranked by edge-count
Thinkers (1)
thinker
- Johannes von Oswald authored Transformers learn in-context by gradient descent.
Artifacts (1)
artifact
- Main paper presenting UCCT and semantic anchoring framework.