artifact
active
artifact:von-oswald-et-al-2023-transformers-learn-in-context-by-gradient-descent

von Oswald et al. (2023) Transformers learn in-context by gradient descent

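Optimization-as-inference view of in-context learning (ICL): a Transformer's forward pass over the in-context examples can act like gradient-descent steps on a loss defined by those examples.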
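A minimal sketch of the optimization-as-inference idea, assuming the simplest setting: in-context linear regression, squared loss, weights initialized at zero, and a single attention layer with no softmax. Under those assumptions, one gradient-descent step gives the same query prediction as a linear-attention readout whose keys are the inputs and whose values are the targets. The function names, the learning rate `lr`, and the toy data are hypothetical choices for illustration, not the authors' code.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def gd_one_step_prediction(xs, ys, x_query, lr):
    """Predict for x_query after one GD step from w = 0 on
    L(w) = 1/(2n) * sum_i (w . x_i - y_i)^2."""
    n, d = len(xs), len(x_query)
    w = [0.0] * d
    # Gradient of the squared loss at the current weights.
    grad = [
        sum((dot(w, x) - y) * x[j] for x, y in zip(xs, ys)) / n
        for j in range(d)
    ]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return dot(w, x_query)


def linear_attention_prediction(xs, ys, x_query, lr):
    """Softmax-free attention readout: keys x_i, values y_i, query x_query,
    scaled by lr / n (an assumed scaling chosen to match the GD step)."""
    n = len(xs)
    return lr / n * sum(y * dot(x, x_query) for x, y in zip(xs, ys))


if __name__ == "__main__":
    random.seed(0)
    d, n = 3, 8
    w_true = [random.gauss(0, 1) for _ in range(d)]
    xs = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    ys = [dot(w_true, x) for x in xs]
    x_query = [random.gauss(0, 1) for _ in range(d)]

    print(gd_one_step_prediction(xs, ys, x_query, lr=0.1))
    print(linear_attention_prediction(xs, ys, x_query, lr=0.1))
```

The two printed values should agree up to floating-point error, because the update from zero weights is `lr / n * sum_i y_i * x_i`, and taking its inner product with the query is exactly the attention sum.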

Neighborhood (ranked by edge count)

Thinkers (1)

thinker

Artifacts (1)

artifact