thinker

active

thinker:john-schulman

John Schulman

Cited as a co-author of "Scaling Laws for Reward Model Overoptimization" (Gao, Schulman, and Hilton, 2022).

Authored

0

Introduces

1

Studies

0

Affiliations

0

Cited by

0

Originates (1)

method

Proximal Policy Optimization

Recent mentions (3)

papers-typed
kim-2026-active-inference.md
papers-typed
greenblatt-2024-alignment.md
papers-typed
yuntao-2022-cat-s.md