Watkins and Dayan 1992

Original Q-learning paper, cited for the learning algorithm used in all agents.

Neighborhood — ranked by edge-count

paper

method

Q-learning
cites
Model-free RL algorithm used in experimental comparison; employs ε-greedy exploration.
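For readers unfamiliar with the method, the following is a minimal sketch of tabular Q-learning with ε-greedy exploration. It is not the citing paper's implementation: the chain environment, the function names, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random among the greedy actions.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def chain_step(s, a, n_states):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.

    Reaching the last state ends the episode with reward 1.
    """
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done


def train(n_states=5, episodes=500, epsilon=0.1, seed=0):
    """Learn a Q-table for the toy chain; all hyperparameters are illustrative."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = epsilon_greedy(q[s], epsilon, rng)
            s_next, r, done = chain_step(s, a, n_states)
            q_update(q, s, a, r, s_next, done)
            s = s_next
    return q
```

The update is model-free in the sense that it uses only sampled transitions `(s, a, r, s_next)` and never the environment's transition function directly.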

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
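The selection rule described above (cosine similarity at or above 0.65, no typed edge to this entity) can be sketched as follows. The data layout is an assumption: `embeddings` maps entity names to vectors and `typed_edges` is a set of name pairs, neither of which is specified by this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs above the threshold with no typed edge to target.

    embeddings: dict of name -> vector (hypothetical layout).
    typed_edges: set of (name, name) tuples, treated as undirected here.
    """
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    # Highest similarity first, matching the ordering of the list on this page.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```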

Dubey, Griffiths, and Dayan 2022 · concept · 0.749
Source of the happiness function f[h] that this paper extends with pain-belief
Mahajan, Dayan, and Seymour 2025 · concept · 0.719
Casts pain and injury as POMDPs; direct precursor to this paper's approach
Zimmermann et al. (2021) · concept · 0.680
Showed contrastive learning inverts the data generating process; supports claim that contrastive learners recover statistics of underlying world
Yamins et al. (2014) · concept · 0.664
Showed that performance-optimized neural networks align with biological brain representations in higher visual cortex
Baker et al. 2017 · concept · 0.658
Reference for ToM modeled through partially observable inference of others' beliefs
Aghion, Jones & Jones (2017) · concept · 0.655
Cited for analysis of AI and economic growth relevant to Malthusian dynamics of digital minds