finding
active
finding:baseline-llm-condition-in-ipd-replicates-prior-findings-agents-cooperate-selectively-only-when-opponent-consistently-cooperates
Baseline LLM condition in IPD replicates prior findings: agents cooperate selectively only when opponent consistently cooperates
Replication of Fontana et al. 2025 findings in the paper's own Experiment 2 baseline condition
Source paper
extracted_from · (2025) · Ruben Laukkonen · Fionn Inglis · Shamil Chandaria · Lars Sandved-Smith +4
Neighborhood — ranked by edge-count
Concepts (1)
concept
- The primary source paper proposing four contemplative principles for AI alignment and piloting them empirically
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- All prompting techniques led to full cooperation against Always Cooperate opponents in IPD · finding · 0.815 · Ceiling finding in IPD experiment; baseline sufficient when opponent always cooperates
- Abstract sentence summarising performance and failures.
- LLMs exhibit systematic errors that deterministic logic avoids.
- Deceptive RL baseline agents have lower mean neural self-other overlap than honest baseline agents · claim · 0.778 · Core empirical prediction tested in RL experiments, confirmed by 100% classification accuracy
- Skeptical prior work motivating the need to validate self-reports against internal states rather than taking them at face value
- Central interpretive claim of the paper
- The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
- Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness