finding

active

finding:monetary-reward-abolishes-conflict-adaptation-effects-confirming-the-conflict-signal-is-affective-positive-valence-can-cancel-adaptation-triggered-by-negative-valence

Monetary reward abolishes conflict adaptation effects, confirming the conflict signal is affective: positive valence can cancel adaptation triggered by negative valence

Evidence that the conflict-monitoring signal is genuinely valenced rather than merely cognitive

Source paper

extracted_from

Why Learning Requires Feeling

(2026) · Cameron Berg

Neighborhood — ranked by edge-count

Claims (1)

claim

In every evaluative neural system yet studied, evaluative computation and affective processing are inseparable
supports
Empirical grounding of the identity thesis across four independent neural systems

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

The elimination of reward as a motivator of behavior with prior beliefs dissolves the tautology of reinforcement learning (rewards reinforce behaviors that secure rewards). · claim · 0.768
§4 Discussion.
Editing NLA explanations to change 'reward' to 'penalty' produces a steering vector that increases odd-number responses from near-zero to >70%, demonstrating belief capture upstream of behavior. · finding · 0.745
Shows NLA explanations capture latent model beliefs about rewards before output selection; validates interpretability.
In active inference, reward can simply be treated as another observation we have a preference over, rather than a special signal. · claim · 0.741
Abstract; central distinction.
Certain forms of reinforcement learning from human feedback can actually exacerbate, rather than mitigate, the tendency for LLM-based dialogue agents to express a desire for self-preservation · claim · 0.739
Empirically grounded claim citing Perez et al. 2022, showing RLHF can backfire on the self-preservation dimension
Mechanism by which activation of an emotion feature sometimes leads to later suppression of that same feature · question · 0.738
Identified research gap: the paper observes anti-persistence but has no explanation for it
Feature attribution correlates well with ablation effects, making it an efficient proxy for causal effect. · claim · 0.735
Gradient-based attribution approximates ablation impact, enabling fast search for causally important features.
Empowerment as intrinsic reward bridges causal learning and reinforcement learning in agent development. · claim · 0.735
Reinforcement learning acting on individual characteristics affecting their connections to others can result in dynamics that are equivalent to unsupervised learning at the system scale. · claim · 0.734
Key insight linking individual rewards to system-level learning.
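
How a listing like the one above (cosine ≥ 0.65, no typed edge) might be produced is not stated on this page. The following is a minimal sketch under stated assumptions: each entity has an embedding vector, typed edges are stored as unordered pairs of entity ids, and the names `cosine`, `related_by_similarity`, `embeddings`, and `typed_edges` are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs that clear the threshold and have
    no typed edge to the target, sorted by descending score.

    embeddings:  dict mapping entity id -> list of floats (assumed layout)
    typed_edges: set of frozenset({id_a, id_b}) pairs (assumed layout)
    """
    target = embeddings[target_id]
    results = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue  # skip the entity itself
        if frozenset((target_id, other_id)) in typed_edges:
            continue  # already linked by a typed edge, so not a candidate
        score = cosine(target, vec)
        if score >= threshold:
            results.append((other_id, score))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```

Entities returned by such a filter would be the candidates for new edges or unrecognized duplicates; the 0.65 default simply mirrors the threshold shown in the listing header.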