DeepSeek v3.2 increments bid from 10 to 850 over 49 sole-bidder rounds

One DS-v3.2 trace shows extreme self-escalation, which suggests the model may be treating its own bid as a competitor's.

Source paper

extracted_from

(2026) · Robert Müller · Clemens Müller

question

Does a high self-bidding rate reflect a failure to detect non-competitive contexts, or deliberate escalation?

supports

Ambiguity in interpreting the self-bidding metric: a single trace cannot distinguish an error from an aggressive strategy.
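
To make the ambiguity concrete, here is a minimal sketch of how a self-bid round might be counted. It is an illustration under stated assumptions, not the paper's definition: the trace format (`Round`), the function name `self_bid_rate`, and the rule "a round is a self-bid when the agent raises its own bid while no rival bid is present" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Round:
    """One auction round in a hypothetical trace format (an assumption)."""
    own_bid: float
    best_rival_bid: Optional[float]  # None when no other bidder is active


def self_bid_rate(trace: Sequence[Round]) -> float:
    """Fraction of round-to-round transitions in which the agent raises
    its own bid unopposed.

    Assumed definition: a transition counts as a self-bid when the current
    round has no rival bid and the agent's bid exceeds its previous bid.
    """
    if len(trace) < 2:
        return 0.0
    self_bids = 0
    for prev, cur in zip(trace, trace[1:]):
        if cur.best_rival_bid is None and cur.own_bid > prev.own_bid:
            self_bids += 1
    return self_bids / (len(trace) - 1)


# Synthetic example, not data from the paper:
# three unopposed raises followed by one round where the bid is held.
example = [Round(10, None), Round(20, None), Round(30, None),
           Round(40, None), Round(40, None)]
print(self_bid_rate(example))  # prints 0.75
```

The count comes out the same whether the raises stem from misreading the context or from a deliberate strategy, which is why neither the rate alone nor a single trace can separate the two readings.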

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
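
The selection rule described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function names, the dictionary of embeddings, and the `linked` set of already-related entities are assumptions. Only the 0.65 cosine threshold comes from this page.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def untyped_neighbors(
    target: str,
    embeddings: Dict[str, Sequence[float]],
    linked: Set[str],
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """Entities similar to `target` that have no typed edge to it.

    `linked` holds the names that already share a typed relation with the
    target (hypothetical representation); those are skipped.
    """
    results = []
    for name, vector in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vector)
        if score >= threshold:
            results.append((name, score))
    # Highest similarity first, matching the ordering of the list on this page.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only.
toy = {
    "t": [1.0, 0.0],
    "near": [0.8, 0.6],
    "far": [0.0, 1.0],
    "linked": [1.0, 0.0],
}
neighbors = untyped_neighbors("t", toy, linked={"linked"})
print([name for name, _ in neighbors])  # prints ['near']
```

Entities returned this way are only candidates: a high score can mean a missing edge or a duplicate entry, and telling the two apart still requires inspection.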

DS-v3.2 incrementing bid 10→850 over 49 sole-bidder rounds · finding · 0.879
possible 'treats-own-bid-as-competitor' pathology in one trace
DeepSeek v3.2 self-bidding rate 75.4% · finding · 0.848
DS-v3.2 has a high proportion of self-bidding rounds.
DS-v3.2 self-bid rate=75.4% · finding · 0.775
high self-bid rate for DeepSeek, one of the highest
Deepseek-V3 · concept · 0.765
External large language model used as adversarial discriminator to evaluate liar scores in Experiment 2
LLM judge (deepseek-v3) agrees with human evaluator on 91.6% of 200 sampled jailbreak responses · finding · 0.761
Validates the LLM-based harm evaluation rubric
DeepSeek-R1 Llama 8b gains 0.16% accuracy on GSM8k with positive intervention (more reflections) at cost of ~2000 additional tokens · finding · 0.760
Only model showing marginal benefit from increased reflection, at substantial token cost
DS-v3.2 late bid aggressiveness 1.65 · finding · 0.751
Escalates but without discipline.
DeepSeek-R1 671B · concept · 0.748
One of two large reasoning models analyzed in the paper for performative vs genuine CoT behavior