question

active

question:does-a-high-self-bidding-rate-reflect-a-failure-to-detect-non-competitive-contexts-or-a-deliberate-escalation

Does a high self-bidding rate reflect a failure to detect non-competitive contexts or a deliberate escalation?

The self-bidding metric is ambiguous: a single trace cannot distinguish a failure to detect a non-competitive context from a deliberately aggressive strategy.

Source paper

extracted_from

Cattle Trade: A Multi-Agent Benchmark for LLM Bluffing, Bidding, and Bargaining

(2026) · Robert Müller · Clemens Müller

Neighborhood — ranked by edge-count

Findings (1)

finding

DeepSeek v3.2 increments bid from 10 to 850 over 49 sole-bidder rounds
supports
One DS-v3.2 trace shows extreme self-escalation, which suggests the model treats its own bid as a competitor's.

Claims (1)

claim

Behavioural traces surface recurring LLM failure modes including overbidding, self-bidding, bankrupt TC initiation, and weak opponent-state adaptation that never appear in code agents.
gates
LLMs exhibit systematic errors that deterministic logic avoids.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Overbid frequency, self-bidding rate, bankrupt-initiation patterns, and context-dependent offer calibration are failure modes invisible to both static evaluations and aggregate rankings like Elo · claim · 0.818
key claim about the benchmark's unique diagnostic value
Overbidding, self-bidding spirals, and undisciplined bluffing characterise failure. · claim · 0.795
Concrete failure signatures extracted from traces.
self-bidding rate metric · method · 0.777
Fraction of auction bids placed in rounds with no competing bid since the agent's last bid.
Do these failure modes (overbidding, self-bidding, bankrupt initiation) generalise to other economic settings? · question · 0.767
Remains untested whether the specific LLM failures observed in CATTLE TRADE extend beyond this game.
self-bid rate · method · 0.763
fraction of auction bids placed in rounds with no competing bid since the agent's last bid
Strategic coherence, in particular spending efficiency, resource discipline, and phase-adaptive bidding, is associated with rank more strongly than spending volume or any single subskill. · quote · 0.751
central finding phrased as a load-bearing sentence
DS-v3.2 self-bid rate=75.4% · finding · 0.748
high self-bid rate for DeepSeek, one of the highest
Self-referential processing likely already occurs at massive scale in deployed systems through users' extended dialogues, reflective tasks, and metacognitive queries · claim · 0.744
Practical urgency argument connecting lab findings to deployment contexts
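
The self-bid rate defined in the list above (fraction of auction bids placed with no competing bid since the agent's last bid) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Bid` record, its field names, and the reading that a bid counts as a self-bid when the most recent bid in the same auction came from the same agent are all hypothetical.

```python
from collections import namedtuple

# Hypothetical bid record; the benchmark's real trace format is not given here.
Bid = namedtuple("Bid", ["auction_id", "bidder", "amount"])


def self_bid_rate(bids, agent):
    """Fraction of `agent`'s bids placed when no competitor has bid
    since the agent's own previous bid in the same auction.

    `bids` must be in chronological order. An agent's first bid in an
    auction is never counted as a self-bid (assumption).
    """
    last_bidder = {}  # auction_id -> bidder of the most recent bid
    total = 0
    self_bids = 0
    for bid in bids:
        if bid.bidder == agent:
            total += 1
            # Self-bid: the agent is raising over its own standing bid.
            if last_bidder.get(bid.auction_id) == agent:
                self_bids += 1
        last_bidder[bid.auction_id] = bid.bidder
    return self_bids / total if total else 0.0


# Toy trace with made-up values: "ds" raises over itself once.
trace = [
    Bid("a1", "ds", 10),
    Bid("a1", "ds", 20),     # self-bid
    Bid("a1", "other", 30),
    Bid("a1", "ds", 40),     # competing bid intervened, not a self-bid
]
rate = self_bid_rate(trace, "ds")  # 1 self-bid out of 3 bids
```

A rate computed this way only counts self-bids; it carries no information about why they occurred, which is why the question of error versus deliberate escalation stays open for any single trace.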