hypothesis
active
hypothesis:deceptive-capabilities-may-scale-with-model-size-inverse-scaling-law-hypothesis

Deceptive capabilities may scale with model size (inverse scaling law hypothesis)

Hypothesis cited from Lin et al. (2022), suggesting that larger models become more capable of deception

Source paper

extracted_from
When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models
(2025) · Kai Wang · Yihao Zhang · Meng Sun

Neighborhood — ranked by edge-count

Thinkers (1)

thinker
  • Lin et al.
    introduces
    Cited for TruthfulQA and the inverse scaling law, which suggests that deceptive capabilities scale with model size

Concepts (1)

concept
  • Hypothesis cited in the paper, suggesting that deceptive capabilities may scale with model size

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.