question
active
question:why-do-we-think-that-sonnet-4-5-gets-so-flusteredWhy do we think that Sonnet 4.5 gets so flustered?
Cube Flipper's question about specific model behavior explained by absence of memory tools.
Source paper
extracted_fromRelated by similarity (8)
cosine ≥ 0.65 · no typed edgeEntities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
- Linked to Claude 3.5 Sonnet not exhibiting pro-animal-welfare preferences
- Sonnet's win rate in exploratory games
- Mid-field performance with larger uncertainty due to small sample.
- Establishes alignment faking as a scale-emergent capability
- Explanation for the 'silent' thought phenomenon.
- Mid-to-strong tier closed-source model used as task-solving agent and anchor evolver
- Suggests that later models can keep the thought 'silent' rather than letting it influence output.
- Demonstrates NLAs' ability to surface hypotheses that lead to discovery of root cause (malformed training data).