Simplicity Bias

The tendency of deep networks to implicitly favor simpler solutions that fit the data, which drives representational convergence

Neighborhood — ranked by edge-count

concept

Representational Convergence
supports
The central empirical phenomenon: different neural networks trained on different data/objectives develop increasingly similar representations

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Simplicity Bias Hypothesis · hypothesis · 0.883
Deep networks are biased toward finding simple fits to data, and this bias increases with model size, driving convergence
Bias in language models · concept · 0.760
Features related to gender, racial, ethnic biases, slurs, and hate speech.
Inductive Bias · concept · 0.757
Assumptions or preferences (e.g., parsimony) that determine how a learning system generalizes beyond training data
Sociological Bias in AI Development · concept · 0.735
Researcher preferences and goals of mimicking human reasoning shape model development, potentially causing convergence toward human-like representations
Bias Amplification · concept · 0.728
Problem cited as a limitation of current LLMs; the Platonic Representation Hypothesis (PRH) predicts that larger models should amplify bias less
What is simple? · question · 0.722
The chapter's foundational question.
Pre-Encoder Bias · concept · 0.722
Architectural modification that subtracts a learned bias from autoencoder inputs before encoding; the bias is initialized to the geometric median of the dataset and improves autoencoder performance
Selectivity · method · 0.716
Adapted control-task metric measuring the difference between the odds ratio on the original task and on an arbitrary-label control task
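
The "cosine ≥ 0.65 · no typed edge" criterion used for the list above can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the function names `cosine` and `untyped_neighbors`, the dictionary of embeddings, and the representation of typed edges as unordered name pairs are all assumptions made for the example, and the vectors are toy values.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities whose embedding is close to
    `target` but which share no typed edge with it, highest score first.

    embeddings:  dict mapping entity name -> embedding vector (hypothetical)
    typed_edges: iterable of (name_a, name_b) pairs, treated as undirected
    """
    # Entities already linked to the target by some typed relation.
    linked = set()
    for a, b in typed_edges:
        if a == target:
            linked.add(b)
        elif b == target:
            linked.add(a)

    scored = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: the vectors below are invented for illustration only.
toy_embeddings = {
    "Simplicity Bias": [1.0, 0.0],
    "Simplicity Bias Hypothesis": [1.0, 0.1],
    "Representational Convergence": [1.0, 0.2],
    "Unrelated Entity": [0.0, 1.0],
}
toy_edges = [("Simplicity Bias", "Representational Convergence")]
candidates = untyped_neighbors("Simplicity Bias", toy_embeddings, toy_edges)
```

With this toy data, `Representational Convergence` is excluded because it already has a typed edge to the target, and `Unrelated Entity` is excluded because its similarity falls below the threshold, leaving only `Simplicity Bias Hypothesis` as a candidate.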