Anisotropy in Language Models

A property of smaller, saturated models that makes their hidden states harder to access via alignment maps.
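
A common way to quantify anisotropy is the mean cosine similarity between pairs of hidden-state vectors: values near 0 suggest directions are spread evenly, values near 1 suggest the vectors crowd into a narrow cone. The sketch below is an illustration under that assumption and is not necessarily the measure used by the source. The function name `mean_pairwise_cosine` and the synthetic arrays are hypothetical; real hidden states would come from a model.

```python
import numpy as np


def mean_pairwise_cosine(hidden_states: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows.

    hidden_states has shape (n_vectors, dim). This is one common
    anisotropy proxy, assumed here for illustration only.
    """
    norms = np.linalg.norm(hidden_states, axis=1, keepdims=True)
    unit = hidden_states / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sims = unit @ unit.T                                # all pairwise cosines
    n = unit.shape[0]
    off_diagonal_sum = sims.sum() - np.trace(sims)      # drop self-similarities
    return float(off_diagonal_sum / (n * (n - 1)))


# Synthetic stand-ins for hidden states (not real model activations).
rng = np.random.default_rng(0)
isotropic = rng.standard_normal((200, 64))
# A shared offset pushes every vector toward one common direction.
anisotropic = isotropic + 5.0

iso_score = mean_pairwise_cosine(isotropic)
aniso_score = mean_pairwise_cosine(anisotropic)
```

On this synthetic data the shifted vectors score far higher than the unshifted ones, which is the pattern the proxy is meant to detect.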

Neighborhood — ranked by edge-count

finding

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Language Models · concept · 0.795
Primary substrate for manifold steering experiments; demonstrates method on reasoning and in-context tasks.
Bias in language models · concept · 0.792
Features related to gender, racial, ethnic biases, slurs, and hate speech.
Language Model · concept · 0.790
Primary test domain for manifold steering, including reasoning and ICL tasks.
Autoregressive Language Modeling · concept · 0.778
Training objective that can be interpreted as optimizing a diverse set of tasks, and is therefore subject to multitask scaling convergence pressures.
Andreas 2022: Language models as agent models · concept · 0.743
Paper hypothesising that LLMs model agent beliefs, desires, and intentions, with preliminary GPT-3 evidence; cited as ref 2.
Can Large Language Models Genuinely Shift Human Perspective · question · 0.736
What happens mechanistically during cessation in language models? · question · 0.732
Follow-up on empirical grounding; answered 'no one looked yet'.
The better an LLM is at language modeling, the more it aligns with vision models, and vice versa: linear relationship between language modeling score and vision-language alignment · finding · 0.729
Core cross-modal empirical result: larger and better language models align better with vision models.
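
The selection rule for this list (cosine ≥ 0.65, no typed edge) could be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function `untyped_neighbors`, the dictionary of embeddings, the set of typed edges, and the toy vectors are all hypothetical.

```python
import numpy as np


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, cosine) pairs close to `target` that lack a typed edge.

    embeddings: dict mapping entity name -> 1-D vector (hypothetical store).
    typed_edges: set of (source, destination) name pairs (hypothetical store).
    """
    t = embeddings[target]
    t = t / np.linalg.norm(t)
    candidates = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked to the target in either direction.
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        sim = float(t @ (vec / np.linalg.norm(vec)))
        if sim >= threshold:
            candidates.append((name, sim))
    # Highest similarity first, matching the order of the list above.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Toy data for illustration only.
embeddings = {
    "A": np.array([1.0, 0.0]),
    "B": np.array([0.9, 0.1]),  # close to A, no typed edge -> candidate
    "C": np.array([0.8, 0.2]),  # close to A, but already has a typed edge
    "D": np.array([0.0, 1.0]),  # orthogonal to A -> below threshold
}
typed_edges = {("A", "C")}
candidates = untyped_neighbors("A", embeddings, typed_edges)
```

Each surviving candidate is either a missing edge or a possible duplicate. In the list above, the near-identical "Language Models" and "Language Model" entries are the kind of pair that would deserve a duplicate check.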