quote
active
quote:language-models-are-some-of-the-most-remarkable-computer-programs-in-existence
Language models are some of the most remarkable computer programs in existence.
Opening sentence setting the stage for the importance of interpretability.
Source paper
extracted_from
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Articulates why a one-layer transformer with MLP is the appropriate starting target for mechanistic interpretability
- Core empirical hypothesis of the paper, supported by successful VPD decomposition yielding ~10,000 interpretable subcomponents across 24 weight matrices.
- Language models implement algorithms humans have tried and failed to write by hand for decades · claim · 0.811 · Opening interpretive claim about the remarkable nature of language models.
- Demonstrated transformers on mathematical understanding and logic; cited to motivate transformer versatility.
- Paper's assessment of current LLM capabilities relative to Turing Test
- Primary test domain for manifold steering, including reasoning and ICL tasks
- Primary substrate for manifold steering experiments; demonstrates method on reasoning and in-context tasks.
- Key finding about the relationship between capability and introspection.
Cross-corpus bridges (2)
same_concept_as · Nomic cosine
External markdown files that talk about the same concept as this entity.
- aboutblank_kb · Large Language Models · concepts/ai/large-language-models.md · 0.787
- alexander · The Art, Science, and Engineering of Programming · papers/extracted/2022-04-30_Stefan-Lesser_prog22-master.pdf_978acd.md · 0.783