Pure Feature

A feature that responds to a single latent variable only, in contrast to polysemantic features.

Neighborhood — ranked by edge-count

concept

Polysemantic Neuron
contradicts
A neuron that responds to multiple unrelated inputs, posing a major challenge for circuit-level interpretation

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Pure Functional Programming · framework · 0.787
A paradigm relying on recursion equations without assignment; the Linda authors compare it on a DNA sequence similarity problem.
Pure unity · concept · 0.783
The ultimate condition of living structure where the whole becomes a single, undivided entity made of beings, all rooted in the same I.
Feature Sparsity · concept · 0.774
Property that features activate on only a small fraction of inputs; enables compressed sensing and is what allows superposition to work.
feature as application · concept · 0.759
Metaphor treating each system feature or function as a separate application that can be independently loaded and managed.
Feature Visualization · method · 0.758
Method of optimizing an input to cause a neuron to fire maximally, used to characterize what a neuron detects; establishes a causal link.
Action Features · concept · 0.757
Dual interpretation of features: in addition to responding to inputs, features also act to increase the probability of specific output tokens.
Feature Universality · concept · 0.753
Property of features that form consistently across different models trained on the same or similar data, suggesting features are real representational units.
Sparse Feature Dictionary · concept · 0.751
The extracted set of sparse interpretable features from model embeddings via SAEs.
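
The selection rule behind this listing (cosine ≥ 0.65, no typed edge) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cosine_similarity` and `untyped_neighbors`, the two-dimensional toy embeddings, and the `typed_edges` mapping are all hypothetical and are not taken from the system that produced this page.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs, highest score first, for entities whose
    cosine similarity to `target` is at least `threshold` and that have no
    typed edge to `target`.

    `embeddings` maps entity name -> vector (hypothetical toy data below).
    `typed_edges` maps entity name -> set of names it has a typed relation to.
    """
    linked = typed_edges.get(target, set())
    candidates = []
    for name, vector in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already typed
        score = cosine_similarity(embeddings[target], vector)
        if score >= threshold:
            candidates.append((name, score))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Toy 2-D embeddings, invented for illustration only.
embeddings = {
    "Pure Feature": [1.0, 0.0],
    "Polysemantic Neuron": [0.9, 0.1],    # similar, but already has a typed edge
    "Feature Sparsity": [0.8, 0.6],       # cosine 0.8, passes the threshold
    "Feature Visualization": [0.6, 0.8],  # cosine 0.6, below the threshold
}
typed_edges = {"Pure Feature": {"Polysemantic Neuron"}}

neighbors = untyped_neighbors("Pure Feature", embeddings, typed_edges)
```

In this toy setup only "Feature Sparsity" survives both filters. The typed-edge check runs before the similarity check, so an entity with an existing relation such as "contradicts" is never reported as a candidate, however close its embedding is.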