hypothesis
active
hypothesis:linear-representation-hypothesis-neural-networks-represent-meaningful-concepts-as-directions-in-their-activation-spaces

Linear representation hypothesis: neural networks represent meaningful concepts as directions in their activation spaces.

Foundation for interpreting features as linear directions.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.