method
active
method:concept-erasure
Concept Erasure
Interpretability method, backed by the linear representation hypothesis, for removing concept information.
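Under the linear representation hypothesis, a concept is assumed to correspond to a direction in activation space, so one simple way to erase it is to project that direction out. The following is a minimal sketch under stated assumptions, not a definitive implementation: the direction is estimated as a difference of class means, the vectors are toy 2-D data, and the names `concept_direction` and `erase` are hypothetical rather than taken from this entry.

```python
import math

def concept_direction(pos, neg):
    """Unit vector along the difference of class means (an assumed, simple estimator)."""
    dim = len(pos[0])
    mean_pos = [sum(v[i] for v in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(v[i] for v in neg) / len(neg) for i in range(dim)]
    diff = [a - b for a, b in zip(mean_pos, mean_neg)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("class means coincide; no direction to erase")
    return [d / norm for d in diff]

def erase(vec, direction):
    """Remove the component of vec that lies along a unit-length direction."""
    coeff = sum(x * d for x, d in zip(vec, direction))
    return [x - coeff * d for x, d in zip(vec, direction)]

# Toy data (assumption): the concept is carried by the first coordinate.
pos = [[2.0, 1.0], [3.0, -1.0]]
neg = [[-2.0, 1.0], [-3.0, -1.0]]

direction = concept_direction(pos, neg)
erased = [erase(v, direction) for v in pos + neg]
```

After erasure, every vector has zero component along the estimated direction, while the orthogonal coordinate is left unchanged. Projecting out a single direction is only the simplest case; it does not by itself guarantee that the concept cannot be recovered by other means.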
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Central entity of Jackson's framework: a structure invented to give a coherent account of the immediate consequences of actions; the building block of software design.
- An increment of functionality consciously introduced by a designer to serve a purpose; the building block of a system's design.
- How a neural network encodes a semantic concept internally, argued to be better captured by manifolds than by atomic features.
- Jackson's operational definition of a software concept.
- A reusable concept pattern applicable across contexts, with known purpose and operational principle.
- Probabilistic framework formalizing concept-specific subspaces for targeted steering in generative models.