method
active
method:concept-erasure

Concept Erasure

Interpretability method, backed by the linear representation hypothesis, for removing concept information from model representations
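
As a rough illustration of the idea, and not necessarily the specific procedure this entry refers to: if a concept is encoded along a single linear direction, one simple way to erase it is to estimate that direction and project it out of every representation. The sketch below uses the difference of class means as the direction estimate. The function name `erase_direction` and the binary `labels` input are assumptions made for the example.

```python
import numpy as np

def erase_direction(X, labels):
    """Project a mean-difference concept direction out of each row of X.

    X      : (n_samples, n_features) array of representations.
    labels : (n_samples,) boolean array, True where the concept is present.
    Returns the erased representations and the unit direction removed.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    # Assumption: the concept lies along one direction, estimated here as
    # the difference between the two class means.
    direction = X[labels].mean(axis=0) - X[~labels].mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0:
        # Class means already coincide; nothing to remove.
        return X.copy(), direction

    u = direction / norm
    # Subtract each row's component along u (orthogonal projection).
    X_erased = X - np.outer(X @ u, u)
    return X_erased, u
```

After this projection the two class means coincide and every row has zero component along `u`. It removes only that one direction, so a concept spread across several directions, or encoded nonlinearly, would not be fully erased by this sketch.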

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.