concept
active
concept:surgical-ablation-property
Surgical Ablation Property
Property requiring that ablating a truth direction shifts the model's output from truthful to false without causing other side effects.
Neighborhood — ranked by edge-count
Methods (1)
method
- Directional Ablation (associated_with): Intervention method that removes a direction from residual stream activations to disrupt the corresponding behavior.
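As a rough illustration of the method entry above, directional ablation is commonly described as projecting a direction out of an activation vector: x' = x - (x · r̂) r̂, where r̂ is the unit-normalized direction. The sketch below applies this to a toy 3-dimensional vector in pure Python. The vector values and the helper names (`normalize`, `dot`, `ablate_direction`) are hypothetical and not taken from any particular library or paper.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [c / norm for c in v]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`."""
    unit = normalize(direction)
    coeff = dot(activation, unit)  # how much of the direction is present
    return [a - coeff * u for a, u in zip(activation, unit)]

# Toy "residual stream" vector and a hypothetical direction (assumed values).
activation = [2.0, -1.0, 0.5]
direction = [1.0, 1.0, 0.0]

ablated = ablate_direction(activation, direction)

# After ablation the vector has (numerically) no component along `direction`,
# while components along orthogonal directions are left unchanged.
print("along direction:", dot(ablated, direction))
print("along orthogonal axis:", dot(ablated, [0.0, 0.0, 1.0]))
```

Note that this only shows the activation-space half of the picture: components orthogonal to the ablated direction are preserved. Whether the intervention is "surgical" in the sense of this entry depends on the model's downstream outputs, which a projection alone cannot establish.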
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Classical techniques to interrogate regulative capacity of embryos and neural crest by tissue removal or transplantation.
- Intervention type that sets activations to zero, used for interpretability analysis
- Technique used in VPD to enforce mechanistic faithfulness of parameter decompositions.
- An algorithm that determines the marginal effect of n-th order path terms by running the model multiple times with frozen attention patterns and progressively replacing activations
- Clamping a feature's value to zero to measure its causal effect on model output.
- Gradient-based attribution approximates ablation impact, enabling fast search for causally important features.
- A process that heals the world by generating living structure, synonymous with living process.
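Several of the neighbors listed above describe zero-ablation (clamping a feature to zero) and the use of gradient-based attribution as a fast approximation of its effect. The sketch below compares the two on a hypothetical toy model, the tanh of a weighted sum of features. The model, the weights, and the function names are all assumptions made for illustration. The gradient estimate is the first-order Taylor term, gradient times (0 - f_i).

```python
import math

def model_output(features, weights):
    """Hypothetical toy model: tanh of a weighted sum of feature activations."""
    return math.tanh(sum(w * f for w, f in zip(weights, features)))

def zero_ablation_effect(features, weights, index):
    """Exact change in output when feature `index` is clamped to zero."""
    ablated = list(features)
    ablated[index] = 0.0
    return model_output(ablated, weights) - model_output(features, weights)

def gradient_estimate(features, weights, index):
    """First-order estimate of the same change: gradient times (0 - f_i)."""
    z = sum(w * f for w, f in zip(weights, features))
    grad = (1.0 - math.tanh(z) ** 2) * weights[index]  # d tanh(z) / d f_i
    return grad * (0.0 - features[index])

# Assumed toy values, chosen only for illustration.
features = [0.8, 0.1, -0.3]
weights = [1.0, 0.5, 2.0]

for i in range(len(features)):
    exact = zero_ablation_effect(features, weights, i)
    approx = gradient_estimate(features, weights, i)
    print(f"feature {i}: exact={exact:+.4f}  estimate={approx:+.4f}")
```

The exact effect needs one extra forward pass per feature, whereas the estimate reuses a single gradient for all of them. The estimate is closest for features with small activations and drifts as the nonlinearity matters more, so it is better suited to ranking candidates than to replacing the ablation itself.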