method
active
method:boundless-das
Boundless DAS
A variant of Distributed Alignment Search (DAS), introduced by Wu et al. (2023) and implemented in pyvene via BoundlessRotatedSpaceIntervention.
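As a rough illustration of the idea behind this kind of intervention, the sketch below rotates a base and a source representation into a shared basis, swaps in the source values for the coordinates that fall under a soft boundary, and rotates the result back. It does not use pyvene's API. All function names (`rotation_2d`, `soft_mask`, `boundless_interchange`) are hypothetical, the rotation is a fixed 2-D matrix rather than a learned one, and the one-sided sigmoid mask is a simplifying assumption, not the library's exact formulation.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rotation_2d(theta):
    # Orthogonal 2x2 rotation matrix; a stand-in for a learned rotation.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def soft_mask(dim, boundary, temperature):
    # Assumed simplification: coordinate i (centred at i + 0.5) gets a weight
    # near 1 when it lies below boundary * dim and near 0 when it lies above.
    # A lower temperature makes the mask closer to a hard cut-off.
    upper = boundary * dim
    return [sigmoid((upper - (i + 0.5)) / temperature) for i in range(dim)]


def boundless_interchange(base, source, rot, boundary, temperature):
    # 1. Express both representations in the rotated basis (R^T h).
    rot_t = transpose(rot)
    rotated_base = matvec(rot_t, base)
    rotated_source = matvec(rot_t, source)
    # 2. Blend: masked coordinates come from the source, the rest from the base.
    mask = soft_mask(len(base), boundary, temperature)
    mixed = [
        (1.0 - m) * b + m * s
        for m, b, s in zip(mask, rotated_base, rotated_source)
    ]
    # 3. Rotate back into the original basis (R h).
    return matvec(rot, mixed)


if __name__ == "__main__":
    base = [1.0, 2.0]
    source = [10.0, 20.0]
    print(boundless_interchange(base, source, rotation_2d(math.pi / 6), 0.5, 0.1))
```

In an actual training setup the rotation and the boundary would be parameters optimized by gradient descent, which is why the mask is kept differentiable here instead of using a hard index cut-off.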
Neighborhood — ranked by edge-count
Papers (1)
paper
Thinkers (1)
thinker
- Zhengxuan Wu (studies)
Methods (1)
method
- Distributed Alignment Search (extends): The core method introduced in this paper. It finds alignments between high-level causal variables and distributed neural representations via gradient descent.
Artifacts (1)
artifact
- pyvene open-source Python library (implements): The main artifact introduced in the paper, an open-source PyPI library for customizable interventions on PyTorch models.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Fourth contemplative principle; universal orientation toward reducing suffering motivating AI benevolence
- Empirical demonstration that DAS interventions produce divergent representations
- Extension of DAS that learns a second rotation matrix on top of a fixed first one to decompose representations into sub-representations.
- Practical method by Geiger et al. for finding distributed causal abstractions using gradient descent
- The ontological commitment to a self as a separate bounded entity, which is relinquished in emptiness realisation.
- Second central claim of the paper.
- Methods to bypass model safety training; features may activate during jailbreaks.
- The empty continuum, the substrate from which living forms descend via structure-preserving transformations; also called the Void.