method
active
method:selectivity
Selectivity
Adapted control-task metric: the difference between a method's odds-ratio on the original task and its odds-ratio on a control task with arbitrary labels
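The metric described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the function names are hypothetical, and the exact form of the odds-ratio (here, the log-odds of a "source" label against a "base" label, after versus before an intervention) is an assumption, since the entry itself only states that selectivity is a difference of odds-ratios.

```python
import math


def log_odds_ratio(p_orig_base, p_orig_src, p_int_base, p_int_src):
    """Assumed form of the odds-ratio, in log space.

    p_orig_*: probabilities of the base/source label before intervention.
    p_int_*:  probabilities of the base/source label after intervention.
    Returns how far the intervention shifts the log-odds toward the source label.
    """
    before = math.log(p_orig_src) - math.log(p_orig_base)
    after = math.log(p_int_src) - math.log(p_int_base)
    return after - before


def selectivity(task_odds_ratio, control_odds_ratio):
    """Odds-ratio on the original task minus odds-ratio on the control task."""
    return task_odds_ratio - control_odds_ratio
```

Under this reading, a method that scores almost as well on the arbitrary-label control task as on the original task has selectivity near zero, which suggests its score reflects its own expressivity rather than structure in the model.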
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- CausalGym (uses): Multi-task benchmark of linguistic behaviours for measuring the causal efficacy of interpretability methods, adapted from SyntaxGym
Methods (1)
method
- Adaptation of Hewitt and Liang control tasks to CausalGym: next-token labels replaced with arbitrary tokens to measure method expressivity
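One way to picture the label replacement described in the item above is a fixed mapping from each original next-token label to an arbitrary token. The sketch below is only an assumption about how such a control task might be built: the helper name `make_control_labels`, the seeded sampling, and the requirement that replacement tokens be distinct are all hypothetical and are not taken from CausalGym.

```python
import random


def make_control_labels(labels, vocab, seed=0):
    """Replace each distinct label with an arbitrary token from `vocab`.

    The mapping is consistent (the same original label always gets the same
    arbitrary token) and injective (distinct labels get distinct tokens).
    Both properties are assumptions made for this sketch.
    """
    unique = sorted(set(labels))
    if len(vocab) < len(unique):
        raise ValueError("vocab must contain at least one token per distinct label")
    rng = random.Random(seed)  # seeded so the control task is reproducible
    mapping = dict(zip(unique, rng.sample(vocab, k=len(unique))))
    return [mapping[label] for label in labels], mapping


# Hypothetical usage: singular/plural verb labels mapped to unrelated tokens.
control, mapping = make_control_labels(
    ["is", "are", "is"], ["lamp", "seven", "quickly", "blue"]
)
```

Because the new labels carry no linguistic signal, any odds-ratio a method achieves on them reflects what the method can fit by itself, which is the quantity the control task is meant to expose.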
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Comparing models using log-evidence approximated by free energy.
- The directness of motivation by practical concerns, characteristic of living processes in the examples.
- Choice of policies minimizing expected free energy to realize preferred future states.
- The capacity of materials and techniques to allow fine-tuning of dimensions and shape to each unique building condition; identified as the biggest issue in achieving living architecture.
- Requirement that answers to questions be responsive as well as truthful; requires knowing that questioner will know the answer after receiving it.
- Formal notion of what constitutes an individual agent; bridges Buddhist and information-theoretic perspectives.
- Mechanism that selects information from modules for representation in the global workspace.
- The stage in the genetic algorithm where top 10% embryos by phenotypic fitness are selected.