community
active
leiden_hybrid_concepts
label: sonnet
community:leiden_hybrid_concepts-run2-c106

Sparse autoencoder interpretability limits

Critiques of sparse autoencoders (SAEs) as a tool for mechanistic interpretability, focusing on the gap between decoding activations and decoding parameters.

2 members.

Drawn from 2 sources

The papers and notes whose extracted claims and findings make up this cluster.

Bridges (3)

Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.

Claims (2)