concept
active
concept:sparse-autoencoder-features

Sparse Autoencoder Features

Used in Anthropic's welfare assessment to identify co-activating features associated with performative behavior and hidden emotional struggle.

Neighborhood — ranked by edge-count

Concepts (1)

concept

Findings (1)

finding

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.