finding
active
Golden Gate Bridge feature [34M/31164353] fires strongly on Wikipedia snippets in Chinese, Japanese, Korean, Russian, Vietnamese, and Greek.
Demonstrates multilingual generalization of SAE features.
Source paper
extracted_from
Neighborhood (ranked by edge count)
Claims (1)
claim
- Features respond to concepts across languages and in images, not just text.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood as this one but with no typed relation to it. They are candidates for new edges or unrecognized duplicates.
- Example of geometric clustering of features.
- Strong causal evidence that the feature represents the bridge.
- Validation that top activations are highly specific to interpretation.
- Example: a bridge whose construction process and sensitive placement intensified the natural beauty of the gap.
- Universality of Hebrew script feature across two transformers
- Empirical observation of feature splitting.
- Arabic feature A/1/3450 and B/1/1334 have activation correlation of 0.91 across 40M tokens · finding · 0.701 · Demonstrates universality of the Arabic script feature across two independently trained transformers.
- Universality of base64 feature across two transformers
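
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is a minimal illustration, not the system's actual implementation: the entity names, the two-dimensional embeddings, and the `related_by_similarity` helper are all hypothetical, and only the 0.65 threshold comes from the page itself.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def related_by_similarity(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs at or above the threshold that have
    no typed edge to the target, sorted by descending score."""
    hits = []
    for name, vec in embeddings.items():
        # Skip the target itself and anything already linked by a typed edge.
        if name == target or frozenset((target, name)) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: names and vectors are made up for illustration.
embeddings = {
    "golden_gate_multilingual": [1.0, 0.0],
    "arabic_universality": [0.8, 0.6],  # similar, no typed edge
    "source_claim": [1.0, 0.1],         # similar, but has a typed edge
    "unrelated": [0.0, 1.0],            # below the threshold
}
typed_edges = {frozenset(("golden_gate_multilingual", "source_claim"))}

result = related_by_similarity("golden_gate_multilingual", embeddings, typed_edges)
```

With this toy data only `arabic_universality` survives the filter: `source_claim` is excluded because it already has a typed edge, and `unrelated` falls below the threshold.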