finding
active
finding:golden-gate-bridge-feature-neighborhood-includes-alcatraz-presidio-lake-tahoe-yosemite-decoder-cosine-similarity-maps-onto-semantic-relatedness
Golden Gate Bridge feature neighborhood includes Alcatraz, Presidio, Lake Tahoe, Yosemite; decoder cosine similarity maps onto semantic relatedness.
Example of geometric clustering of features.
Source paper
extracted_from
Neighborhood, ranked by edge-count
Claims (1)
claim
- Decoder cosine similarity maps onto concept similarity.
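As a minimal sketch of what this claim means operationally, a feature's neighborhood can be read off by ranking other features by the cosine similarity of their decoder vectors. The vectors below are made-up toy values, not real SAE weights, and `cosine`, `nearest_features`, and the `unrelated_feature` label are hypothetical names introduced only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_features(decoder, query, k=3):
    """Return the k feature labels whose decoder vectors are most similar to `query`."""
    scores = [(cosine(decoder[query], vec), name)
              for name, vec in decoder.items() if name != query]
    scores.sort(reverse=True)  # highest cosine similarity first
    return [name for _, name in scores[:k]]

# Toy 3-d decoder directions (assumed values for illustration, not real SAE weights).
toy_decoder = {
    "Golden Gate Bridge": [1.0, 0.1, 0.0],
    "Alcatraz": [0.9, 0.3, 0.0],
    "Presidio": [0.8, 0.4, 0.1],
    "unrelated_feature": [0.0, 0.1, 1.0],
}

neighbors = nearest_features(toy_decoder, "Golden Gate Bridge", k=2)
print(neighbors)
```

With these toy vectors, the two nearest neighbors of the Golden Gate Bridge direction are the Alcatraz and Presidio directions, while the unrelated direction scores near zero.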
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates multilingual generalization of SAE features.
- Validation that top activations are highly specific to interpretation.
- Example: a bridge whose construction process and sensitive placement intensified the natural beauty of the gap.
- Strong causal evidence that the feature represents the bridge.
- Identifying related features by cosine distance in SAE decoder space.
- Hypothesized intermediate feature explaining the antipodal alignment between cities and neg_cities in early-middle layers.
- Empirical observation of feature splitting.
- Marriott hotel case.
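The selection rule for this list (cosine ≥ 0.65 and no typed edge) could be sketched roughly as follows. The entity ids, similarity scores, and the `untyped_neighbors` helper are hypothetical; how the page actually computes its similarities is not stated here.

```python
def untyped_neighbors(similarities, typed_neighbors, threshold=0.65):
    """Return entity ids at or above the threshold that have no typed edge, most similar first."""
    hits = [(score, entity_id) for entity_id, score in similarities.items()
            if score >= threshold and entity_id not in typed_neighbors]
    hits.sort(reverse=True)
    return [entity_id for _, entity_id in hits]

# Assumed toy data: similarity of other entities to this finding.
toy_similarities = {
    "finding:a": 0.81,
    "finding:b": 0.66,
    "claim:c": 0.90,    # already linked by a typed edge, so excluded
    "finding:d": 0.40,  # below the threshold, so excluded
}
toy_typed = {"claim:c"}

candidates = untyped_neighbors(toy_similarities, toy_typed)
print(candidates)
```

Entities returned this way are only candidates: each still needs review to decide whether it warrants a new typed edge or is a duplicate.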