finding
active
2D projections of activations show clearly separable clusters for F0-F2 and A1 at layer 25, but increasingly entangled activations for F4-F5 and A2-A3.
Visual geometric evidence for the fundamental entanglement of true/false activations in harder tasks.
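As a rough illustration of what such a 2D projection measures, the sketch below projects synthetic "activations" onto their top two principal components and scores how far apart the two class centroids sit relative to the within-class spread. Everything here is an assumption made for illustration: the source does not state which projection method was used (PCA is swapped in as a common choice), the data are random Gaussians rather than real layer-25 activations, and the helper names `project_2d`, `centroid_separation`, and `synthetic_activations` are hypothetical.

```python
import numpy as np

def project_2d(acts):
    """Project rows of `acts` onto their top two principal components (PCA via SVD)."""
    centered = acts - acts.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def centroid_separation(points, labels):
    """Distance between the two class centroids divided by the mean within-class spread."""
    a, b = points[labels == 0], points[labels == 1]
    gap = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    spread = 0.5 * (a.std(axis=0).mean() + b.std(axis=0).mean())
    return gap / spread

def synthetic_activations(rng, n, dim, offset):
    """Hypothetical stand-in for activations: two Gaussian classes offset along one axis."""
    labels = np.repeat([0, 1], n)
    acts = rng.normal(size=(2 * n, dim))
    acts[labels == 1, 0] += offset  # larger offset = more separable classes
    return acts, labels

rng = np.random.default_rng(0)
# "Easy" task: true/false classes far apart; "hard" task: classes nearly overlapping.
easy_acts, easy_labels = synthetic_activations(rng, 200, 64, offset=8.0)
hard_acts, hard_labels = synthetic_activations(rng, 200, 64, offset=0.5)

easy_score = centroid_separation(project_2d(easy_acts), easy_labels)
hard_score = centroid_separation(project_2d(hard_acts), hard_labels)
print(f"easy separation: {easy_score:.2f}, hard separation: {hard_score:.2f}")
```

Under these assumptions the "easy" score comes out well above the "hard" one, which is the numeric counterpart of clusters that look separable versus entangled in a scatter plot. A low score in 2D does not by itself prove the classes are inseparable in the full activation space, since the top two components may simply miss the discriminative direction.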
Source paper
extracted_from · (2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
Neighborhood — ranked by edge-count
Claims (1)
claim
- Establishes task difficulty as a hard limit that instructions cannot overcome.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates that early-layer probes capture sentence polarity rather than truth.
- Structural finding showing modular organization within the sparse neuron set
- Striking mechanistic finding that injection creates universally detectable perturbation in residual stream immediately downstream
- Interpretive claim attributing representational pattern to internal model state during threat-based deception
- Explains why time and sequence are essential for generated complexity.
- Connects this study's results to Schrimpf et al. 2021 and Caucheteux et al. 2022/2023 findings on brain-LLM alignment.
- Single dendritic layer solves XOR-like problems with capacity matching 8-layer deep networks. · finding · 0.754 · Evidence from Beniaguev et al. (2021) that individual biological neurons vastly outperform the McCulloch-Pitts model; supports hybrid computation claim.
- Quantitative verification of the mechanistic theory; both circuits required for the induction algorithm show the predicted copying/matching structure