framework
active
framework:solu-activation-function

SoLU Activation Function

Earlier Anthropic approach to increasing neuron monosemanticity through activation-function design; it was found to make some neurons more interpretable at the cost of making others less so.
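
For orientation, SoLU (softmax linear unit) is commonly written as x * softmax(x), taken across the neurons of an MLP layer. The following is a minimal NumPy sketch under that definition, not a reference implementation: the function name `solu`, the choice of the last axis as the neuron axis, and the sample values are assumptions made for illustration, and any normalization applied after the activation in practice is left out.

```python
import numpy as np

def solu(x, axis=-1):
    """Sketch of SoLU: elementwise x * softmax(x) along `axis` (assumed neuron axis)."""
    x = np.asarray(x, dtype=float)
    # Subtract the per-row max before exponentiating for numerical stability;
    # this leaves the softmax value unchanged.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    softmax = exp / exp.sum(axis=axis, keepdims=True)
    return x * softmax

# Hypothetical pre-activations for one token across four neurons.
pre_acts = np.array([4.0, 1.0, 0.5, -2.0])
post_acts = solu(pre_acts)
```

Because the softmax weights sum to one, the largest pre-activation keeps most of its magnitude while the smaller ones are scaled toward zero, which is the sparsifying pressure the description refers to.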

Neighborhood — ranked by edge-count

Claims (1)

claim

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • The nonlinear activation function used in MLP layers; prevents the linearization approach used for attention layers from extending to MLP layers
  • Activations · concept · 0.727
    Internal representations of the model on which probes operate; the method uses activations to rank datapoints.
  • Token-level analysis of OTD and backtracking latent activations aligned at correction points across episodes
  • Intervention method that adds a learned direction vector to residual stream activations to steer model behavior
  • Component of the contrastive retrieval pipeline analyzing activation statistics.
  • Activation Probing · concept · 0.687
    Technique of reading out model beliefs from internal activations before the final answer token is generated
  • Latent model activations when processing inputs framed from another agent's perspective
  • The conventional approach (e.g., SAEs, transcoders) of decomposing activations into interpretable features.