concept
active
concept:scaling-laws-for-activation-steering-with-llama-2-models-and-refusal-mechanisms-ali-et-al-2025

Scaling Laws for Activation Steering with Llama 2 Models and Refusal Mechanisms (Ali et al., 2025)

Related work finding that larger models are more resistant to activation steering, potentially consistent with ESR in 70B.
