dataset
active
dataset:deepseek-r1-distill-qwen-1-5b

DeepSeek-R1-Distill-Qwen-1.5B

Small model used in the attention-head attribution analysis in the appendix.

Neighborhood — ranked by edge-count

Papers (1)

paper