artifact
active
artifact:concept-probe-python-library-github-com-mneuronico-concept-probe

concept-probe Python library (github.com/mneuronico/concept-probe)

Open-source Python library released alongside the paper. It supports probe training, multi-probe scoring, activation steering, and logit extraction.
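For orientation, three of the capabilities named above (probe training, multi-probe scoring, activation steering) can be illustrated with a generic NumPy sketch on synthetic activations. This does not use or reproduce the concept-probe API: the function names `train_probe`, `score`, and `steer` are hypothetical, the difference-of-means probe is an assumed stand-in for whatever training method the library uses, and the data are random vectors rather than real model activations. Logit extraction is not covered.

```python
import numpy as np

def train_probe(pos_acts, neg_acts):
    """Hypothetical helper: difference-of-means probe, returned as a unit vector."""
    direction = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)

def score(acts, probes):
    """Project activations (n, d) onto each row of a probe matrix (k, d) -> (n, k)."""
    return acts @ probes.T

def steer(acts, direction, alpha):
    """Shift every activation by alpha along a unit direction."""
    return acts + alpha * direction

# Synthetic stand-in for hidden states: two classes separated along axis 0.
rng = np.random.default_rng(0)
d = 16
true_dir = np.zeros(d)
true_dir[0] = 1.0
pos = rng.normal(size=(200, d)) + 2.0 * true_dir
neg = rng.normal(size=(200, d)) - 2.0 * true_dir

probe = train_probe(pos, neg)
probes = probe[None, :]  # a (1, d) matrix; stack more rows to score several concepts

base_scores = score(neg, probes)
steered_scores = score(steer(neg, probe, alpha=3.0), probes)
```

Because the probe is a unit vector, steering by `alpha` raises the projection score by `alpha`. In this linear toy setting, that is the simplest version of confirming a probe-defined state by intervening on it.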

Neighborhood — ranked by edge-count

Frameworks (1)

framework
  • The paper's central contribution: treating LLM numeric self-report as a quantitative signal, validated against probe-defined internal states and causally confirmed via steering