method
active
method:logistic-regression-probe

Logistic Regression Probe

Standard linear probing technique; compared to mass-mean probing for classification accuracy and causal implication
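
As a rough illustration of what a logistic regression probe does, the sketch below fits a linear classifier on activation vectors with plain gradient descent. The synthetic activations, the function names (`train_logistic_probe`, `probe_accuracy`), and the hyperparameters are illustrative assumptions, not taken from the source.

```python
import numpy as np

def train_logistic_probe(acts, labels, lr=0.1, steps=500, l2=1e-3):
    """Fit p(y=1 | x) = sigmoid(w.x + b) by gradient descent on the log loss."""
    n, d = acts.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(acts @ w + b)))
        err = probs - labels                      # gradient of log loss w.r.t. logits
        w -= lr * (acts.T @ err / n + l2 * w)     # small L2 penalty (an assumption)
        b -= lr * err.mean()
    return w, b

def probe_accuracy(acts, labels, w, b):
    """Fraction of examples whose predicted class matches the label."""
    return float(((acts @ w + b > 0) == (labels == 1)).mean())

# Synthetic stand-in for model activations: the two classes are shifted
# apart along coordinate 0 (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
d = 16
labels = rng.integers(0, 2, size=400)
acts = rng.normal(size=(400, d))
acts[:, 0] += 3.0 * (2 * labels - 1)

w, b = train_logistic_probe(acts, labels.astype(float))
acc = probe_accuracy(acts, labels, w, b)
print(f"train accuracy: {acc:.3f}")
```

The learned weight vector `w` is the probe direction; in practice one would evaluate it on held-out activations rather than on the training set.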

Neighborhood — ranked by edge-count

Frameworks (1)

framework
  • Introduced in this paper: an optimization-free probing technique using difference-in-means direction with optional covariance correction
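
The description above can be sketched as follows: take the difference of class means as the probe direction, and optionally multiply by the inverse of the within-class covariance. Reading "covariance correction" this way, classifying against the midpoint of the two class means, and all names and data below are assumptions made for illustration.

```python
import numpy as np

def mass_mean_direction(acts, labels, covariance_correction=False):
    """Difference-in-means direction; no optimization loop is needed."""
    is_pos = labels == 1
    pos_mean = acts[is_pos].mean(axis=0)
    neg_mean = acts[~is_pos].mean(axis=0)
    theta = pos_mean - neg_mean
    if covariance_correction:
        # Assumed form of the correction: inverse of the pooled
        # within-class covariance applied to the mean difference.
        centered = np.where(is_pos[:, None], acts - pos_mean, acts - neg_mean)
        cov = centered.T @ centered / len(centered)
        theta = np.linalg.pinv(cov) @ theta
    midpoint = 0.5 * (pos_mean + neg_mean)   # assumed decision threshold
    return theta, midpoint

def mass_mean_predict(acts, theta, midpoint):
    """Classify by the sign of the projection onto theta, relative to the midpoint."""
    return ((acts - midpoint) @ theta > 0).astype(int)

# Hypothetical activations: classes separated along coordinate 0.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)
acts = rng.normal(size=(500, 8))
acts[:, 0] += 2.0 * (2 * labels - 1)

theta_plain, mid = mass_mean_direction(acts, labels)
theta_corr, _ = mass_mean_direction(acts, labels, covariance_correction=True)
acc_plain = float((mass_mean_predict(acts, theta_plain, mid) == labels).mean())
acc_corrected = float((mass_mean_predict(acts, theta_corr, mid) == labels).mean())
print(acc_plain, acc_corrected)
```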

Concepts (1)

concept
  • The direction logistic regression converges to on linearly separable data; shown to be suboptimal for identifying truth direction
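
To make this concrete, the toy example below runs unregularized gradient descent on the logistic loss for a small, linearly separable 2-D dataset. It assumes the direction described here is the maximum-margin separator (the entry's name is not shown on this page), and the dataset is constructed by hand so that this direction is `[1, 0]` while the difference-in-means direction is `[1, 1]`.

```python
import numpy as np

# Hand-built separable data (no bias term). The hard-margin solution is
# w = (0.5, 0): it satisfies both constraints y_i * w.x_i >= 1 with minimal norm.
X = np.array([[2.0, 0.0], [3.0, 5.0], [-2.0, 0.0], [-3.0, -5.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def gd_logistic(X, y, steps, lr=0.1):
    """Unregularized gradient descent on mean log(1 + exp(-y * w.x))."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)
        # d/dw of the loss: -mean(y * x * sigmoid(-margin))
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

max_margin = np.array([1.0, 0.0])
mean_diff = X[y > 0].mean(axis=0) - X[y < 0].mean(axis=0)

w_short = gd_logistic(X, y, 200)
w_long = gd_logistic(X, y, 20000)

print("norm after 200 / 20000 steps:", np.linalg.norm(w_short), np.linalg.norm(w_long))
print("cosine to max-margin after 200 / 20000 steps:",
      cosine(w_short, max_margin), cosine(w_long, max_margin))
print("cosine between mean difference and max-margin:", cosine(mean_diff, max_margin))
```

With more steps the weight norm keeps growing and the direction drifts toward the maximum-margin separator, which here differs from the difference-in-means direction. The example only shows that the two directions can disagree; it says nothing about which one better tracks a feature in real activations.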

Methods (1)

method

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Ridge regression fit on top-256 PCs of Gemini embeddings to predict model layer-40 activations and compute residuals
  • Method used to predict model activations from Gemini embeddings and compute residuals for probe construction
  • Linear classifier approach applied to model activations to identify which training datapoints caused undesired behaviors in post-training.
  • Probes · concept · 0.753
    Interpretability tools that decode information from internal model activations; here, linear probes are used for data attribution.
  • Sigmoid fit linking S to success probability.
  • Probe method combining causal interventions and structural analysis, supported by pyvene's activation collection
  • Fitting a logistic function to success probability as a function of S or shot count to estimate midpoints and widths.
  • Linear Probe · method · 0.744
    Simple linear classifiers trained on model activations used as the probing technique within the introduced method.