finding

active

finding:interest-probe-peak-cohen-s-d-1-67-layer-14-p-9-45-10-6-in-llama-3-2-3b

Interest probe: peak Cohen's d=1.67 (layer 14), p=9.45×10⁻⁶ in LLaMA-3.2-3B

Probe validation result confirming interest direction captures meaningful structure
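For readers unfamiliar with the statistic, the following is a minimal sketch of how a per-layer Cohen's d and its peak layer could be computed from probe scores. It assumes each example has already been projected onto the probe direction at every layer; the function names `cohens_d` and `peak_layer`, the array layout, and the pooled-variance form of d are illustrative assumptions, not the source paper's code.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d between two samples, using the pooled sample variance (ddof=1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def peak_layer(pos_scores, neg_scores):
    """Return (layer index, d) for the layer with the largest |d|.

    pos_scores, neg_scores: array-likes of shape (n_layers, n_examples) holding
    hypothetical probe projections for the two sides of a contrastive pair
    (for example "interested" vs. "bored").
    """
    d_per_layer = np.array([cohens_d(p, n) for p, n in zip(pos_scores, neg_scores)])
    best = int(np.argmax(np.abs(d_per_layer)))
    return best, float(d_per_layer[best])


# Toy data only; these numbers are not taken from the paper.
pos = [[2.0, 4.0, 6.0], [11.0, 13.0, 15.0]]
neg = [[1.0, 3.0, 5.0], [1.0, 3.0, 5.0]]
layer, d = peak_layer(pos, neg)
```

On this toy input the second layer (index 1) separates the two groups far more cleanly than the first, so it is the one reported as the peak. The accompanying p-value would come from a separate significance test, which this sketch does not cover.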

Source paper

extracted_from

(2026) · Nicolas Martorell · Bianchi, Bruno

concept

Focus probe (distracted vs. focused) · supports
One of four emotive concept probes trained; contrastive pair distracted/focused, with best layer 10 in LLaMA-3.2-3B
Interest probe (bored vs. interested) · supports
One of four emotive concept probes trained; contrastive pair bored/interested, with best layer 14 in LLaMA-3.2-3B

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Impulsivity probe: peak Cohen's d=3.60 (layer 13), p=3.58×10⁻¹³ in LLaMA-3.2-3B · finding · 0.898
Strongest probe validation result; highest Cohen's d among the four concepts
Wellbeing probe: peak Cohen's d=3.34 (layer 16), p=7.21×10⁻¹³ in LLaMA-3.2-3B · finding · 0.885
Probe validation result confirming wellbeing direction captures meaningful structure
Gemma 3 4B wellbeing probe: peak Cohen's d=1.8 · finding · 0.815
Weaker cross-family probe; explains weaker introspection in Gemma
Qwen 2.5 7B wellbeing probe: peak Cohen's d=3.5 · finding · 0.807
Strongest cross-family probe; explains clearer introspection in Qwen than Gemma
LLaMA E3 geometry summary: S_max = −1.896 ± 0.211, AUS_N = −2.119 ± 0.198, peak layer ℓ* = 10 [IQR 0.384] · finding · 0.798
Seed-pooled geometry statistics for LLaMA in E3, providing quantitative basis for geometry-to-behavior correlate
LLaMA-3.1-8B: S_max = −1.896 ± 0.211, AUS_N = −2.119 ± 0.198, peak layer ℓ* = 10 (median) · finding · 0.797
Seed-pooled geometry-only statistics (per-dev z units).
Interest probe score drifts positively across turns: LMM slope=0.005, p=4.12×10⁻¹⁴ in LLaMA-3.2-3B · finding · 0.789
Demonstrates genuine internal-state dynamics in LLMs during multi-turn conversation
MM probe trained on likely dataset achieves NIE of 0.70 (false→true) on LLaMA-2-13B, surprisingly strong but weaker than truth probes · finding · 0.784
Likely-trained MM probe is a surprisingly effective causal baseline due to correlation between truth and probability on sp_en_trans