Wellbeing probe (sad vs. happy)

One of four emotive concept probes trained; contrastive pair sad/happy with best layer 16 in LLaMA-3.2-3B

Neighborhood — ranked by edge-count

method

Contrastive mean-difference probe
implements
Probe construction method: the concept vector at each layer is the L2-normalized difference between the mean positive and mean negative representations elicited by contrastive system prompts
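
The construction described above can be sketched for a single layer as follows. This is a minimal illustration, not the paper's code: plain Python lists stand in for activation tensors, the function names `mean_vector` and `mean_difference_probe` are hypothetical, the toy numbers are made up, and treating one pole of the pair as "positive" is an assumption.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def mean_difference_probe(pos, neg):
    """L2-normalized difference between the mean positive and mean negative vectors."""
    diff = [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; the direction is undefined")
    return [x / norm for x in diff]

# Toy 2-d "activations" for one layer (illustrative values only).
pos_acts = [[2.0, 1.0], [4.0, 1.0]]  # representations under the positive prompt
neg_acts = [[0.0, 1.0], [0.0, 1.0]]  # representations under the negative prompt

direction = mean_difference_probe(pos_acts, neg_acts)
```

In practice the same computation would be repeated per layer on real hidden states, giving one unit-length concept vector per layer.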

concept

Emotive states in LLMs
implements
Directions in activation space associated with contrastive emotive concept pairs studied in this paper as targets for introspection

finding

Wellbeing probe: peak Cohen's d=3.34 (layer 16), p=7.21×10⁻¹³ in LLaMA-3.2-3B
supports
Probe validation result confirming wellbeing direction captures meaningful structure
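
A Cohen's d of this kind can be computed by projecting held-out activations onto the probe direction and comparing the two groups of scalar scores. The sketch below assumes the pooled-standard-deviation form of Cohen's d; the source does not state which variant or which significance test was used, and the helper names and toy data are hypothetical.

```python
import math
from statistics import mean, variance

def project(rows, direction):
    """Dot product of each activation vector with the probe direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in rows]

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

# Toy data (illustrative values only, not results from the paper).
probe_direction = [1.0, 0.0]
pos_acts = [[2.0, 1.0], [4.0, 1.0]]
neg_acts = [[0.0, 1.0], [2.0, 1.0]]

d = cohens_d(project(pos_acts, probe_direction), project(neg_acts, probe_direction))
```

Repeating this per layer and taking the layer with the largest d is one way a "best layer" could be selected.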

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Interest probe (bored vs. interested) · concept · 0.729
One of four emotive concept probes trained; contrastive pair bored/interested with best layer 14 in LLaMA-3.2-3B
Well Being · concept · 0.729
Documented outcome of practices diminishing sense of self; measured effect of self-illusion awareness.
Wellbeing International · institute · 0.715
Organization that made the article open access.
Probes · concept · 0.714
Interpretability tools that decode information from internal model activations; here, linear probes are used for data attribution.
Feeling · concept · 0.706
The experiential measure of life; a living process is congruent with and governed by feeling, and the feeling a place presents is the measure of its life.
Feeling (as distinct from emotion) · concept · 0.703
Alexander distinguishes 'feeling' — the sense of being part of the ocean, sky, world — from emotions like happiness, sadness, or anger
Logistic Regression Probe · method · 0.699
Standard linear probing technique; compared to mass-mean probing for classification accuracy and causal implication
What Is The Relationship Between Addressing Individual Wellbeing · question · 0.694