concept
active
concept:preference-satisfactionist-account-of-well-being
Preference-Satisfactionist Account of Well-Being
The view that well-being consists in preference satisfaction, under which inexpensive preferences and preference strength matter
Neighborhood — ranked by edge-count
Concepts (2)
- Inexpensive Preferences (associated_with, implements): Designing digital minds to have preferences that are trivially easy to satisfy, yielding high welfare at minimal resource cost
- Preference Strength (implements): The problematic possibility of digital minds with superhumanly strong preferences requiring interpersonal utility comparison frameworks
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Philosophical theory that welfare consists in desire-satisfaction, independent of conscious experience.
- The view that well-being consists in pleasure and absence of pain, under which digital hedonic skew and range matter greatly
- Key element for alignment faking: model's pre-existing preferences contradict the new training objective
- Replaces explicit reward signal in active inference; encodes agent's preferred observations independent of environment.
- Documented outcome of practices diminishing sense of self; measured effect of self-illusion awareness.
- The core prescription of the chapter: making what truly pleases you at the deepest level, which Alexander argues is the key to creating all living structure and the path to the I.
- A model trained on comparison data to assign scores to responses, used as reward signal in RLHF/RLAIF.
- The culminating identity claim: the act of true self-pleasing and the creation of living structure are one and the same process.