concept
active
concept:unnatural-outputs
unnatural outputs
Artifactual behaviors produced when interventions (e.g., linear steering) push a model's internal representations off the data manifold.
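As a rough illustration of the definition, the sketch below uses a toy setup that is not taken from the source: the "data manifold" is assumed to be the unit circle in 2-D, and `steering_vector`, `steer`, and `distance_from_unit_circle` are hypothetical names introduced only for this example. It shows how adding a fixed vector, scaled by a strength `alpha`, can move points that start on the manifold away from it.

```python
import numpy as np

def distance_from_unit_circle(points):
    """Distance of each 2-D point from the unit circle (the toy 'data manifold')."""
    return np.abs(np.linalg.norm(points, axis=-1) - 1.0)

def steer(points, direction, alpha):
    """Linear steering: add alpha * direction to every point."""
    return points + alpha * np.asarray(direction, dtype=float)

# Toy "activations" that lie exactly on the unit circle (assumed manifold).
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
on_manifold = np.stack([np.cos(angles), np.sin(angles)], axis=-1)

# Hypothetical steering direction, chosen only for illustration.
steering_vector = np.array([1.0, 0.0])

for alpha in (0.0, 0.5, 2.0):
    steered = steer(on_manifold, steering_vector, alpha)
    worst = distance_from_unit_circle(steered).max()
    print(f"alpha={alpha}: max distance from manifold = {worst:.3f}")
```

In this toy setting the maximum off-manifold distance grows with `alpha`. Real model activations do not lie on a circle, so this only conveys the geometric intuition behind the definition, not how such artifacts are measured in practice.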
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Configurations created by human designers that violate structure-preserving unfolding, lying outside the set L of living structure.
- The correctness of a model's generated outputs, distinct from the correctness of statements provided as input.
- Specification relating a program's inputs and outputs, analogous to illocutionary correctness.
- Diagrammatic encoding of program behavior via concept lattices reveals reachability structure and non-determinism without fixed calculational rules.
- Models can distinguish artificially prefilled outputs from intentional responses by referencing prior internal representations; injecting a matching concept vector causes the model to retroactively accept the prefill as intentional.
- Ability to distinguish one's own outputs from those of other models or humans; related to prefill detection.
- Computational modeling approach studying living/cognitive systems; used to test hypotheses about self-illusion effects without direct intervention.