concept
active
concept:model

model

A representation that captures the relevant aspects of a system; according to the theorem, the regulator must embody such a model.

Neighborhood — ranked by edge-count

Thinkers (2)

thinker

Claims (1)

claim

Concepts (4)

concept
  • Language Model
    related_to
    Primary test domain for manifold steering, including reasoning and in-context learning (ICL) tasks.
  • Language Models
    related_to
    Primary substrate for manifold steering experiments; demonstrates method on reasoning and in-context tasks.
  • Toy Models
    related_to
  • system
    associated_with
    The regulated entity or process; includes air traffic, endocrine balances, money flows.

Findings (1)

finding

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Formal Model · method · 0.845
  • Model Organism · concept · 0.842
    A model deliberately trained to exhibit alignment-relevant properties so researchers can study them with ground truth.
  • Model Evidence · concept · 0.842
    Probability of data under the model, penalizing complexity and rewarding accuracy.
  • model selection · concept · 0.837
    Comparing models using log-evidence approximated by free energy.
  • Model Editing · concept · 0.835
    Technique for modifying model knowledge or behavior via targeted interventions, e.g., ROME by Meng et al.
  • Model Surgery · method · 0.834
    Edits MLP weights for all layers to modify model behavior; used by Abdelnabi & Salem to decrease verbalized evaluation awareness.
  • Model M1 · concept · 0.833
    Anonymous instruction-tuned LLM used in E1 ambiguous anchor test.
  • World Models · concept · 0.830
    Theme issue context: relates to internal models of environment, central to consciousness and cognition across substrates.
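
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This is not the graph's actual implementation: the function names, the dictionary layout, and the toy vectors are all hypothetical, and only the 0.65 threshold and the "no typed edge" filter come from this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_neighbors, threshold=0.65):
    """Return (name, score) pairs for entities that are similar to `target`
    but have no typed edge to it, highest score first.

    `embeddings` maps entity name -> vector (hypothetical layout).
    `typed_neighbors` is the set of names already linked by a typed edge.
    """
    query = embeddings[target]
    hits = []
    for name, vec in embeddings.items():
        # Skip the entity itself and anything that already has a typed relation.
        if name == target or name in typed_neighbors:
            continue
        score = cosine(query, vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data, invented for illustration only.
toy_embeddings = {
    "model": [1.0, 0.0],
    "Formal Model": [0.9, 0.1],
    "system": [1.0, 0.0],      # similar, but already linked via associated_with
    "unrelated": [0.0, 1.0],   # falls below the threshold
}
candidates = similar_without_edge("model", toy_embeddings, typed_neighbors={"system"})
```

Under these assumptions, an entity such as "system" is excluded even though its vector is close, because it already carries a typed edge; only unlinked entities above the threshold surface as candidates for new edges or duplicate checks.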