Reasoning Theater Framework

The conceptual framework introduced by the paper, which uses activation probing to distinguish performative chain-of-thought (CoT) from genuine reasoning.

Neighborhood — ranked by edge-count

paper

method

CoT Monitor
uses
Named method that monitors chain-of-thought text to detect when the model signals its answer; used as a point of comparison against activation probes.
Early Forced Answering
uses
Named evaluation protocol: truncating CoT at various points and forcing the model to give a final answer, to measure when the answer stabilizes
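The Early Forced Answering protocol described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `force_answer` is a hypothetical callable standing in for a real model call that is given a truncated CoT and made to answer immediately, and `toy_force_answer` is an invented stub so the example runs on its own.

```python
from typing import Callable, List

# Hypothetical signature: (question, truncated CoT steps) -> forced final answer.
ForceAnswerFn = Callable[[str, List[str]], str]


def early_forced_answers(question: str, cot_steps: List[str],
                         force_answer: ForceAnswerFn) -> List[str]:
    """Force an answer after 0, 1, ..., len(cot_steps) reasoning steps."""
    answers = []
    for k in range(len(cot_steps) + 1):
        truncated = cot_steps[:k]  # keep only the first k steps of the CoT
        answers.append(force_answer(question, truncated))
    return answers


def stabilization_point(answers: List[str]) -> int:
    """Smallest k such that every forced answer from k onward equals the final one."""
    final = answers[-1]
    k = len(answers) - 1
    while k > 0 and answers[k - 1] == final:
        k -= 1
    return k


def toy_force_answer(question: str, truncated: List[str]) -> str:
    # Invented stand-in for a model: it "knows" the answer after two steps.
    return "A" if len(truncated) >= 2 else "B"


steps = ["step 1", "step 2", "step 3", "step 4"]
answers = early_forced_answers("toy question", steps, toy_force_answer)
stable_at = stabilization_point(answers)
print(answers)    # ['B', 'B', 'A', 'A', 'A']
print(stable_at)  # 2
```

In this toy run the answer stabilizes after two of four steps, so the remaining steps do not change the forced answer. How a real setup truncates (by token, sentence, or step) and how it prompts for the forced answer are choices this sketch does not fix.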

concept

Activation Probing
uses
Technique of reading out model beliefs from internal activations before the final answer token is generated
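One common way to read out a belief from activations is a linear probe, sketched below as a small logistic regression. This is an assumption-laden illustration: the probe type, the synthetic "activations", and all function names are invented for this example and are not taken from the paper, where activations would come from a real model's hidden states at a position before the final answer token.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(acts: List[List[float]], labels: List[int],
                       lr: float = 0.5, epochs: int = 200) -> Tuple[List[float], float]:
    """Fit logistic-regression weights with full-batch gradient descent."""
    dim, n = len(acts[0]), len(acts)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(acts, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_predict(w: List[float], b: float, x: List[float]) -> int:
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def make_synthetic_activations(n: int, dim: int, seed: int = 0):
    """Invented data: the eventual answer (0 or 1) is encoded along dimension 0."""
    rng = random.Random(seed)
    acts, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 0.5) for _ in range(dim)]
        x[0] += 1.5 if y == 1 else -1.5
        acts.append(x)
        labels.append(y)
    return acts, labels


acts, labels = make_synthetic_activations(n=200, dim=4)
w, b = train_linear_probe(acts, labels)
accuracy = sum(probe_predict(w, b, x) == y for x, y in zip(acts, labels)) / len(acts)
print(f"probe accuracy on synthetic activations: {accuracy:.2f}")
```

If a probe like this recovers the eventual answer from activations taken early in the CoT, that suggests the answer is already represented internally before the text reaches it. A real evaluation would score the probe on held-out examples rather than on its own training set, as this sketch does for brevity.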

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Framework · concept · 0.756
1984 Ashton-Tate integrated system with frames, FRED language, and overlapping windows; design reference for Playground's approach.
Role Play Framework for Dialogue Agents · framework · 0.754
The primary conceptual framework proposed: understanding dialogue agent behaviour as role play of characters.
Information Theoretic Framework · framework · 0.745
Tool AI framework · framework · 0.741
Bostrom's category of AIs that perform specific tasks without overarching goals.
Genie AI framework · framework · 0.738
Bostrom's category of AIs that produce desired results given commands but do not act autonomously.
Oracle AI framework · framework · 0.732
The view of AI as a question-answer system optimized for correctness, often inherited from supervised learning.
Relevance Realization Framework · framework · 0.722
A Mathematical Framework for Transformer Circuits (Elhage et al., 2021) · concept · 0.722
Foundational mechanistic interpretability paper on transformer circuit analysis.