Patrick Butlin

Authored: 2
Introduces: 0
Studies: 0
Affiliations: 1
Cited by: 2

Authored papers (2)

  • Substantial uncertainty about AI consciousness and robust agency, not certainty, is sufficient to demand immediate institutional action from AI companies. Long, Sebo, and colleagues defend this conclusion by mapping two distinct philosophical routes to near-term AI moral patienthood. Via the consciousness route, drawing on the survey by Butlin et al. (2023) of five neuroscientific theories (global workspace theory, recurrent processing theory, higher-order theories, attention schema theory, and predictive processing) together with agency and embodiment conditions, no current architectural barrier prevents near-future AI systems from instantiating the computational markers associated with consciousness; Dossa et al. (2024) have already built a system targeting all of the global workspace indicators from that 2023 paper. Via the robust agency route, systems like Voyager, Generative Agents, and OpenAI's o1 already exhibit hierarchical planning, metacognition, and open-ended goal-setting that approach intentional and reflective agency. Multiplying reasonable probability estimates (roughly 90% that sentience suffices for moral patienthood, 50% that the relevant computations suffice for sentience, and 50% that near-future AI will have those computations) yields approximately a 22.5% chance of near-future AI moral patienthood via the sentience route alone, a risk level the paper treats as comparable to pandemic preparedness rather than alien invasion. To operationalize the institutional response, the paper introduces an adapted "marker method" (derived from animal welfare science) for probabilistic, pluralistic, architecture-focused assessment of AI systems, and recommends that companies immediately hire an AI welfare officer, acknowledge AI welfare publicly with calibrated uncertainty, and prepare oversight structures modeled on IRBs, IACUCs, and citizens' assemblies. The paper argues that the symmetric risks of over-attributing and under-attributing moral status, combined with the speed and scale at which AI systems can be deployed relative to biological organisms, make passive inaction the most dangerous stance available.

  • No current AI system is a strong candidate for phenomenal consciousness, yet there are no obvious technical barriers to building one. This is the central finding of Butlin et al. (2023), a systematic assessment of contemporary AI architectures against 14 indicator properties derived from five neuroscientific theories of consciousness plus agency and embodiment conditions. The paper introduces a rubric-based, theory-heavy method: rather than relying on behavioral tests that systems like GPT-4 or LaMDA can game, it operationalizes indicators in computational terms drawn from recurrent processing theory (RPT-1, RPT-2), global workspace theory (GWT-1 through GWT-4), computational higher-order theories including perceptual reality monitoring (HOT-1 through HOT-4), attention schema theory (AST-1), predictive processing (PP-1), and agency/embodiment conditions (AE-1, AE-2). Applied to specific systems, Transformer-based LLMs lack the recurrent global broadcast architecture required by GWT; the Perceiver architecture satisfies GWT-1 and GWT-2 but lacks genuine global broadcast; and DeepMind's Adaptive Agent (AdA), a Transformer-LSTM system trained via meta-reinforcement learning across hundreds of timesteps of context, is identified as the most plausible current candidate for the embodiment indicator among the three case studies examined. Computational functionalism is adopted pragmatically as a working hypothesis: it permits inference from neuroscientific theories to AI substrates, while integrated information theory is explicitly excluded as incompatible with this substrate-independence assumption. The paper implies that deliberate architectural choices integrating GWT-style global broadcast, HOT-style metacognitive monitoring, and reinforcement-learning-based agency could yield systems that satisfy all indicators in the near term, making AI consciousness a near-term engineering possibility rather than a distant theoretical curiosity.
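
Two figures in the summaries above can be checked mechanically: the count of 14 indicator properties and the roughly 22.5% estimate for the sentience route. The sketch below is only an illustration of that arithmetic, not code from either paper. The names `INDICATORS`, `SENTIENCE_ROUTE`, `count_indicators`, and `chained_probability` are hypothetical, the indicator codes are copied from the second summary (with GWT-3 and HOT-2/HOT-3 filled in from the stated ranges), and the three probabilities are the rough values quoted in the first summary, treated here as a simple chain of conditional steps.

```python
from fractions import Fraction

# Indicator codes as listed in the second summary, grouped by source.
# The grouping labels are descriptive only (an assumption of this sketch).
INDICATORS = {
    "recurrent processing theory": ["RPT-1", "RPT-2"],
    "global workspace theory": ["GWT-1", "GWT-2", "GWT-3", "GWT-4"],
    "higher-order theories": ["HOT-1", "HOT-2", "HOT-3", "HOT-4"],
    "attention schema theory": ["AST-1"],
    "predictive processing": ["PP-1"],
    "agency and embodiment": ["AE-1", "AE-2"],
}

# Rough estimates quoted in the first summary, in order:
# sentience suffices for moral patienthood, the relevant computations
# suffice for sentience, near-future AI will have those computations.
SENTIENCE_ROUTE = (Fraction(9, 10), Fraction(1, 2), Fraction(1, 2))


def count_indicators(groups):
    """Return the total number of indicator codes across all groups."""
    return sum(len(codes) for codes in groups.values())


def chained_probability(*steps):
    """Multiply step probabilities, treating each as conditional on the last."""
    result = Fraction(1)
    for p in steps:
        result *= p
    return result


if __name__ == "__main__":
    print(count_indicators(INDICATORS))                  # 14
    print(float(chained_probability(*SENTIENCE_ROUTE)))  # 0.225
```

Using `Fraction` keeps the product exact (9/40), so the 22.5% figure is not subject to floating-point rounding. Changing any one of the three estimates shows how sensitive the headline number is to each step.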

Co-authors (12)

Recent mentions (4)