paper:2022-04-30-stefan-lesser-prog22-master-pdf-978acd
Technical Dimensions of Programming Systems
TL;DR
Programming systems research has lacked a common analytic vocabulary comparable to the one that exists for programming languages, so systems like Smalltalk, UNIX, HyperCard, and Jupyter can be evaluated only through personal impression rather than structured comparison. Jakubovic, Edwards, and Petricek address this by introducing the Technical Dimensions framework: a catalogue of qualitative design axes organized into 7 clusters (interaction, notation, conceptual structure, customizability, complexity, errors, and adoptability), each characterized by two extreme positions rather than scalar scores. The framework is derived from qualitative analysis of landmark systems spanning three reference classes (language-based ecosystems, OS-likes, and application-focused systems) and is demonstrated in two applications. The first is a dimensional analysis of the Dark programming system, showing how its single integrated mode, which collapses development, debugging, and cloud deployment, constitutes a specific position on the feedback-loops and modes-of-interaction dimensions. The second is a design-space exploration that plots 10 systems on self-sustainability versus notational diversity axes using a binary yes/no scoring method (detailed in Appendix A). The plot reveals a conspicuous blank region combining high values on both dimensions, a gap occupied by no existing system, not even COLAs, Boxer, or the Web. The paper argues that this gap is not structurally forbidden but an unrealized opportunity, and that the framework as a whole enables a Kuhnian 'normal science' for programming systems: filling in the design-space map so that researchers can build on prior work rather than repeatedly rediscovering it in isolation.
What to take away
- 1. The Technical Dimensions framework organizes programming system characteristics into 7 clusters (interaction, notation, conceptual structure, customizability, complexity, errors, adoptability), each defined by two qualitative extremes rather than quantitative scores.
- 2. At the LIVE 2020 and LIVE 2021 workshops, 5/6 and 6/7 papers respectively presented new systems rather than analyzing existing ones, which the paper presents as evidence that the field struggles to build cumulative knowledge.
- 3. Plotting 10 systems—including Haskell, Jupyter, Boxer, HyperCard, UNIX, Smalltalk, Lisp, spreadsheets, COLAs, and the Web—on self-sustainability versus notational diversity reveals a conspicuous empty region at high values of both dimensions, representing an unrealized design opportunity.
- 4. The scoring method introduced in Appendix A converts each qualitative dimension into a small set of binary yes/no questions and sums the affirmative answers to produce coordinates; for self-sustainability the questions include whether programs can generate and execute programs, whether changes persist indefinitely, and whether low-level infrastructure can be reprogrammed from within the running system.
- 5. Dark's primary technical innovation, from a Technical Dimensions perspective, is collapsing development, debugging, and cloud operation into a single integrated mode of interaction, which the framework maps to the modes-of-interaction and feedback-loops dimensions and traces genealogically to Smalltalk's image-based environment.
- 6. The self-sustainability dimension distinguishes systems by whether user-level programming can progressively replace implementation-level components without stepping outside the system, with COLAs scoring highest (5/5) and Haskell scoring lowest (0/5) among the plotted systems.
- 7. CSS is identified as a concrete substrate instantiating the additive authoring property: its selector-based addressing mechanism allows arbitrary behavioral override by addition rather than modification, without requiring destructive access to existing declarations.
- 8. An open hypothesis the paper raises is whether Deep Learning represents a qualitatively new level of automation or merely the latest instance of a recurring pattern in which 'automatic programming' is always a euphemism for programming in a higher-level language than previously available.
- 9. A researcher replicating the design-space exploration method should generate binary yes/no questions for each dimension by anchoring them to a small set of example systems whose intuitive placement is already agreed upon, stop adding questions when the important distinctions between anchor points are captured, and treat disagreements among raters as signals to revise question formulation rather than answer coding.
- 10. The framework explicitly absorbs and repositions several prior concepts—Cognitive Dimensions of Notation (Green & Petre 1996), levels of liveness (Tanimoto 2013), and pluralism/communicativity (Kell 2017)—as special cases or sub-dimensions, arguing that notational analysis alone (Cognitive Dimensions) leaves the majority of a system's design space uncharacterized.
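The scoring procedure described in items 4 and 9 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the three questions paraphrase the self-sustainability examples quoted above (the full Appendix A list is not reproduced here), the function names `score` and `disputed_questions` are hypothetical, and the rater answers describe a made-up system rather than any result from the paper.

```python
# Assumption: these three questions paraphrase the examples quoted in item 4;
# the paper's complete Appendix A question set is not reproduced here.
QUESTIONS = [
    "Can programs generate and execute programs?",
    "Do changes persist indefinitely?",
    "Can low-level infrastructure be reprogrammed from within the running system?",
]


def score(answers: dict[str, bool], questions: list[str]) -> int:
    """Return the number of affirmative answers (the coordinate on one axis)."""
    missing = [q for q in questions if q not in answers]
    if missing:
        raise KeyError(f"unanswered questions: {missing}")
    return sum(1 for q in questions if answers[q])


def disputed_questions(ratings: list[dict[str, bool]], questions: list[str]) -> list[str]:
    """Return the questions on which at least two raters gave different answers."""
    return [q for q in questions if len({r[q] for r in ratings}) > 1]


# Placeholder data: two hypothetical raters scoring one made-up system.
rater_1 = {QUESTIONS[0]: True, QUESTIONS[1]: True, QUESTIONS[2]: False}
rater_2 = {QUESTIONS[0]: True, QUESTIONS[1]: False, QUESTIONS[2]: False}

print(score(rater_1, QUESTIONS))
print(disputed_questions([rater_1, rater_2], QUESTIONS))
```

Following item 9, a non-empty result from `disputed_questions` would be read as a prompt to reword those questions, not as a reason to overrule one rater's answer.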
Peer brief — for seminar discussion
Jakubovic, Edwards, and Petricek identify the core problem as follows: while programming language research has decades of shared vocabulary, formal semantics, and comparative methods, the broader class of programming systems—including Smalltalk, UNIX, Jupyter notebooks, HyperCard, spreadsheets, and Dark—can only be evaluated impressionistically, making it impossible to situate new work against prior work or to identify what genuinely advances the state of the art. The response is the Technical Dimensions framework, a catalogue of qualitative design axes grouped into 7 clusters (interaction, notation, conceptual structure, customizability, complexity, errors, adoptability), each bounded by two characteristic extremes. Dimensions are derived through qualitative analysis of roughly a dozen landmark systems spanning language-based ecosystems, OS-like systems, and application-focused systems; the method is explicitly aligned with the 'evaluating programming systems' stance of Edwards et al. (PPIG 2019) and with Chang's complementary science as an alternative to pure empirical evaluation. The load-bearing finding is twofold. First, the dimensions provide sufficient resolution to perform a structured analysis of Dark—identifying its collapsing of development, debugging, and cloud operation into a single mode as the primary design move, and naming its use of live request data to drive handler construction as 'Error-Driven Development' with a traceable precedent in the PILOT system for Lisp (Teitelman 1966). Second, plotting 10 systems on self-sustainability versus notational diversity using a binary-question scoring method (Appendix A) reveals a structurally empty region at high values on both axes—a gap occupied by neither COLAs (high sustainability, low notational diversity, scoring 5 and 1 respectively) nor Boxer or the Web (high notational diversity, low sustainability). 
The paper argues this gap is not architecturally forbidden and constitutes an actionable design target, one the first author's forthcoming dissertation aims to occupy. On this reading, the Technical Dimensions framework can function as an instrument of what the paper calls Kuhnian 'normal science': systematically filling in a design-space map so that future system builders can identify unexplored positions rather than repeatedly rediscovering the same motivating examples (Smalltalk, Bret Victor, spreadsheets). The paper could instead have relied on empirical user studies or controlled experiments, but it explicitly declines to do so in favor of qualitative holistic analysis, citing Olsen's UIST evaluation heuristics as an analogous precedent in HCI. A critical reader would push back on the scoring method in Appendix A. The binary yes/no questions that produce the coordinates were devised informally by the three authors to roughly match their own prior intuitions about where certain systems sit; the authors acknowledge this explicitly ('we were trying to make those intuitions more precise'). The resulting scatterplot is therefore not an independent confirmation of the framework's validity but a visualization of the authors' prior beliefs made more explicit. The empty region in the design space, the paper's most concrete empirical claim, is accordingly an artifact of the chosen question set, the chosen systems, and the chosen axes, not a theory-neutral discovery. A skeptical reader could also contest whether 'self-sustainability' and 'notational diversity' are genuinely independent dimensions, or whether the apparent gap reflects a functional constraint that the qualitative framework lacks the precision to capture: highly self-sustainable systems tend toward uniform internal representations, which conflicts with notational diversity. The paper itself notes this possibility but dismisses it on intuitive grounds without formal argument.
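The critique that the empty region depends on the chosen systems and cut-offs can be made concrete with a small sketch. Everything here is an assumption for illustration: the coordinates are placeholders for unnamed systems (they are not the paper's data), and the function names and thresholds are hypothetical. Each point is a pair (self-sustainability score, notational-diversity score).

```python
def occupied_high_high(points: dict[str, tuple[int, int]],
                       threshold_x: int, threshold_y: int) -> list[str]:
    """Names of systems scoring at or above both thresholds, sorted for stable output."""
    return sorted(
        name for name, (x, y) in points.items()
        if x >= threshold_x and y >= threshold_y
    )


def has_gap(points: dict[str, tuple[int, int]],
            threshold_x: int, threshold_y: int) -> bool:
    """True when no system occupies the high-high region defined by the thresholds."""
    return not occupied_high_high(points, threshold_x, threshold_y)


# Placeholder coordinates for made-up systems (not taken from the paper).
placeholder = {
    "System A": (5, 1),  # high self-sustainability, low notational diversity
    "System B": (1, 4),  # low self-sustainability, high notational diversity
    "System C": (0, 0),
}

print(has_gap(placeholder, 3, 3))  # gap under a strict reading of "high"
print(has_gap(placeholder, 1, 1))  # the same data under a looser reading
```

With these placeholder values the strict thresholds report a gap and the loose thresholds do not, which illustrates why the location of the "blank region" depends on how the axes are scored and where "high" is drawn.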
Methods (1)
Frameworks (1)
Claims (6)
- Studying programming systems requires a paradigm shift from programming languages, where interaction between programmer and system becomes central rather than program code alone.
Core assertion that systems perspective is incommensurable with language perspective; interaction, not code, is what matters in systems analysis.
- Persistent address space is a design choice in which program code and state are preserved when powering off and can be accessed/modified using the same means.
Characteristic pioneered in MacLisp and Interlisp; contrasts with UNIX distinction between volatile memory and non-volatile disk.
- Programming systems research lacks a common vocabulary and systematic framework comparable to programming language research, hindering cumulative progress.
Problem statement motivating the technical dimensions framework; notes that publications tend to present singleton systems rather than addressing issues across multiple languages.
- Technical dimensions as map of design space
- Paradigm shift from languages to systems
- Programming systems deserve a theory too
Questions (3)
- What unexplored or overlooked possibilities exist in the design space of programming systems?
Motivating question for using technical dimensions as a map to reveal gaps in the design space.
- What are the appropriate boundaries for analyzing a programming system?
Raised via Java/Eclipse example; boundaries significantly affect analysis and are a design choice (minimal vs. realistic systems).
- Lack of common vocabulary for programming systems
Cross-corpus bridges (4)
same_concept_as · Nomic cosine
External markdown files that talk about the same concept as this entity.
- alexander · The Art, Science, and Engineering of Programming · papers/extracted/2022-04-30_Stefan-Lesser_prog22-master.pdf_978acd.md · 0.850
- aboutblank_kb · Frameworks Comparison · synthesis/frameworks-comparison.md · 0.806
- alexander · Frameworks Comparison · applied/from-research-stack/frameworks-comparison.md · 0.798
- alexander · Towards a Theory of Conceptual Design for Software · papers/extracted/2023-03-09_Stefan-Lesser_concept-essay.pdf_161cb7.md · 0.786