(Not to be confused with the ACM SIGPLAN conference of the same name.)

1 Designing the language / a language

  • Instead of designing a new language from scratch, improve and unify existing languages (dialects of Lisp); otherwise we will end up reimplementing the same thing.
  • Only MIT/BSD/Apache2/similar/public-domain; no GPL/AGPL/LGPL; no proprietary.
  • Call-by-value (applicative-order evaluation) is OK; we can always make abstract interpreters later.
  • Typing more should yield a greater return on effort: types are optional, but they help the computer help the programmer. Programmers should declare types (or use any feature of the language) because they want to (as in Common Lisp or TypeScript), not because they have to (as in Java).
  • Design a language.

2 Instead of functions and macros, do partial evaluation?

  • Problem:
    • Macros are functions.
      • Macros are functions from AST to AST.
      • Macros are functions that are called at compile-time; ordinary functions are called at run-time.
    • My biggest beefs with macros:
      • It is unclear which arguments are evaluated. Example: Why does in-package not evaluate its argument but use-package does?
      • Can't pass and to zip (although we can work around this in Racket with set!-transformers).
  • Solution:
    • Partial evaluation cleanly unifies macros and ordinary functions?
  • for all x y z: (let ((x y)) z) = ((lambda (x) z) y)
    • Staging/compilation side-effects should not affect run-time.
    • Compile-time state and run-time state should be totally separate.
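The idea that partial evaluation can subsume macro expansion can be sketched with a toy evaluator. This is an illustration only (not PL-0): expressions are encoded as nested tuples like `("+", "x", 1)`, and the function name `peval` is mine. Subexpressions whose arguments are known are folded at "compile time"; the rest are residualized as code.

```python
def peval(expr, env):
    """Partially evaluate expr given a partial environment env.

    Known variables are substituted and constant subexpressions are
    folded; unknown variables remain symbolic in the residual code."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):                      # a variable
        return env.get(expr, expr)                 # residualize if unknown
    op, a, b = expr
    a, b = peval(a, env), peval(b, env)
    if isinstance(a, int) and isinstance(b, int):  # both known: fold now
        return {"+": a + b, "*": a * b}[op]
    return (op, a, b)                              # residual program
```

For example, `peval(("+", ("*", "k", "x"), 1), {"k": 3})` returns the residual program `("+", ("*", 3, "x"), 1)`, while a fully known expression reduces all the way to a number, like an ordinary call-by-value interpreter.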

3 Biggest problems in Lisp

  • module/namespace
  • macro
  • quoting
    • I think quoting should satisfy these properties for all x:
      • (equal (quote x) x)
      • (equal (unquote (quote x)) x)
    • Important questions?
      • What is the difference between unquote and eval?
      • Is quote a function? If so, what is its domain and codomain?
  • multiple return values?

4 Which intermediate representation/language should we target?

5 Features for programming in the large

Module/namespace/name-management/what do you call it

5.1 <2019-11-07> Distinguishing between modules, namespaces, and compilation units

A namespace is a mapping from names to meanings.

A compilation unit is the smallest thing that can be compiled.

They are often conflated into a module, in the sense that a module often serves multiple functions like separate namespace, separate compilation, and dependency management, but perhaps it is time that we distinguish between them.

Racket units are namespaces (in my sense) and Racket namespaces are namespaces (in my sense) too. Confusing.

6 Toward a curriculum for programming language design and implementation

Alternative title: "Exploring the programming language design space"

6.1 What is a good language?

  • A good language efficiently represents meanings.

6.2 How do we design a programming language?

  • Reuse Lisp syntax (S-expressions).
  • Design the semantics (the meaning).
    • Consider implementability.
    • Consider the future; consider probability/risk of changes.

6.3 Implementation?

  • Interpretation vs compilation
  • Garbage collection
    • Mark–sweep (tracing): objects do not move; simple to implement
    • Copying: objects move; faster allocation and better locality

6.4 What?

We can begin designing from the semantics (our desires) or from the run-time system (our constraints). All design is a compromise between the ideal and the practical. But if the ideal needs to be compromised, is it truly ideal?

To consider:

6.5 When should you make a DSL (domain-specific language)?

  • When you will be writing many programs in the language so that it saves you effort. Example:
    • Write 100 similar systems in Java: 100 * 10,000 = 1,000,000 units of effort
    • Write 100 similar systems in DSL implemented in Lisp: 10,000 (translator) + 100 * 1,000 = 110,000 units of effort
    • Compare: writing math in math notation \(x^2+1=0\) vs writing math in spelled-out English ("a number such that its square plus one is zero").
  • And when you can maintain the DSL.
  • What? Mernik et al. 2005 [5]4
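The effort arithmetic above can be made concrete. A minimal sketch, assuming the document's made-up effort units and a hypothetical helper name `total_effort`:

```python
def total_effort(n_systems, effort_per_system, translator_effort=0):
    """Total effort: one-time translator cost plus per-system cost."""
    return translator_effort + n_systems * effort_per_system

java_total = total_effort(100, 10_000)         # 100 systems in Java
dsl_total = total_effort(100, 1_000, 10_000)   # translator + 100 systems in the DSL

# Break-even: the DSL pays off once n * (10_000 - 1_000) > 10_000,
# i.e. from the second system onward (under these assumed numbers).
```

The design choice this exposes: a DSL is an up-front investment whose payoff grows linearly with the number of programs written in it.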

6.6 How do we specify semantics?

6.7 What are some recent developments?

What are some recent innovations in programming languages?

What are some recent innovations in programming-language semantics?

??? https://www.semanticscholar.org/paper/Practical-partial-evaluation-for-high-performance-W%C3%BCrthinger-Wimmer/1d4c0211549a8fe259a273da88c63e8f00fef463

6.8 What we are going to do here

Here we design a Lisp, its semantics, and its interpreter. (Note that, in the 21st century, Lisp means a family of languages, not a particular language.)

We design a Lisp because we want to focus on semantics; the syntax can always be improved later.

We name our language "PL-0" (not to be confused with Wirth's PL/0).

We want a programming language for implementing programming languages. We want to easily specify alternative semantics. There is much prior art, such as Racket; Felleisen et al. 2018 [2] discusses some problems of Racket.

6.9 Should we make a Forth or a Lisp?

The question is wrong. The answer is: we should begin from mathematics.

We did not make a stack-based language such as Forth or PostScript because we did not know how to implement optional arguments in a stack-based language other than by explicitly passing a possibly empty list.

So we chose to make a Lisp instead.

But, why not both? Why not make a Lisp interpreter on Forth?

The question of Forth vs Lisp is: how badly should the user be able to crash the program without calling foreign procedures? Just an exit due to an unhandled exception (Lisp), or an abort due to a segmentation fault (Forth)?

6.10 Important psychological issues

  • The language must come with a benevolent dictator.
  • The language must come with an official formatter (like gofmt). Otherwise people will waste energy bickering about code style. With an official formatter, we can use authority to nip such wasteful debates in the bud.
    • The official formatter does not have to be the best, but should be good enough to prevent others from wasting energy writing competing formatters.

7 Marketing

  • Who decides what language is used?
    • Programmers in their hobby projects?
    • Managers in their companies?

8 What?

On programming by examples. Erik Meijer has an interesting presentation about machine learning, that is, programming computers by examples. Can we create a programming language in which programming by examples is elegantly expressible?

If it is impossible to invent better primitives, then the only way to make programming easier is to make computers program themselves more.

What is the significance of the Ballerina programming language5?

9 On the mathematics of programming languages

9.1 Science?

A programming language should be derived from observations and principles, in the same way physical theories are derived from observations and principles?

9.2 Programming languages are formal systems

(Hofstadter's BlooP and FlooP?)

A programming language is a formal system.

This is a point of view:

prog. lang. theory     formal logic
syntax                 formal language
program                formula
primitive value        axiom
primitive combinator   inference rule
programming language   formal system

Another point of view is the Curry–Howard isomorphism:

prog. lang. theory     intuitionistic logic
program                constructive proof
type                   proposition
programming language   logic system

See also Curry–Howard–Lambek isomorphism (called "computational trinitarianism" by Harper 2011).

Recall that:

formal system        = formal language + interpretation
programming language = syntax          + semantics

9.3 The essence of syntax is finite describability

A formula does not have to be a string; it may be a graph, for example. A language does not have to be textual, and even text does not have to be one-dimensional. The only restriction on syntax is that it is finitely describable.

The semantics (the meaning) of a formula is a mathematical object, such as a number, a set, or a state transformer (a special kind of function).

The semantics of a processor instruction is usually a state transformer.

We can map a state transformer to a number by forgetting the parts of state that are preserved by the transformer. (Compare "forgetful functors" in category theory.)

A computation is pure iff its state transformer semantics can be reduced to value semantics???

10 Creating languages with the "tower of semantics" approach

10.1 Semantics of values / arithmetic expressions

Let \(P\) be the set of programs.

An example value semantics (calculator): \(S : P \to \Nat\). \[ S(a+b) = S(a)+S(b) \]

10.2 Semantics of errors

We add errors.

Let \(V = \Nat \cup \Set{\bot}\).

\[ S(n/0) = \bot \]
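The two layers above (value semantics, then errors) can be sketched in Python. This is an illustration, not part of the text: programs are encoded as nested tuples, and `None` stands in for \(\bot\).

```python
BOTTOM = None  # represents the error value ⊥

def S(p):
    """S : P -> Nat ∪ {⊥}; errors propagate strictly through +."""
    if isinstance(p, int):
        return p
    op, a, b = p
    va, vb = S(a), S(b)
    if va is BOTTOM or vb is BOTTOM:
        return BOTTOM                      # once ⊥, always ⊥
    if op == "+":
        return va + vb                     # S(a+b) = S(a) + S(b)
    if op == "/":
        return BOTTOM if vb == 0 else va // vb   # S(n/0) = ⊥
```

Note how adding errors changed the codomain from \(\Nat\) to \(V = \Nat \cup \Set{\bot}\): every later layer of the tower must now decide how it treats \(\bot\).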

10.3 Semantics of bindings

Let \(N\) be the set of names.

Let \(V = \Nat\) be the set of values.

Let \(C = N \to V\) be the set of contexts.

We add binders (basic math expression with eager dynamic binding): \(S : C \times P \to \Nat\).

\[ S(c,\text{var}(v)) = c(v) \] \[ S(c,\text{let}(n,e,b)) = S(c + \Set{(n,S(c,e))},b) \]

What use is mathematics, if the mathematics ends up looking like a Lisp/Prolog/Haskell program!?

The argument for eager "let": If you don't use a variable, simply don't define it.

The argument for lazy "let": Immutable cyclic data cannot be done with eager "let".
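The binding semantics with eager "let" can be sketched with contexts as Python dicts. An illustration under my own tuple encoding (`("var", n)`, `("let", n, e, b)`), not PL-0 itself:

```python
def S(c, p):
    """S : C × P -> Nat, with context c mapping names to values."""
    if isinstance(p, int):
        return p
    tag = p[0]
    if tag == "var":                      # S(c, var(v)) = c(v)
        return c[p[1]]
    if tag == "let":                      # eager: evaluate e before the body
        _, n, e, b = p
        return S({**c, n: S(c, e)}, b)    # c + {(n, S(c, e))}
    if tag == "+":
        return S(c, p[1]) + S(c, p[2])
    raise ValueError(f"unknown form: {p!r}")
```

For example, `S({}, ("let", "x", 2, ("+", ("var", "x"), 3)))` evaluates to 5. Because `{**c, n: ...}` builds a fresh dict, inner bindings shadow outer ones without mutating the outer context.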

10.4 Semantics of application?

\[ S(f(x)) = [S(f)](S(x)) \]

10.5 Semantics of state / mutation?

We add mutable binders.

An example state-transformer semantics: \(M : (P \times E \to \Nat) \to (P \times E \to \Nat)\).

10.6 Garbage collection simplifies semantics?

It is surprising that higher-level languages, which have higher implementation complexity, can have simpler semantics: they are harder to implement but simpler to reason about.

If all we have is machine code (that is, if we are writing the first electronic computer program ever), is it harder to implement a minimal assembler or a minimal Lisp interpreter?

11 Informal program complexity

The complexity of a program is the difficulty of describing what it does.

The complexity of a program is the length of the description of what it does.

12 Programs that write programs

Programs that write abstract syntax trees?

(defmacro make-prin1 (x)
  "A trivial example."
  `(progn
     (prin1 ,x)
     (terpri)))

Programs that write concrete syntax trees?

  • Instead of read, there should be read-cst and cst->ast. Preserve source location for formatter and refactoring tools.

It should be possible to parse programs without executing them?

13 Specification and complexity

13.1 Motivation

We want to define a complexity measure of specifications.

We want to be able to determine whether specification X is easier to implement than specification Y.

But isn't this NP-hard? What are our chances? If it is impossible to determine whether specification X is easier to implement than specification Y, then is "software engineering" possible at all?

13.2 The syntax of specifications

A specification is a positive first-order logic formula.

Informally, a positive formula is a formula that does not contain negation.

In the rest of this document, by "formula" we usually mean "positive first-order logic formula".

The abstract syntax of formulas (expressions) is:

  • Each atom is an expression. An atom is an identifier such as \(p\), name_123, \(\wedge\), \(\forall\), etc.
  • If each of \(\beta,\alpha_1,\ldots,\alpha_n\) is an expression, then the application \(\beta(\alpha_1,\ldots,\alpha_n)\) is an expression.

Should we have these?

  • An atom is a zero-arity functor: \( \alpha = \alpha() \).
  • Currying: \( [\alpha(\beta_1,\ldots,\beta_n)](\gamma,\ldots) = \alpha(\beta_1,\ldots,\beta_n,\gamma,\ldots) \).

13.3 Syntactic complexity partial order

What do we mean by "X is less complex than Y"?

The key idea of the syntactic-complexity partial-order \(\le\) is "If \(\alpha\) occurs in \(\beta\), then \(\alpha \le \beta\)":

\[\begin{align*} \alpha &\le \beta(\ldots, \alpha, \ldots) \\ \alpha &\le \alpha(\ldots) \end{align*} \]

Examples:

\[\begin{align*} A &\le \wedge(A,B) \\ B &\le \wedge(A,B) \\ A &\le f(A) \\ A &\le [A(B)](C) \\ x &\le +(x,y) \\ z &\le \wedge(x,\wedge(y,z)) \\ \forall &\le \forall(x,y) \\ A(x) &\le \forall(x,A(x)) \end{align*} \]
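The occurs-in relation underlying \(\le\) is easy to state as a recursive check. A sketch under my tuple encoding of formulas, where an application \(\beta(\alpha_1,\ldots,\alpha_n)\) is the tuple `(beta, alpha_1, ..., alpha_n)`:

```python
def occurs_in(a, b):
    """alpha <= beta iff alpha occurs in beta (reflexive)."""
    if a == b:
        return True
    # The head of an application counts as occurring in it,
    # which covers both rules: alpha <= beta(..., alpha, ...)
    # and alpha <= alpha(...).
    return isinstance(b, tuple) and any(occurs_in(a, part) for part in b)
```

This gives only a partial order: two formulas that share no subterm, such as \(A\) and \(f(B)\), are incomparable.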

13.4 Syntactic complexity measure

The syntactic complexity of a formula \(x\) is a natural number \(c(x)\) where:

\[\begin{align*} c(x) &= 1 & \text{if \(x\) is an atom} \\ c(\beta(\alpha_1,\ldots,\alpha_n)) &= c(\beta) + c(\alpha_1) + \ldots + c(\alpha_n) \end{align*} \]

Numbers are encoded as Peano numerals like \(z,s(z),s(s(z))\), etc., because we want bigger numbers to have bigger syntactic complexity.
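The measure \(c\) and the Peano encoding can be sketched directly from the definitions (same tuple encoding of formulas as before; an illustration, not a fixed implementation):

```python
def c(x):
    """Syntactic complexity: 1 for an atom, else the sum over the
    head and all arguments of the application."""
    if isinstance(x, tuple):     # application beta(alpha_1, ..., alpha_n)
        return sum(c(part) for part in x)
    return 1                     # atom

def peano(n):
    """Encode n as z, s(z), s(s(z)), ... so that bigger numbers
    have bigger syntactic complexity: c(peano(n)) = n + 1."""
    return "z" if n == 0 else ("s", peano(n - 1))
```

For example, \(c(\wedge(A,B)) = 3\), counting the connective and both atoms, and the complexity of a numeral grows linearly with the number it denotes.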

13.5 Semantic complexity measure

The semantic complexity of a formula \(x\) is \(C(x) = c(y)\) where \(y\) is the simplest (the least syntactically complex) formula that is equivalent to \(x\). \[ C(\alpha) = \min_{\alpha \equiv \beta} c(\beta) \]

If \(\alpha \equiv \beta\) then \(C(\alpha) = C(\beta)\). (The converse fails: two distinct atoms are inequivalent but have equal complexity.)

Example: In classical logic, \(C(\wedge(\alpha,\alpha)) = C(\alpha)\).

In classical logic, \(C\) is not monotonic with respect to the syntactic order; for example, \(C(\neg\neg\alpha) = C(\alpha) \le C(\neg\alpha)\), even though \(\neg\alpha\) occurs in \(\neg\neg\alpha\).

The semantic complexity of a formula is the syntactic complexity of its simplest equivalent formula.

? "essential complexity" = "irreducible complexity" = "semantic complexity"

? "accidental complexity" = "reducible complexity" = "syntactic complexity"

(Brooks et al.?)

13.6 Specification complexity

Let \(x, y\) be positive literals.

It is easier to do less:

\[\begin{align*} c(x) &\le c(x \wedge y) & c(y) &\le c(x \wedge y) \\ c(y) &\le c(x \to y) & c(x) &\le c(x \to y) \end{align*} \]

The complexity of a formula is its proof complexity?

A specification is a logical formula in conjunctive normal form in which negation is only allowed in the antecedent of implications. (Conjunction of Horn clauses?)

Example of specification of state/memory using discrete-time logic? This assumes a temporal ordering of commands.

\[ \forall t \in \Nat: \text{command}(t) = \text{increment} \implies \text{state}(t+1) = \text{state}(t) + 1 \]
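The discrete-time specification above can be checked on a finite trace. A minimal sketch, assuming commands and states are plain lists indexed by time (with one more state than commands):

```python
def satisfies(commands, states):
    """Check: whenever command(t) = increment, state(t+1) = state(t) + 1.
    Other commands are unconstrained by this specification."""
    return all(
        states[t + 1] == states[t] + 1
        for t, cmd in enumerate(commands)
        if cmd == "increment"
    )
```

For example, the trace of commands `["increment", "nop", "increment"]` with states `[0, 1, 1, 2]` satisfies the specification, while `["increment"]` with states `[0, 0]` violates it.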

13.7 Problems

It does not make sense that \(\forall(x,p(x))\) and \(q(x,y,z)\) have the same semantic complexity?

Similar issue: CS SE 48329.

13.8 What?

Iyengar et al. 1982. A measure of logical complexity of programs. https://www.sciencedirect.com/science/article/abs/pii/0096055182900030

Some second-order logic can be embedded into first-order logic with this trick: p(x, …) can always be rewritten to apply(p, x, …)

Rijnders 2008 First-Order Logic as a Lightweight Software Specification Language https://homepages.cwi.nl/~paulk/thesesMasterSoftwareEngineering/2008/MichelRijnders.pdf

[1] Coblenz, M. et al. 2018. Interdisciplinary programming language design. Proceedings of the 2018 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (2018), 133–146.

[2] Felleisen, M. et al. 2018. A programmable programming language. Commun. ACM. 61, 3 (Feb. 2018), 62–71. DOI:https://doi.org/10.1145/3127323. url: <https://cs.brown.edu/~sk/Publications/Papers/Published/fffkbmt-programmable-prog-lang/paper.pdf>.

[3] Harrison, W. and Ossher, H. 1993. Subject-oriented programming: A critique of pure objects. Citeseer.

[4] Ingalls, D.H. 1981. Design principles behind Smalltalk. BYTE Magazine. 6, 8 (1981), 286–298.

[5] Mernik, M. et al. 2005. When and how to develop domain-specific languages. ACM Computing Surveys (CSUR). 37, 4 (2005), 316–344.


  1. <2019-12-11> mirror 1 (html)

  2. <2019-12-11> mirror 2 (pdf)

  3. <2019-12-11> mirror (pdf)

  4. <2019-12-20> http://people.cs.ksu.edu/~schmidt/505f14/Lectures/WhenDSL.pdf

  5. <2019-11-27>