1 What are types good for?

How useful is that question?

Types, ontologies, meta-models, reasoning?

Even if a language does not have types, we reason about our programs with types, and type errors have real consequences. We say to ourselves things like "this variable is supposed to refer to an integer." Types answer "What does this represent?", a question asked constantly while developing and maintaining a program, so a type system is useful because it provides a framework for answering it. Types also answer "What can I do with this thing?": with types, the computer can present the methods understood by an object, and those methods are what one can do with the object; indeed, those methods are all one can do with the object.
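In a language that keeps type information around at runtime, the question "What can I do with this thing?" can be asked directly. A minimal Python illustration (the variable names are chosen for this example):

```python
# A type enumerates what one can do with a value: in Python, the methods
# a value understands are discoverable from its type at runtime.
text = "hello"
methods = [name for name in dir(str) if not name.startswith("_")]

# "upper" is one of the things one can do with a string ...
assert "upper" in methods
# ... and asking the value to do it works as the type promises.
assert text.upper() == "HELLO"
```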

The reason a type system is so useful is that it helps computers understand our programs. To understand something is to have a useful model of it.

Even in assembly language, we reason with types. For example, to the computer, there is no difference between pointers and numbers: they are all bits. To humans, however, it does not make sense to add two pointers as if they were numbers. Humans interpret bits. If we tell a computer to add the contents of two registers, it will do so unquestioningly; it does not care about meaning. Computers do what we say, not what we mean. The program is the law, and computers follow the letter of the law, not the spirit. If computers could reason morally, they might refuse some of the things we ask them to do, such as being used in weapons that kill more efficiently, or being used to spy on billions of people.

How do we answer "What is this thing?" Ontology studies that. Genus–differentia definition: "An X is a Y that is/has Z." (Logically, \( X(t) \leftrightarrow Y(t) \wedge Z(t) \).) Ostensive definition (that is, definition by examples): "A T is something like X, Y, Z." (But how do we teach computers to generalize from examples? Because we don't know how, we don't have ostensive definitions in our type systems.) Thus a type system can be thought of as an ontology with a closed-world assumption (but doesn't an ontology require an open-world assumption?). A type system can also be thought of as a meta-model: the program is a model of reality, and the type system is a model of the program (a meta-model of reality).

Usually, we do not expect the type of a variable to ever change at runtime.

What has type: variables or values? Both. The type of a variable limits the type of values that the variable can represent.

Sometimes we care about the shape of a function more than about its name, perhaps because only a few functions with that shape make sense. For example, if I didn't know that "split" was named "split", I would probably look for a function with type String -> String -> List String. Haskell has Hoogle and Hayoo for such type-directed search. In this way, programming becomes arranging functions so that their types fit, like solving a jigsaw puzzle.

2 Overloading

Overloading is making the meaning of a variable depend on its name and type. Normally, in Scheme, the meaning of a variable depends only on its name.

3 Lisp/Scheme/Racket

When in doubt, use equal?. The worst thing it can do is slow down your program, but it won't be wrong.

What sucks about Racket?

  • Scribble is slow.
  • The racket/gui module refuses to be loaded twice, so you can't (require (for-syntax racket/gui))?

I think it is evil to make modules have side effects when they are instantiated; what are the alternatives?

Is define a side effect? Should there be no global variables? Should we pass the world as a parameter to main, as in Newspeak, Nix, Dhall, and Morte?

There are no global variables in lambda calculus? But what are the advantages of this purity? Supercompilation? First-class modules?

4 Nanopass

The simplest way to make a compiler in the 21st century?

But I think one should make interpreters and partial evaluators, and derive compilers, instead of making compilers manually?

5 Macros

What is a macro? A macro can be thought of as a syntax transformation: a syntax-object endofunction.

A natural way of expressing macros is pattern-matching rewrite rules.

The implementation of Typed Racket is discussed in [3]. It also shows three communication patterns for macros: static bindings, syntax properties (Typed Racket's type-label syntax property), and persistent effects (Typed Racket's declare-type! macro).

A newbie often wonders: why is this so complex? Because it has evolved to handle many problems that a newbie would not think of. Why is the Linux kernel complex? Because it has to handle running out of memory, running out of disk space, and other corner cases; they have happened, and they will happen again, and the kernel is not supposed to just crash or hang when it encounters those reasonably common problems. Adding hygiene complicates Scheme's macro system, but hygiene is often necessary for us to be able to reason about macros.

Hygiene means static/lexical/eager/compile-time binding: the meaning of a symbol is determined by its definition site. Unhygienic means dynamic/lazy/run-time binding: the meaning of a symbol is determined by its use site.
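The definition-site/use-site contrast can be seen in any lexically scoped language. A small Python sketch of definition-site binding (the function names are hypothetical):

```python
# Lexical (hygiene-style) binding: a name means what it meant where the
# code was *written*, not where it is *used*.
x = "outer"

def make_reader():
    x = "definition site"
    return lambda: x   # captures the x visible where the lambda is written

x = "use site"
# The later rebinding of the global x at the use site is irrelevant.
assert make_reader()() == "definition site"
```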

There are many ways to define things, but not all are simple for machines. https://en.wikipedia.org/wiki/Definition

It is impractical for a software system to be stateless: a stateless software system cannot remember anything. What is practical is to minimize or centralize the stateful parts so that state does not infect the stateless parts.

6 On programming languages

6.1 What do we mean by extensible languages?

What do we mean by an "extensible" language? Turing complete? New syntactic constructs with arbitrary custom semantics? English is extensible: one can define new forms and words. Lambda calculus is extensible: one can add forms and reduction rules. A CPU's instruction set is extensible: one can add new instructions. If one tries hard enough, even C is extensible, in the sense that one can modify the compiler's source code to add new forms. The semantics of an extension must be expressed in terms of the base semantics. Interoperation between languages L and M means defining a common language C that can express both L and M. It is easier to write Lisp code that generates Lisp code than to write C code that generates C code, because in Lisp the AST pretty much consists of cons+symbol+string+number, and there are "read" and "write".

The description-simplicity of syntax aids metaprogramming. The metaprogramming complexity of a grammar is the length of the shortest Haskell model of its AST specification.

Lisp syntax is easier to describe than C syntax (although not necessarily easier to read/write/use). Easily describable syntax simplifies metaprogramming because there are fewer cases to handle.

Thus, all languages are in principle extensible, but what about in practice? The question is thus: How hard is it to extend a particular language?

6.2 All programs are ultimately imperative

Every fancy language construct must ultimately be translated into a sequence of CPU instructions, because the CPU only understands those instructions.

6.3 Static vs dynamic: bad terminology?

We should ask these questions instead:

  • Is type information available at runtime?
  • How far can types be manipulated at compile-time?
  • Is there a distinction between compile-time and run-time?

7 Programming language research

7.1 Dream?

We dream of creating the best programming system. We think we should program directly in the language of the mind. We think programming languages need to interoperate. We dream of writing translating meta-programs that free programmers from design mistakes such as Java's. We are deep-diving into declarative programming. We want a programming language with minimal accidental complexity.

7.2 Where should we publish our programming language research findings?

Should we publish at ICFP, PLDI, POPL, SPLASH/OOPSLA, or Salon des Refusés?

POPL has the "Off the Beaten Track" workshop.

We want maximum open access.

7.3 Similar researchers

University of Kent, faculty of computer science, programming languages and systems

7.4 What is the state of the art of declarative programming in 2019?

7.4.1 What is declarative programming?

What do we declare?

7.4.2 What is the trend? Where is the research going?

We are still trying to unify functional programming and logic programming.

What is the state of the art? Unifying logic programming and functional programming?

ALF (Algebraic Logic Functional), LambdaProlog, etc.?

Some formalisms for declarative programming:

  • first-order logic
  • calculus of constructions
  • lambda calculus
  • set theory
  • System F
  • relational algebra, relational calculus

Declaratively and essentially.

How should we make computers do what we want? Does programming a computer require a language?

Wolfram Language meta-algorithms? The Wolfram Language is proprietary.

7.4.3 Two camps of declarative programming languages

Is it easier to add nondeterminism to lambda calculus than to add lambda expressions to Horn clauses?

There are at least two camps of declarative programming languages. First: start from Prolog and move toward Haskell. Second: start from Haskell and move toward Prolog. An example of the second camp is Curry.

Where is ALF? https://www.informatik.uni-kiel.de/~mh/systems/ALF.html

Seres 2001 [12]

Hoare & He 1998 [6]

7.4.4 The problem with functional-logic programming: when should functional terms be reduced?

The problem: sometimes we want to reduce an argument of a term. However, if we could do that, we would soon run into the opposite problem: sometimes we do not want to reduce an argument of a term. These problems await those who want functional logic programming.

Mercury, ALF, Curry, Lambda-Prolog.

Prolog sucks for list transformation (such as deriving a list by applying a function to each element). Auxiliary predicates force us to name something that should not have to be named.

Prolog does not reduce any function symbols. Prolog's interpretation of function symbols is the identity function.

In first-order logic, we may interpret a function symbol \(f\) of arity \(n\) as a function \([\![f]\!] : D^n \to D\), where \(D\) is the domain of discourse.
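The interpretation of function symbols can be made concrete with a small evaluator. A Python sketch (the term encoding and the interpretation mapping are assumptions of this note): under Prolog's view the "interpretation" of a symbol is the term constructor itself; here we interpret symbols as actual functions on integers.

```python
# A term is either a domain element or a tuple (symbol, subterm, ...).
# An interpretation maps each function symbol to a function on the domain D.
def interpret(term, interpretation):
    if isinstance(term, tuple):
        symbol, *args = term
        values = [interpret(a, interpretation) for a in args]
        return interpretation[symbol](*values)
    return term   # a domain element denotes itself

# f interpreted as addition, g as doubling
I = {"f": lambda x, y: x + y, "g": lambda x: 2 * x}
assert interpret(("f", ("g", 3), 4), I) == 10   # f(g(3), 4) = 6 + 4
```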

Milicevic 2015 [10] builds on Ruby.

Lorenz 2017 [9]: thinking of an application as a language? Mind-blowing?

7.4.5 Unifying functional and logic programming as a rewriting system

Both SLD-resolution and normal-order beta-reduction can be seen as term-rewriting systems.

All we have to do is add non-determinism to Pure, and we have a Prolog-Haskell hybrid.

A Horn clause translates to several rewrite rules: a clause with disjunctive-normal-form body, A :- B1 ; ... ; Bn, becomes the rewrite rules \(A \to B_1\), \(A \to B_2\), ..., \(A \to B_n\).

A function definition translates to one rewrite rule. A = B is the rewrite rule \(A \to B\).

prolog {
    a(X) :- b(X), c(X).
    parent(dad,kid).
    parent(daddy,kiddo).
}

haskell {
    what n = \exists x y : parent x y && length x + length y == n
}
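The clause-to-rules translation can be sketched operationally. Below is a minimal Python model, an assumption of this note rather than a real implementation: it is propositional only (no unification), each rule head maps to alternative bodies (disjunction), a body is a list of goals (conjunction), and a query succeeds when some alternative succeeds.

```python
# Each head rewrites to one of several alternative bodies; facts have
# the empty body. This mirrors "A :- B1 ; ... ; Bn" as n rewrite rules.
rules = {
    "a": [["b", "c"]],        # a :- b, c.
    "b": [[]],                # b.
    "c": [[], ["missing"]],   # c.  and  c :- missing.
}

def prove(goal):
    """A goal succeeds if some alternative body has all its goals succeed."""
    return any(all(prove(g) for g in body)
               for body in rules.get(goal, []))

assert prove("a")            # via b and c
assert not prove("missing")  # no rule for "missing"
```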
  1. Lambda-calculus with choice/disjunction

    Nondeterministic programming: Prolog, Amb, Alisp. Other nondeterministic extensions to lambda calculus are already in the literature: 1998 [8], 2009 [5].

    "Lambda logic is the union of first order logic and lambda calculus."[2]

    Nondeterminism-by-backtracking may be implemented in Scheme with continuations.
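In a language without first-class continuations, the same backtracking behavior can be sketched with generators. A Python illustration (the function names `amb` and `pairs_summing_to` are hypothetical):

```python
# Backtracking nondeterminism without continuations: a choice point
# yields each alternative, and iteration does the backtracking.
def amb(*choices):
    yield from choices

def pairs_summing_to(n):
    for x in amb(1, 2, 3, 4):
        for y in amb(1, 2, 3, 4):
            if x + y == n:
                yield (x, y)

# All solutions, found by exhausting the choice points in order.
assert list(pairs_summing_to(5)) == [(1, 4), (2, 3), (3, 2), (4, 1)]
```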

    How do we combine lambda-calculus beta-reduction and Prolog unification?

    [7]

    Here we describe the \(\lambda_|\)-calculus (lambda calculus with choice; the "\(|\)" is a vertical bar). Other names:

    • lambda-calculus with nondeterminism
    • lambda-calculus with disjunction

    We add expression syntax for disjunction. Thus the syntax becomes:

    1. Every name is an expression.
    2. If \(x\) is a name and \(y\) is an expression, then \(\lambda x.y\) is an expression (abstraction).
    3. If \(x\) is an expression and \(y\) is an expression, then \(xy\) is an expression (application).
    4. If \(x\) is an expression and \(y\) is an expression, then \(x | y\) is an expression (choice). For consistency with Prolog's ; operator (disjunction), the choice operator \(|\) associates to the right: \(x | y | z = x | (y | z)\).

    We add two rules to beta-reduction, so the reduction now consists of three rules. The last two rules make the reduction nondeterministic: an expression may reduce in more than one way.

    \[\begin{align*} (\lambda x. y) z &\to y[x := z] \\ x | y &\to x \\ x | y &\to y \end{align*} \]

    We define two constant expressions: \(false\) and \(true\).

    Now we define a ternary logic mapping from expression to \(\{false,unknown,true\}\).

    \[\begin{align*} T(false) &= false \\ T(true) &= true \\ T(\lambda x. y) &= true \\ T((\lambda x. y) z) &= T(y[x := z]) \\ T(A = B) &= unknown \end{align*} \]

    A Prolog-like operational semantics:

    \[\begin{align*} x | y &\to x \text{ if \(x\) succeeds} \\ x | y &\to y \text{ if \(x\) fails} \end{align*} \]

    We define one constant \(false\).

    We define that an expression fails iff it reduces to the constant \(false\). We define that an expression succeeds if it does not fail.

    \[\begin{align*} fail | x &\to x \\ x | fail &\to x \end{align*} \]

7.4.6 What is the difference between program synthesis and program derivation?

They look the same to me.

  1. <2019-02-27> Idea: automatic derivation of program quotients

    That computers have limited memory is accidental complexity. We should program with natural numbers and infinite sets. The translator should automatically compute the quotient of the program.

    Built-in types: Natural numbers.

    Given a function f : nat -> nat, the translator should be able to derive the quotient f' : uint32 -> uint32 where \( f' n = (f (n \bmod 2^{32})) \bmod 2^{32} \).
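A hand-written instance of the quotient described above, as a Python sketch (the concrete `f` is an arbitrary example; this shows what such a translator would be expected to emit, not an implementation of the translator):

```python
# f is the ideal program on unbounded naturals; f32 is its derived
# quotient on 32-bit machine integers: f32 n = (f (n mod 2^32)) mod 2^32.
M = 2 ** 32

def f(n):
    return 3 * n + 7

def f32(n):
    return f(n % M) % M

# The quotient agrees with the ideal function wherever the result fits,
assert f32(10) == f(10)
# and wraps around instead of overflowing at the top of the range.
assert f32(2 ** 32 - 1) == (3 * (2 ** 32 - 1) + 7) % M
```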

7.4.7 Problem: Java and SQL are oblivious of each other

7.4.8 How do we program as close as possible to the internal language of thought?

I think logic is the internal language of thought.

I think a man who does not know any language will still know causality; he is merely unable to describe it.

A man who does not know any language knows that touching fire causes pain, and that eating causes satiety, but does he know the concept of causality itself?

7.4.9 How can we reach the ideal system?

If we keep improving existing systems, will we get there? Or does it require a fundamental change?

7.5 Old goal?

The goal is to make the programming language.

The ultimate best programming language?

7.5.1 What is programming language research for?

A goal of programming language research is to make a better programming language (if not the best). Do more with less.

7.5.2 Jargon is necessary for concision (high information transfer rate)

In different fields of study, we invent jargon for concision: to speed up information transmission, to convey more meaning in less time with fewer words.

7.5.3 But we can't just assume that users are going to wait forever, that memory is unlimited, that network is always up and fast, etc.?

  1. Communities

  2. People

    • <2018-10-04> David Van Horn "I work toward making the construction of reusable, trusted software components possible and effective"
  3. Finding giants whose shoulders we want to stand on, for building the programming language

    1. People who share the vision for the programming language

    2. People who share some of the vision but don't go far enough

8 Computer programming

8.1 What is computer programming?

We say "we program a computer" to mean that we make the computer (1) do what we want it to do and (2) do only that. Security problems arise from neglecting the second part.

A programmer is someone who programs a computer. The essence of a programmer's job is to formalize the ontology implicit in the user's requirements.

Why do we program computers? The same reason we make machines. Because we are lazy tool-using curious animals.

8.2 How should we program computers?

8.2.1 The programmer should be able to only care about what the end-user cares about

The user has a goal. The user treats the system as a black box (the user does not know how the system works; the user only cares about what the system does). The user is concerned about these and only these:

  • What do I have to input into the system?
  • How do I understand the system's output?

Finite integers are accidental complexity. We should program with mathematical numbers, and the computer should automatically take the quotient of the program (translating the ideal program into a finite/limited/bounded implementation that is correct for the first \(2^{64}\) inputs).

We should do as little as possible, and the computer should do as much as possible.

We should be able to tell the computer as little as possible. We should be able to tell the computer only the essential complexity. Implementation is accidental complexity.

8.2.2 When will we have telepathy?

Telepathy: programming directly with our minds. Computers should be prosthetic minds. We can lift more with a prosthetic arm than with a natural arm; we could think more with a prosthetic mind than with a natural mind.

Such technology does not exist in 2019, so we are stuck with the second most ideal solution: declarative programming languages.

8.2.3 A language that is expressive enough to describe itself?

We want a language that can be its own meta-language.

Logic can describe itself?

8.2.4 How should programmers specialize?

Programmers should not be divided into backend and frontend, for two reasons:

  1. Doing so causes much code duplication.
  2. Backend and frontend are accidental complexity, not essential complexity. Everyone who specializes in handling accidental complexity is going to be obsolete.

Instead, programmers should be divided into language designer and language user.

The language user should program in a use case specification language.

An example of essential complexity:

discount_percent age =
  if age < 18 or age > 55
    then 10
    else 0

8.2.5 Two camps of programming language design

There are two camps in programming language design: bottom-up and top-down.

Bottom-up language design:

  • Begin with physics.
  • Claude Shannon abstracted the underlying physics into bits and boolean algebra.
  • Machine code.
  • Two's-complement signed integers.
  • Variables abstract away finite registers.
  • Garbage collection abstracts away finite memory.
  • Go up until the language is convenient enough to use.

Top-down language design:

  • Begin from logic and mathematics, the internal language of thought.
  • Go down until the language is convenient enough to realize.

8.3 Must a programming system be textual or arboreal?

Why confine ourselves to texts and trees? Why not graphs?

There are visual programming languages.

Problem: visual programming systems encumber blind people.

9 Philosophy of computer programming?

9.1 Program and execution

In imperative programming, a program is a sequence of commands, and execution is what we expect.

In logic programming, a program is a logical formula, and executing that program is proving that formula.

In functional programming, a program is an expression, and executing that program is reducing that expression until a normal form is reached.

The common view that includes both logic and functional programming is term-rewriting.

9.2 What is computation, computer, programming, language?

Programming is making a computer do what we want it to do.

Languages are divided into several categories:

  • hardware programming, system programming
  • enterprise application programming
  • scripting? network administration
  • markup: XML, Markdown
  • data: JSON, YAML

The question: What is the least-effort way to make a computer do X?

9.3 Who uses programming languages?

A programming language serves as a means of communication in three cases:

  • human-to-human
  • human-to-machine
  • machine-to-machine

We advance technologically when we raise the bar for machines (we expect more from machines), not when we expect more from humans. (?)

9.4 The fundamental reason why there is no perfect language: the pigeonhole principle, encoding length trade-off

If we choose to encode something shorter in a language, then other things must be encoded longer. If we make it easier to do something, something else has to become harder.

A language is essentially a Huffman code, a compression scheme in which often-used concepts receive shorter encodings than rarely-used concepts.

A language encodes concepts into symbols.

If there are only 26 letters and 1,000 concepts, then, by the pigeonhole principle, it is simply impossible to encode all those concepts using only 1-letter symbols.

The reason why there is no perfect language is simple: the pigeonhole principle precludes encoding all possible concepts into short words.
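The counting behind the pigeonhole argument can be made explicit. A short Python check (the helper name `words_up_to` is hypothetical):

```python
# With a 26-letter alphabet there are 26 one-letter words and
# 26 + 26**2 = 702 words of length <= 2, so 1,000 concepts cannot all
# receive encodings shorter than 3 letters.
alphabet = 26
def words_up_to(k):
    return sum(alphabet ** i for i in range(1, k + 1))

assert words_up_to(1) == 26       # far fewer than 1,000 concepts
assert words_up_to(2) == 702      # still fewer than 1,000
assert words_up_to(3) == 18278    # length 3 finally suffices
```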

9.5 Ontology: what exists in a programming language?

  • numbers
  • byte strings
  • character strings
  • maps
  • functions
  • relations

9.6 Philosophical principles

9.6.1 Philosophical principle, meta-thought, how to find essence

Everything (every language element) should have as few properties as possible. If something has as few properties as possible, then what is left is its essence. Example: In C, the name of a function is a property of that function. In JavaScript, the name of a function is not a property of that function. var add = function(x,y) { return x+y; }

Is the name "x" a property of the variable x in the lambda expression \ x -> x + 1? We can have nameless parameters with de Bruijn indices.
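The point that parameter names are inessential can be demonstrated by erasing them. A Python sketch of converting named lambda terms to de Bruijn indices (the tuple encoding of terms is an assumption of this note):

```python
# A de Bruijn index counts how many binders separate a variable's use
# from its definition. Terms: ("lam", name, body), ("var", name),
# ("app", fun, arg).
def to_de_bruijn(term, env=()):
    kind = term[0]
    if kind == "var":
        return ("var", env.index(term[1]))   # distance to the binder
    if kind == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + env))
    return ("app", to_de_bruijn(term[1], env), to_de_bruijn(term[2], env))

# \x -> \y -> x becomes lam (lam (var 1)): the names vanish entirely.
k = ("lam", "x", ("lam", "y", ("var", "x")))
assert to_de_bruijn(k) == ("lam", ("lam", ("var", 1)))
```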

9.6.2 The essence of description and computation

The essence of description is the composition of primitives.

Computation is the execution of a computing description.

All computations have repetitions. Turing machine computation is repeated primitive computation. Lambda-calculus beta-normalization is repeated beta-reduction.

Computation is the normalization of an expression to a value.

A logical axiom corresponds to a computing primitive.

A proof corresponds to a computing description (program).

9.6.3 Programming

  1. What is a program?

    • A program is represented by
      • a sequence of instructions (procedural programming)
      • a lambda expression (functional programming)
  2. Metaprogramming

  3. Comparing OCaml and SML

  4. Interoperation

  5. Interesting languages?

    1. Rebol? Rebol metaprogramming?

      https://en.m.wikipedia.org/wiki/REBOL

    2. Carp lisp dialect?

  6. Scheme vs Lisp:

    • A Lisp implementation doesn't have to have proper tail calls.
    • A Scheme implementation must have proper tail calls.
  7. Type systems

    • Types help computers help us.
      • Types prevent some mistakes.
      • Types are part of documentation.
        • Types help us write an IDE.

9.7 A lambda abstraction is not a function

A mathematical function is a triple \((A,B,F)\) where \(A\) is the domain, \(B\) is the codomain, and \(F \subseteq A \times B\) is the mapping.

A lambda abstraction \( \lambda x. y \) is not the same as a function \( x \mapsto y \).

The expression 1 + 2 is not the same as the number 3. That expression evaluates to that number.

9.8 The philosophical foundation of object-orientation?

Identity?

Properties?

What does "X is a Y" mean?

What does "Every X is a Y" mean?

First-order logical meaning of object-oriented definitions?

Car my_car = new Car(); // my_car is a Car.
class What extends Car {} // Every What is a Car.
\[\begin{align*} Car(my\_car) \\ \forall x : What(x) \to Car(x) \end{align*} \]

9.9 Programming is computable mathematics?

https://en.wikipedia.org/wiki/Semantic_domain

"A programmable analog neural computer and simulator" https://pdfs.semanticscholar.org/5f6b/579b1f4166ea536f5ed188e9976390729303.pdf

To rewrite a part of a program without introducing errors, we need to preserve the meaning of the program, and thus we need a theory of semantics.

See also Rapaport 2015 section 7.2 ("What is computation?") from page 233.

See Rapaport 2015 page 267 about other computation models.

What can we get from this? https://plato.stanford.edu/entries/computer-science/

<2014-05-07> Reddit user TezlaKoil shows how to derive Russell's paradox in untyped lambda-calculus, and shows the relationship between Russell's paradox and Curry's Y combinator.

9.10 Programming language ontology?

2005 "Towards a programming language ontology" https://pdfs.semanticscholar.org/028f/3f5df68e49b2d42663e935f1615ba46f41a0.pdf

  • paraphrase of Frege & Dummett: "the ontological implications of a language are to be identified with the entities required to provide its constructs with a semantics"
  • a possible ontology for computer science: something exists if and only if it is computable
  • "properties of programs such as efficiency, flexibility and elegance are, from the [Computer Science] view of things, absolutely essential. But these properties are obliterated by the [Denotational Semantics]."
    • Denotational semantics can't reason about a program's resource usage.
  • "CS build programs and systems from data types. So instead of sets we propose to turn matters around and use data types [instead of sets] as our basic ontology."

  • Calculus of constructions (inductive data types) may provide an ontology.

"The 'existence' referred to need not be 'real', but exist only in a universe of discourse." https://en.wikipedia.org/wiki/Ontological_commitment

9.11 Should we strive to create a programming language with ontological parsimony (minimal ontological commitment)?

How do we design a programming language whose ontology coincides with how we think about the real world?

How do we design a programming language whose ontology coincides with the ontology of mathematics?

How can we be a mathematician without being an implicit Platonist? Anyone who believes that perfect circles exist is a Platonist. Anyone who believes that the square root of two exists is a Platonist.

Mathematics is language.

https://www.iep.utm.edu/mathplat/

10 Philosophy of length, dimension, measure, size, cardinality

Similar questions.

10.1 What is length?

"Length" means "the distance measured along the longest dimension of an object", "property of being long or extended in one direction; distance along a line", "the most extended dimension of an object", "the longest extent of anything as measured from end to end". To use the word "length" for anything else (such as ADTs, which are trees), we must first define what the "longest dimension" of a tree is.

We can think of "length" as a nominalization of "long".

In a programming language, the word "length" implies that things have a geometry.

In order to define length, we must define a measure for every value in the language.

\(m(x)\) should be the number of bits required to represent \(x\)?

\(m(n) = \log_2 n\) or \(m(n) = n\)?

\(m([a,b]) = \mathit{something} + m(a) + m(b)\)?

When we say "the length of this text is 1,000 characters", we tacitly assume that a character is a unit of length.

"Length" does not make sense for a heterogeneous sequence. "The length of [abc, 1, [1,2,3]] is 3 elements." Should an element, which has arbitrary size, be a unit of length?

"Length" implies unit of length.

We say "this movie is 90 minutes long".

When we say "the length of X is Y", we implicitly see X as a line.

10.2What is dimension?

what

10.3 What is size?

what

10.4 Why do we measure length?

Because we want to place an object inside a container, and we want to know whether the container is large enough. Because we want to tie an object, and we want to know whether we have enough rope. And so on.

10.5 How do we measure length?

The length of a line is how long that line is.

The length of a box (a rectangular prism) is the length of its longest axis. A rectangular prism has three axes. Why is the length of a prism not the length of its diagonal?

Unit of length.

We can take "the length of a sphere" to mean "the length of the longest line that we can put in a sphere", and therefore we are justified in saying "The length of a sphere is its diameter". What are the axes of a sphere?

If we think of a triple as something like a rope, that is, if we think of each element as a point occupying no space, then:

length (_,_,_) = 3

If we think of a triple as something like a cube, that is, if we think of each element as a dimension (as a side of a cube), then:

length (a,b,c) = maximum [length a, length b, length c]
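The two readings above can be put side by side. A Python sketch (the function names are hypothetical; strings stand in for elements that occupy space):

```python
# Rope-like reading: each element is a point, so length is element count.
def length_as_rope(triple):
    return len(triple)

# Cube-like reading: each element is a dimension, so length is the
# longest side.
def length_as_cube(triple):
    return max(len(x) for x in triple)

t = ("ab", "cdef", "g")
assert length_as_rope(t) == 3   # three point-like elements
assert length_as_cube(t) == 4   # the longest element has 4 characters
```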

10.6 Why do we conflate length and cardinality?

We seem to think that the elements of a sequence occupy no space, like point masses in Newtonian mechanics. Why should sequence length be element count?

10.7 <2019-02-06> Response

My philosophical analysis leads me to the conclusion that the problem is our conflating "length" and "cardinality".

-- Let `huge` be a text taking 1 gigabyte of memory.
length      [huge] = 1  -- does not make sense
cardinality [huge] = 1  -- makes sense

"Length" implies a longest dimension and a unit of length.

length "hello" = 5 makes sense because of two things: (1) it makes sense for a character list to be visualized as a one-dimensional line and (2) it makes sense for character to be a unit of length (if we assume that there are finitely many characters and each character is one unit of length).

length [[1],[2,3]] = 2 does not make sense. It stretches English too much for a list to be a unit of length, because each list may have different size, unless we declare by fiat that the size of each list is one, but this tyranny only begets more philosophical problems.

length (Identity "hello") = 1 does not make sense for the same reason length [[1],[2,3]] = 2 does not make sense.

However, if we generously assume that Haskell "length" means "cardinality", all the above make sense, but that generous assumption still does not justify length (undefined,undefined) = 1 which violates both the meaning of "length" and the meaning of "cardinality".

10.8 <2019-03-02> Length

(I'm assuming that we want to distinguish between "cardinality" and "length" here. I hope I don't misunderstand your question.)

What about the equivalent let a = [1]; b = [2,3] in length [a,b]? Why doesn't the fact that the expression equals 2 make sense?

For example, let A = {1,2,3} and B = {4,5,6}. The set {A,B} has fewer elements than the set A. But the set {A,B} occupies more space than the set A.

The equality length [a,b] = 2 does not make sense, because length is about the amount of space occupied by a thing, not about the number of things. It does not make sense that b = [2,3] occupies less space when it is in a list (such as [b]) than when it is outside a list (such as b by itself).

It does not make sense that a box is shorter than its contents. But it does make sense that a box appears fewer than its contents.

Cardinality is about counting the number of things in a box (provided that we agree on what a "thing" is). Length is about measuring the amount of space occupied by the things in the box.

To what extent are we being bewitched by syntax and distracted from the semantics?

I think this is not a problem of syntax at all: We are dealing with the semantics directly, that is, how we are supposed to interpret the word "length".

An example of how we may define "length" in a philosophically sound manner (here called "amount of space" instead of "length"):

In logic, the amount-of-space function can be thought of as a semantic function whose codomain is the set of natural numbers:

space : D -> N, where D is the domain of discourse and N is the set of natural numbers
space Nil = 1
space (Cons x y) = 1 + space x + space y
space Z = 1
space (S x) = 1 + space x
...

In Haskell, the best approximation is something like this:

-- Assuming Peano naturals and a hand-rolled list type:
data Nat = Z | S Nat
data List a = Nil | Cons a (List a)

class Space a where
    space :: a -> Int   -- Int approximates Nat here

instance Space a => Space (List a) where
    space Nil        = 1
    space (Cons x y) = 1 + space x + space y

instance Space Nat where
    space Z     = 1
    space (S x) = 1 + space x

...

I think "amount of space" is not even expressible in Haskell because its type has to be D -> Nat (equivalent to forall a. a -> Nat) where D is the set of all Haskell values.
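In a dynamically typed language, by contrast, a single function over all values is writable, because every value carries its own shape at runtime. A Python sketch (the unit-cost convention for atoms is an assumed convention, matching the definitions above):

```python
# "Amount of space": every constructor costs one unit, plus the space of
# its components; atoms cost one unit.
def space(value):
    if isinstance(value, list):
        return 1 + sum(space(x) for x in value)
    return 1

assert space([]) == 1                      # Nil alone
assert space([[1], [2, 3]]) == 6           # 1 + (1+1) + (1+1+1)
```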

10.9History

<2019-02-06> I was triggered by Abdullah's pointing out to me that in Haskell, length "Hello" = 5, length (Identity "Hello") = 1, and length (undefined,undefined) = 1. Each of those viewpoints has its philosophical justifications. The question is: Which one has the strongest justification? What is "length"?

11Language mixing issues

11.1Why can't we mix programming languages?

Multilingual people routinely mix languages. For example, compare the English "Why is your shirt very fancy today?" and the English-in-Indonesian "Hari ini baju kamu kok very fancy?"

Natural languages are so mixable because their syntaxes are similar and they share the same semantics.

Why can't we mix programming languages as easily as we mix natural languages? Why don't programming languages compose?

We can mix programming languages in principle. The problem is that existing tools don't support such mixing, and it is very hard to make tools that do.

We need something like a lathe, a machine that we can use to make more sophisticated machines.

11.2How should languages interoperate?

Can we do it semantically?

Can we do better than FFI (foreign function interface)?

Languages are usually small. Standard libraries are huge.

12Enhancing existing languages

12.1Prolog, second-order logic enhancements

12.1.1What language is like Prolog but has anonymous predicates and anonymous modules?

We are using Prolog, but we are unsatisfied.

12.1.2Reducing functional form, and alternative proof strategies

Prolog does not reduce any functional term.

Update <2019-03-04>: We can achieve that with goal_expansion/2. We thought Prolog was too weak. goal_expansion/2 is more powerful than a global interpretation of function symbols. goal_expansion/2 enables contextual interpretation of functional terms: the interpretation may depend on the outer logical term that contains the functional term. For example, the same f in pred1(f(1)) and pred2(f(2)) may be interpreted differently.

Our biggest worry: What if we made a mistake in our goal_expansion/2? How are we going to know where the mistake is?

We can specify the interpretation of functional terms. The following reduce/2 is a relation from a functional term to its Prolog meaning (a Prolog value, an element of Prolog's domain of discourse).

reduce(f(A),B) :- B is A+1.

One possible solution: reduce/2 plus a meta-predicate for declaring how each predicate parameter should be reduced. A parameter mode then has both a direction and a reduction mode.

:- multifile(reduce/2).
:- dynamic(reduce/2).
reduce(f(A), B) :- integer(A), B is A + 1.

normalize(A,B) :- reduce(A,R), !, normalize(R,B).
normalize(A,B) :- A = B.
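The same reduce-then-normalize structure can be sketched outside Prolog. In the Python sketch below, the tuple encoding of f(A) and the function names are illustrative only:

```python
# reduce(f(A), B) :- integer(A), B is A + 1.
def reduce_term(t):
    if isinstance(t, tuple) and t[0] == 'f' and isinstance(t[1], int):
        return t[1] + 1
    return None                   # no rule applies

# normalize/2: keep reducing until no rule applies
# (the cut clause, then the identity clause).
def normalize(t):
    r = reduce_term(t)
    return t if r is None else normalize(r)

assert normalize(('f', 1)) == 2   # f(1) reduces to 2
assert normalize(7) == 7          # already in normal form
```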

:- reduction(predicate(none,normalize,none)).
predicate(A,B,C) :- ...

Even more general than that, another possible solution: Custom proof strategy with prove/1 meta-predicate/hook. Isn't this just goal_expansion/2?

prove(predicate(A,B,C)) :- normalize(B,B0), predicate(A,B0,C).

12.1.3Enhancing (name-separating) Prolog with anonymous predicates and modules

let([
    pred = (X => Y => (male app X) and (child app X app Y))
], pred app bar)

12.1.4How declarative is Prolog?

Apt 1993 [1]

12.2ML

Rossberg 2018 [11]: 1ML enhances ML with first-class modules.

13Complexity

13.1How should program complexity be measured?

"A Denotational Approach to Measuring Complexity in Functional Programs" http://www.cs.cmu.edu/afs/cs/user/brookes/www/papers/CMU-CS-03-150.pdf

14Enterprise application programming

14.1What is an enterprise application?

An enterprise application uses computers to enhance business operations. The use of computers itself is accidental complexity.

Whereas factory machines enhance physical production, computers enhance mental production.

If we treat each of us as an enterprise, then enterprise application is scaled-up personal programming.

14.2Enterprise applications model extrinsic properties

Specification language Attempto Controlled English

Parsing with Prolog DCGs

https://philpapers.org/browse/ontology-of-mathematics

"Advances in a DSL for Application Integration"

https://www.itu.int/ITU-D/tech/OLD_TND_WEBSITE/network-infrastructure_OLD/Bangkok-02/5-5_Demo.pdf http://www.inquisition.ca/en/info/gepsypl/rules.htm

Enterprise applications model extrinsic properties. The name of a person is an extrinsic property.

Google search of software modeling language https://en.wikipedia.org/wiki/Modeling_language https://en.wikipedia.org/wiki/EXPRESS_(data_modeling_language) https://www.martinfowler.com/articles/languageWorkbench.html

14.3Enterprise programming language? Enterprise meta-language?

14.3.1What?

  • Describe data shapes.
  • Describe databases.
  • Describe how the user should interact with the application.
  • Describe CRUD: describe views that read from or write to databases.
  • Describe business logic.
  • Parallelize unrelated database queries.
  • Translate to Java and interoperate with Java.

We naturally think of mathematical objects as tuples/sequences or dictionaries. Each entry of the dictionary represents a property of the object.

Are these the criteria of a good meta-language?

  • algebraic data types and pattern-matching
  • set-theoretic type system
  • tree manipulation

Will we end up reinventing C# LINQ?

Enterprise programming language contains sublanguages that work together.

14.3.2Table expression language

The semantics of a table expression is a sequence of rows. The sequence may be infinite.

The operational semantics is to lazily evaluate the minimum number of rows.

A table has column names.

"empty" means the empty table.

"project(C,T)" means table T but with only the columns in C.

"all(T)" means all rows of table named T. Each row is a dictionary.

"filter(P,T)" means all rows of T that satisfy P.

"first(N,A)" means the first N rows of table A.

"product(A,B)" means Cartesian product.

"foldl(R_0,F,T)" is left-fold: R_k = F(Tk-1,Rk-1) where k ≥ 1 The result is R_n where n is the number of rows in T. T must be finite.

Should collapse as much as possible to SQL.

https://www.postgresql.org/docs/current/sql-select.html

plan : table_exp -> sql_select
% select(Columns,From,Where,Group,Having,Order,Limit,Offset)
% select_modifywhere(S1,W1,S2,W2)
% n means not present
plan(all(N), select(C,T?,n,n,n,n,n,n)) :- table_columns(N,C).
plan(project(C,T), select(C, T, ))

Prolog type system https://github.com/SWI-Prolog/roadmap/wiki/Prolog-type-system

any : any value
atom : any atom
literal(A) : atom or number or string that equals A
literal_in(Vals) : in list Vals; each Val is a literal
functor(Name,ArgTypes)
list(T) : list in which each element has type T
union(Types) : disjunction
intersection(Types)
predicate(ArgTypes)
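A toy checker for a few of these constructors, as a Python sketch (the tagged-tuple encoding of types and the name check are illustrative, not SWI-Prolog's design; Prolog atoms are approximated as strings):

```python
def check(value, ty):
    if ty == 'any':                       # any : any value
        return True
    if ty == 'atom':                      # atom : any atom (here: any string)
        return isinstance(value, str)
    tag = ty[0]
    if tag == 'literal':                  # literal(A): equals A
        return value == ty[1]
    if tag == 'literal_in':               # literal_in(Vals): member of Vals
        return value in ty[1]
    if tag == 'list':                     # list(T): every element has type T
        return isinstance(value, list) and all(check(x, ty[1]) for x in value)
    if tag == 'union':                    # union(Types): disjunction
        return any(check(value, t) for t in ty[1])
    if tag == 'intersection':             # intersection(Types): conjunction
        return all(check(value, t) for t in ty[1])
    raise ValueError(ty)

assert check([1, 'a'], ('list', 'any'))
assert check('x', ('union', ['atom', ('literal', 3)]))
assert not check(2, ('literal', 3))
```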

14.4Saving Java programmers?

We want something like this, to free the Java programmers: "Java program representation and manipulation in Prolog"20

14.5DSL in Java?

14.5.1Some options for modeling the AST

  1. Each class is an AST node type

    interface Exp {}
    final class Const implements Exp {
      final Object value;
      Const (Object value) { this.value = value; }
    }
    final class Add implements Exp {
      final Exp left, right;
      Add (Exp left, Exp right) { this.left = left; this.right = right; }
    }
    
    Object eval (Exp x) {
      if (x instanceof Const) {
        return ((Const)x).value;
      }
      if (x instanceof Add) {
        final Add y = (Add) x;
        return (int)eval(y.left) + (int)eval(y.right);
      }
      throw new IllegalArgumentException("" + x);
    }
    
  2. One class Ast_node

    final class Ast_node {
      static final int CONST = 0;
      static final int ADD = 1;
      // ...
      final int type;
      final Object[] arg;
      // ...
    }
    

    Then we need a way to pattern-match.

15GUI programming?

We dream of demystifying and simplifying GUI programming. What is the essence of GUI programming? Can we do better than FRP (functional reactive programming)? Can we do declarative GUI programming better than HTML+CSS? If UX is the next UI, then is there UX programming instead of UI programming? GUX instead of GUI?

How do we formalize user experience? How do we program user experience? Can a programming language facilitate good programmer experience and good end-user experience?

16Human and social aspects of computer programming

16.1Software user experience? Human-computer interface?

Jef Raskin http://wiki.c2.com/?TheHumaneInterface

16.2<2019-02-12> Do computers handicap non-verbal people?

Computer programming forces people to think in words. What about people who think in pictures? What about people who think in sounds?

UX for the blind21

16.3How do we use programming languages?

16.3.1Most people stick to languages they are familiar with

They prefer shitty-but-familiar to great-but-unfamiliar. They condemn themselves to apathy and mediocrity. They never wonder whether there is a better way.

Science (evolution and neuroscience) explains why people stick to shitty-but-predictable languages

The brain reward system rewards correct predictions. If person P finds language L predictable (according to P's background knowledge), then P will like L. Procedural languages are predictable. Thus people stick to them, no matter how shitty those languages are. People prefer predictable shitty things to unpredictable great things. People are risk-averse.

Curiosity of finding a better way to program is the exception; the norm is "we have always done it this way".

We can dumb down the language, or we can smart up the people, but people are naturally lazy, because laziness promotes survival.

"David Liddle's idea on application user interfaces give us a clue as to why lower-level languages draw more people in than higher-level ones (Liddle, 1989). He claims that the most important aspect of a good user interface is how well it leads the user to an accurate conceptual model of how the application works. If the user develops an accurate conceptual model, then the application works as expected. This leads the user to try more things, which also work as expected, leading to an even better understanding, thus drawing the user further and further into the tool." https://www.amzi.com/articles/prolog_under_the_hood.htm

16.4Increasing language adoption

16.4.1What

In order for a language to be adopted, people must perceive its risk as low.

The language must work with existing codebases.

The language designer must think from the language user's point of view. Let's say I have 100,000 lines of Java that I've been writing and testing for the past 5 years. Are you expecting me to throw away all of them?

Thus the language must work with C, C++, C#, Java, Go, JavaScript, Python, Ruby, and everything else. This should be possible because the essence of all programming languages is the same: every programming language is a formal system. It should be possible to translate a program P1 in language L1 to program P2 in language L2 with the same semantics.

Improve/enhance, not supersede.

Mixing languages should be easy.

2013, article, "Empirical analysis of programming language adoption", pdf

The language must be suitable for systems programming.

  • Systems programming is hardware-aware programming. Application programming assumes an abstract machine, infinite memory, and all the conveniences provided by the operating system.
  • Why do we make this distinction?

The language must facilitate metaprogramming. Everything must be a first-class citizen. It has to have EVAL. The language must provide a way for interpreting/compiling/loading a program at runtime. The compiler becomes a part of every program.

What is the reason for the name "metacircular evaluator"? What is circular? What is metacircular?

To make syntax first-class, we need QUOTE and UNQUOTE (such as in Lisp/Scheme)?

To prevent syntax flamewar, we should define the canonical linearization of the abstract syntax tree. Go does this with go fmt. I think that is wise.

  • Basic assumptions
    • Computer (machine) is embodied formal system.
      • Assume no hardware fault.
    • Software is executable mathematics.

16.4.2Other people's opinions

  • 2012 article "Socio-PLT: Principles for Programming Language Adoption" pdf

17Memory management

Is there a lambda-calculus interpreter without garbage collection? Is that PreScheme22? PreScheme is limited Scheme with manual memory management.

Stack allocation is simple and nice.

Is dynamic memory allocation really necessary?

Suppose we are writing a program that asks the user for a line of text.

int read_line (char*, size_t);

What if the buffer is not big enough? Return ENOMEM or EAGAIN?

How do we make a console to-do list application without dynamic memory management?

How do we make an in-memory database without dynamic memory management?

That is only one function. What if there are several functions?

If every function is restartable/continuable, then it simplifies memory allocation?

How can we manage memory in programming?

  • manual and static, like in embedded programming
  • manual and dynamic, C malloc and free
  • Rust borrow
  • garbage collection
  • allocate and never free, useful in programs that run for a short time

What are the consequences of not wanting manual memory management?

  • garbage collection
  • Rust-style management?

If a process lives short enough, then it can allocate memory without freeing, and all the memory it uses can be freed when it ends.

We can write a program that sorts an array of 100 integers. But what if there are more than 100 integers?

How can we manage memory sanely?

  • one-more-indirection principle: pass an allocator to every class/subroutine that needs to allocate memory
  • restartable architecture; a subroutine never allocates memory; a subroutine throws an exception when the memory allocated by the parent is insufficient; the root method allocates memory, and re-calls everything in the call tree; use dependency injection
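The restartable-architecture bullet can be sketched as follows (all names are hypothetical): only the root allocates; a subroutine that finds the pre-allocated buffer too small throws, and the root retries the whole call tree with more memory:

```python
class OutOfSpace(Exception):
    pass

def fill(buf, data):
    # A subroutine: never allocates; fails if the caller's buffer is too small.
    if len(data) > len(buf):
        raise OutOfSpace
    buf[:len(data)] = data
    return len(data)

def run_root(data, initial=4):
    size = initial
    while True:
        buf = bytearray(size)        # only the root allocates
        try:
            n = fill(buf, data)
            return bytes(buf[:n])
        except OutOfSpace:
            size *= 2                # restart the call tree with more memory

assert run_root(b'hello world') == b'hello world'
```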

Why is there no memory management in lambda calculus? Is it possible to implement lambda calculus without garbage collection?

Simplifying C style:

  • Every function must free everything that it mallocs.
  • Everything else must be passed pre-allocated by a caller in the call stack.
  • If there is not enough memory, a function must return ENOMEM.
  • Reify the state into a struct.

Example of state reification:

// The function

int indexof (char* s, char c) {
    for (int i = 0; s[i] != 0; ++i) {
        if (s[i] == c) { return i; }
    }
    return -1;
}

// becomes the class (C++; EAGAIN comes from <cerrno>)

struct indexof {
    char* s;
    char c;
    int i;  // the reified loop state
    indexof (char* s_, char c_) : s(s_), c(c_), i(0) {}

    // One loop iteration per call: returns EAGAIN while searching,
    // 0 when done (i is the index found, or -1 if not found).
    int step () {
        if (s[i] == 0) { i = -1; return 0; }
        if (s[i] == c) { return 0; }
        ++i;
        return EAGAIN;
    }
};

Allocating memory is easy. The hard thing is deciding when to free it. We have several choices:

  • Never free.
  • Who allocates always frees.
  • Caller always frees?

18Games and crashes?

Suppose that you are a game programmer, and you have just released your game.

If a few gamers complain about irreproducible crashes, then it may be a hardware or driver issue in the gamer's machine. Or it may be that reporting crashes is too difficult, and angry players are asking for refunds instead. Make it easy to report crashes.

If a lot of gamers complain about crashes, then it is probably a programming error on your part.

If your game does not contain sensitive information, you should always ship it with automatic crash dumps, and always show a message with clear reporting instructions when the program crashes.

19Modules

What problem do modules solve? Why do we have modules?

Separate compilation.

Why?

Because it is slow to recompile the whole program.

Why?

Because we don't want to ship the compiler with the program, because it either bloats the program or complicates distribution? But why can't we distribute the compiler in a slightly less stupid way? And why can't we make small compilers?

Thus, we don't need separate compilation if we compile fragments just-in-time at runtime, and if we ship the compiler with the program. I think we should do that.

Thus, the only reasons left for having modules are namespacing (avoiding name collision), local reasoning, unit testing, implementation hiding (selective exports), dependency management (explicit imports and exports).

But must all of those features be coupled to modules? Why can't (or shouldn't) dependency management be per-function instead of per-module?

All programming features should be made with the purpose of alleviating mental burden?

Features should be designed to play nice with each other and to avoid incomprehensible interactions?

20Bibliography

[1] Apt, K.R. 1993. Declarative programming in Prolog. Centrum voor Wiskunde en Informatica. url: <https://ir.cwi.nl/pub/10456/10456D.pdf>.

[2] Beeson, M. 2004. Lambda logic. International joint conference on automated reasoning (2004), 460–474. url: <https://pdfs.semanticscholar.org/9721/c012782ef25023ac300e33d6ce34ffed7e18.pdf>.

[3] Culpepper, R. et al. 2007. Advanced macrology and the implementation of Typed Scheme. Workshop on Scheme and Functional Programming (2007).

[4] Elliott, C. 2017. Compiling to categories. Proceedings of the ACM on Programming Languages. 1, ICFP (2017), 27. url: <http://conal.net/papers/compiling-to-categories/compiling-to-categories.pdf>.

[5] Fischer, S. et al. 2009. Purely functional lazy non-deterministic programming. ACM sigplan notices (2009), 11–22. url: <http://users.eecs.northwestern.edu/~clk800/rand-test-study/_pflnp/pflnp-2009-10-8-12-02-00.pdf>.

[6] Hoare, C.A.R. and Jifeng, H. 1998. Unifying theories of programming. Prentice Hall Englewood Cliffs. url: <ftp://ftp.cs.kent.ac.uk/people/staff/phw/utp/processed/hoare_start.pdf>.

[7] Kfoury, A. 1999. Beta-reduction as unification. Banach Center Publications. 46, 1 (1999), 137–158. url: <http://matwbn.icm.edu.pl/ksiazki/bcp/bcp46/bcp4617.pdf>.

[8] Kutzner, A. and Schmidt-Schauß, M. 1998. A non-deterministic call-by-need lambda calculus. ACM. url: <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.590.8440&rep=rep1&type=pdf>.

[9] Lorenz, D.H. and Rosenan, B. 2017. Application embedding: A language approach to declarative web programming. arXiv preprint arXiv:1701.08119. (2017). url: <https://arxiv.org/pdf/1701.08119.pdf>.

[10] Milicevic, A. and others 2015. Advancing declarative programming. Massachusetts Institute of Technology. url: <https://people.csail.mit.edu/aleks/website/papers/mit15-milicevic-phd.pdf>.

[11] Rossberg, A. 2018. 1ML–core and modules united. Journal of Functional Programming. 28, (2018). url: <https://people.mpi-sws.org/~rossberg/papers/Rossberg%20-%201ML%20--%20Core%20and%20modules%20united%20[Draft].pdf>.

[12] Seres, S. 2001. The algebra of logic programming. University of Oxford. url: <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.6851&rep=rep1&type=pdf>.


  1. https://www.cs.kent.ac.uk/research/groups/plas/researchinterests.html

  2. https://www.wolfram.com/language/principles/

  3. https://en.wikipedia.org/wiki/Pure_(programming_language)

  4. https://en.wikipedia.org/wiki/Nondeterministic_programming

  5. https://en.wikipedia.org/wiki/Program_synthesis

  6. https://en.wikipedia.org/wiki/Program_derivation

  7. https://www.reddit.com/r/math/comments/24wk6f/are_there_other_alternatives_to_set_theory/chboelc

  8. https://www.reddit.com/r/askmath/comments/6zuiob/why_is_it_called_cardinality_and_not_length_or/

  9. https://math.stackexchange.com/questions/158602/number-of-elements-vs-cardinality-vs-size

  10. https://en.wiktionary.org/wiki/length

  11. https://www.etymonline.com/word/length

  12. https://en.wikipedia.org/wiki/Length

  13. http://wordnetweb.princeton.edu/perl/webwn?s=length

  14. https://www.dictionary.com/browse/length

  15. https://www.etymonline.com/word/dimension

  16. https://en.wiktionary.org/wiki/dimension

  17. https://en.wikipedia.org/wiki/Dimension

  18. https://www.etymonline.com/word/size

  19. https://en.wiktionary.org/wiki/size

  20. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.158.4524&rep=rep1&type=pdf

  21. http://www.dinf.ne.jp/doc/english/Us_Eu/conf/csun_98/csun98_069.html

  22. https://thintz.com/resources/prescheme-documentation