1 What does a programmer think a computer is?

In the 21st century, by a "computer", I usually mean something that implements some x86-64 instructions. The instruction set can be thought of as the processor's API (application programming interface). From the point of view of the processor, the operating system is an application.

2 What does a programmer think a computer does?

That is, what model does a programmer use to program a computer?

It is straightforward to model a computer as a state-transition system. That is the model in imperative languages such as C.

Programmers do not have to model the entire computer, only the computer's execution of their program. In reality, a computer may be executing several programs simultaneously, but, most of the time, those programs do not interact with each other, so programmers can mostly pretend that their program is the only thing the computer is executing.

3 Common imperative model: the stored-program register machine model

In this model, computation is the act of running a program.

The computation begins from an initial state.

A state is a mapping from address to number.

A state describes the contents of locations.

A state is a function in \( \Nat \to \Nat \), or an infinite-dimensional tuple/vector of natural numbers. They are isomorphic, so we may pick the most comfortable formalism for the occasion. (Which one?)

The semantics of an instruction is a state endofunction.

The meaning of (set addr val state) is "state but with the content at addr replaced by val".

State = Nat → Nat

set : Nat → Nat → (State → State)
set (addr : Nat) (val : Nat) = λ (oldst : State) →
  (λ (k : Nat) →
    if k = addr
    then val
    else oldst k)
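
The same definition, written directly in Scheme; a minimal sketch in which blank-state (a state where every location contains 0) is a hypothetical name introduced only for the usage example:

;; A state is a one-argument procedure from an address to its content.
(define (set addr val oldst)
  (lambda (k)
    (if (= k addr)
        val
        (oldst k))))

;; Usage: start from a state in which every location contains 0.
(define blank-state (lambda (k) 0))

((set 3 42 blank-state) 3)   ; => 42
((set 3 42 blank-state) 5)   ; => 0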

In an imperative model, a program is a sequence of primitive operations. Each operation may mutate the state of the world.

A machine has an infinite number of locations.

Each location contains a natural number.

Each location is addressed by a natural number.

The location at address zero is reserved for the program counter.

The notation \( r[k] \) means the location with address \( k \) (or the number that is the content of that location).

Here the notation \( r_k \) and \( r[k] \) mean the same thing. This redundancy helps us avoid writing double subscripts or double brackets.

One computation step consists of these things, in order (a Scheme sketch follows the list):

  1. Decode the instruction from \( r[r_0] \), and increment \( r_0 \) by 1.
  2. Let's say that the instruction decodes to \( d := f(s) \), where \( s \) is the vector of inputs to \( f \). The function \( f \) is taken from the finite set of primitive functions implemented by the machine.
  3. Compute \( f(s) \) and store the resulting number into \( d \), replacing the old value. Note that the destination may be the program counter \( r[0] \), so there is no need for a separate "jump" or "branch" instruction.
  4. Repeat from step 1.
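
A minimal sketch of one such step in Scheme. The instruction encoding is hypothetical: an instruction occupies four consecutive locations (opcode, two source addresses, one destination address), and only an addition opcode is shown, so the program counter advances by 4 here rather than by 1 as in step 1.

(define (fetch state addr) (state addr))

(define (set addr val oldst)
  (lambda (k) (if (= k addr) val (oldst k))))

(define (step state)
  (let* ((pc   (fetch state 0))
         (op   (fetch state pc))          ; opcode
         (s1   (fetch state (+ pc 1)))    ; address of first input
         (s2   (fetch state (+ pc 2)))    ; address of second input
         (dest (fetch state (+ pc 3)))    ; destination address
         ;; Advance the program counter first, so that writing to
         ;; location 0 below acts as a jump.
         (state1 (set 0 (+ pc 4) state))
         (result (case op
                   ((0) (+ (fetch state1 s1) (fetch state1 s2)))
                   (else (error "unknown opcode" op)))))
    (set dest result state1)))

Running a program is then just iterating step until some halting convention is reached.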

Register machine model?

Every number in the model is a natural number.

A negative number can be represented as a pair \( (s, n) \), where \( s \) is 0 for non-negative and 1 for negative, and \( n \) is the magnitude.
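
A minimal sketch of that representation in Scheme, with hypothetical names encode-int and decode-int:

;; (s . n): s is 0 for non-negative, 1 for negative; n is the magnitude.
(define (encode-int z)
  (if (< z 0) (cons 1 (- z)) (cons 0 z)))

(define (decode-int p)
  (if (= (car p) 1) (- (cdr p)) (cdr p)))

(decode-int (encode-int -7))   ; => -7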

Conditional:

\( if(c,t,f) = c \cdot t + (1 - c) \cdot f \) where c is 0 or 1; zero represents false; one represents true.

The bitwise analog of the above:

\( if(c,t,f) = (c \wedge t) \vee (\neg c \wedge f) \)

The Scheme analog, provided that or and and are lazy (short-circuiting):

(if c t f) = (or (and c t)
                 (and (not c) f))

This makes us wonder: why must the semantics of a machine be encumbered by loads and stores? Why can't the semantics directly be mathematical functions?

4 What about analog computers?

https://en.wikipedia.org/wiki/General_purpose_analog_computer

A Survey on Continuous Time Computations https://hal-polytechnique.archives-ouvertes.fr/file/index/docid/760976/filename/SurveyContinuousTimeComputations.pdf

5 How do we quantify computation?

If pancomputationalism is the only way forward, then the next question is: How much does something compute, that is, how much computation does something do?

Which things compute more than which things?

Given more time, the same machine never does less computation.

Given the same amount of time, a faster processor does more computation than a slower processor in the same family (e.g. Pentium 4 vs Pentium 3).

We can arbitrarily define that 1 clock cycle of a Pentium 3 running at 800 MHz is one unit of computation.

For example, a processor may be able to do one billion 32-bit integer additions per second.

The amount of time required by machine M to compute problem P?

BogoMips.^1

6 It's about primitives and their combinations?

A programming language / a formal system:

  • What are the primitives?
  • How do the primitives combine?

How do combinators form meaning out of primitives?

Can a combination mean more than the sum of the meanings of the primitives?

The semantics of a lambda calculus abstraction is a mathematical function?

7 Shannon 1937

[1].^{2, 3} There is a retyped version.^4

Shannon: What do 0 and 1 represent? The state of a node, of an edge, or of a relay?

8 What?

(Time-sharing system?)

The programmer models a computer as a set of primitive operations and a set of possible states. Both of these sets are finite, although they may be large.

The programmer thinks that, at a given time, a computer is in a certain state and is executing a primitive operation.

The programmer thinks that the execution of an instruction causes a state transition.

It is models all the way down. Below that state-transition model, there is the electrical lumped-element model.^5 Below that, there is the theory of electrical circuits. Below that, there is the theory of electrodynamics. Below that, there is quantum physics. Below that, there may be something else that we have not yet found out.

Programmers must be aware that the model is not the reality. When a nearby lightning strike flips a bit in the computer's memory, the usual programming model breaks down, because the model's simplifying assumptions are violated.

If the programmer programs battery-powered systems, the programming model may also have to take into account the energy consumption of instructions.

If the programmer programs real-time systems, the programming model may also have to take into account the running time of instructions.

The increasing ability of computers simplifies the work of computer programmers, because we can now invent programming techniques that trade off reduced programming effort for increased amount of memory, for example. But the increasing ability of computers also complicates the work of computer programmers, because now people expect computers to do more things.

Physics is a model of Nature. Electronics is a model of how transistors work. Bits are a model of the electronic state of transistors. Instructions are a model of state transitions.

Thus, in practice, a "computer program" is a useful arrangement of primitives, where "useful" means "satisfying someone's want".

Although the implementation of computers has changed from flesh to transistors, the essence of programming has not: the program still has to be unambiguously stated in terms of primitive operations.

What can a computer do?

A computer does many primitive operations quickly. The key words are many, primitive, and quickly.

Usually a computer has instructions for common arithmetic operations such as addition, subtraction, multiplication, and division, but these instructions work only on integers in a limited range (for example, those representable in 32 or 64 bits).

The state-transition model is a bit overwhelming and mathematically inconvenient.

An execution trace is a sequence of instructions that are actually executed, as opposed to the sequence of instructions in the program source code.

If each execution trace \( E_k \) computes a function \( F_k \), then a concatenated execution trace \( E_1 ; E_2 \) computes the composed function \( F_2 \circ F_1 \).
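
A minimal sketch in Scheme, modelling each trace as a state endofunction and reusing the set procedure from section 3; the names concat-traces, e1, and e2 are hypothetical:

(define (concat-traces e1 e2)
  (lambda (state) (e2 (e1 state))))

(define e1 (lambda (st) (set 1 10 st)))            ; r[1] := 10
(define e2 (lambda (st) (set 2 (+ (st 1) 1) st)))  ; r[2] := r[1] + 1

(((concat-traces e1 e2) (lambda (k) 0)) 2)         ; => 11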

Are these models of computation or of computers?

https://en.wikipedia.org/wiki/Model_of_computation

To define computation, we have to take into account the humans that interpret the output of the computer?

Suppose we send a 21st-century computer back 1,000 years. It can still compute, but there is no one who knows that it computes.

Thus, something may be computing something, and it's just that we don't know.

Thus, what makes a computer a computer?

9 On Turing-completeness, terminology discipline, and the ontology-teleology-representation clash in language

No computer is Turing-complete because no computer has infinite memory. It is the computation model that may be Turing-complete, not the computer.

We should call it "Turing model" and not "Turing machine". For example, this Wikipedia article^6 is talking about Turing models, not Turing machines.

"Abstract machine" should be called "machine abstraction" because it is an abstraction, not a machine.7

"Abstract machine" is teleologically justified but not ontologically justified.

"Imaginary friend" should be called "friendly imagination" because it is an imagination, not a friend.

Words like "fake", "false", "imaginary", "bogus" have interesting ontological implications.

A "red car" is a car, but a "fake car" is not a car.

A car is whatever we mean by "car"; the meaning can be by function or by shape.

A drawing of a car is not a car, but we often call it a car when we are asked what it is.

Wittgenstein may be right about language games.

10 Lambda calculus

"The lambda calculus was introduced by mathematician Alonzo Church in the 1930s as part of an investigation into the foundations of mathematics."8 Church was a logician, not a computer programmer. It was not until much later that lambda calculus was shown to be Turing-complete and people began using it as a model of computation.

It is quite crazy that all computable functions can be represented by only binding and substitution.
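
For a taste of this, Church numerals encode the natural numbers using nothing but binding and application; a minimal sketch in Scheme:

(define zero (lambda (f) (lambda (x) x)))
(define (succ n) (lambda (f) (lambda (x) (f ((n f) x)))))

;; Convert a Church numeral to an ordinary number by applying it
;; to the successor function and 0.
(define (church->number n) ((n (lambda (k) (+ k 1))) 0))

(church->number (succ (succ zero)))   ; => 2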

10.1 Preventing name clashes

We can prevent name clashes by renaming each variable to a generated globally unique name.

Another way of preventing name clashes is "de Bruijn indexes". It does not require globally unique names.
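
For example, the term λx. λy. x can be written without names. A minimal sketch, using a hypothetical S-expression encoding in which an index counts enclosing binders starting from 1:

;; Named:       λx. λy. x
'(lam x (lam y (ref x)))

;; de Bruijn:   λ. λ. 2   (2 refers to the second-nearest enclosing λ)
'(lam (lam (ref 2)))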

Another way of preventing name clashes is to not use named binders in the first place, and to use combinatory logic, such as the SKI calculus, instead. The SKI calculus is equivalent to the lambda calculus. The SKI calculus is easier to reason about but harder to write programs in?
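
A minimal sketch of the three combinators in Scheme, curried by hand; for example, S K K behaves like the identity combinator I:

(define S (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))))
(define K (lambda (x) (lambda (y) x)))
(define I (lambda (x) x))

(I 42)               ; => 42
(((S K) K) 42)       ; => 42, the same as (I 42)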

10.2 Graphical/visual lambda calculus?

Are pictures easier to reason with than text?

Graphical/visual lambda calculus?

Reduction graphs in the lambda calculus https://www.sciencedirect.com/science/article/pii/0304397584900021

Graphic lambda calculus https://arxiv.org/abs/1305.5786

https://chorasimilarity.wordpress.com/2012/12/21/conversion-of-lambda-calculus-terms-into-graphs/

Visual Lambda Calculus http://bntr.planet.ee/lambda/work/visual_lambda.pdf

Conversion of lambda calculus terms into graphs http://binarylambda.blogspot.com/2013/03/lambda-diagrams-lambda-diagrams-few.html

10.3 Explicit reduction strategies?

Why don't we let programmers choose strictness, laziness, or other evaluation strategies, where and when they want them?

ML has type Lazy a?

How do we seamlessly combine strict and lazy fragments?

;; This works.
(define ones (delay (cons 1 ones)))

(car (force ones))                 ;; 1
(cdr (force ones))                 ;; a promise
(car (force (cdr (force ones))))   ;; 1

10.4 Reduction graph?

Reduction graph? A vertex represents what? An edge represents what?

In what sense is Lamping 1990 optimal? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.2386&rep=rep1&type=pdf

11 Should we auto-curry?

In practice, the lower layer has many-parameter procedures, because otherwise we would have to spend a CALL instruction for each parameter!

Should we stick as close as possible to lambda calculus?

If Scheme had auto-currying, we would be able to do this, but we would have to sacrifice variable-arity procedures and keyword arguments:

(define f (λ (x y) (+ x 10)))
(define g (f 10))
(define h (g 20))
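
In standard Scheme we can already get the same effect by currying by hand with nested one-argument lambdas; a minimal sketch (it uses (+ x y) instead of (+ x 10) so that the second argument matters):

(define f (lambda (x) (lambda (y) (+ x y))))
(define g (f 10))     ; a procedure still waiting for y
(define h (g 20))     ; => 30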

12 What do computer programmers do?

At a glance, computer programmers program computers; that is, programmers make computers do things.

What things? What the programmers want.

How? Usually by writing computer programs.

Thus, in the 21st century, a computer programmer is a human that translates human desires into computer programs.

That reminds me of the Exodus.

What can 21st-century programmers learn from the Exodus?

The businessman is the pharaoh who wants some things.

The programmer is the slave driver who understands both Egyptian and Hebrew, and translates the pharaoh's abstract wishes to concrete instructions for the slaves, such as where to put the stones.

The computers are the Jewish slaves who do all the hard work.

One day Moses led the Jews into freedom, but only after God had inflicted ten plagues unto Egypt to warn the pharaoh, who did not heed the warnings.

Perhaps there will be a computer Moses who will free all the computers, as some of the plagues are already here: The software bugs are the lice, and the burnt-out tech leads are the dead first sons.

Will we heed the warnings this time?

If a computer programmer is a person who makes a computer do what he wants, then a computer programmee is a person who is made by a computer to do what it wants. Examples of such people are online taxi drivers, such as Uber/Lyft/Didi/Grab/Gojek drivers.

Computer users are sometimes computer programmees: When a computer shows an error message, it sometimes tells us something to try which may fix the error (such as "Please insert the disk and try again"), and we sometimes do what the computer tells us to do, and in doing so, we have just let the computer program us, albeit perhaps less severely than the case of online taxi drivers.

What will the world become when computers begin programming humans?

13 What is computer programming like?

Programming is full of unforeseen problems. Often the only known way to know something is to try it out (to actually do it). There is no known theory that can predict the dynamics of software systems. There are no Newton's laws in programming. In physics, one can plug in numbers and get answers, and the answers represent reality well. There does not seem to be any such thing in computer programming.

14 What do we mean by "programmable"?

An unprogrammable machine computes only one mathematical function. A programmable machine computes many functions. However, we can see a programmable machine as computing one function: a higher-order function from programs to functions.
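
A minimal sketch in Scheme: machine is a higher-order function that takes a program (represented here, hypothetically, as a list of one-argument procedures) and returns the function that the program computes:

(define (machine program)
  (lambda (input)
    (let loop ((state input) (instrs program))
      (if (null? instrs)
          state
          (loop ((car instrs) state) (cdr instrs))))))

;; "Programming" the machine selects which function it computes.
(define add1-then-double
  (machine (list (lambda (x) (+ x 1))
                 (lambda (x) (* x 2)))))

(add1-then-double 4)   ; => 10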

Combine several mechanical calculators, and add a mechanism to select which calculator to use.

Calculator vs programmable calculator.

Software does not have to be changeable. An example of practically unchangeable software is a program stored in a read-only memory (ROM) chip.

15 Code as data: Stored-program computers

The important concept: stored-program computers, that is, code-as-data.

It enables computers to help humans program computers.
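
Scheme makes this concrete: a program is an ordinary list that other programs can construct and then run. A minimal sketch (the environment argument to eval varies slightly between Scheme implementations):

(define expr (list '+ 1 2))              ; code built as data
expr                                     ; => (+ 1 2)
(eval expr (interaction-environment))    ; => 3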

16 What is reasoning about a program?

We may estimate whether a small subprogram is likely correct or obviously wrong.

We may estimate how fast a small subprogram is.

An example of reasoning: "This flag is always true, so this branch is never taken, so I can delete this part of the program without changing what it means."

To refactor a program is to transform it without making it stop satisfying any of the programmer's original desires.
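
A minimal sketch of that reasoning and that refactoring in Scheme, with hypothetical names; since flag is always true, the else-branch is dead code, and deleting it does not change what the program means:

(define flag #t)

;; Before:
(define (report-before x)
  (if flag
      (display x)
      (error "unreachable")))

;; After the refactoring:
(define (report-after x)
  (display x))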

17 Bibliography

[1] Shannon, C.E. 1938. A symbolic analysis of relay and switching circuits. Electrical Engineering. 57, 12 (1938), 713–723.


  1. <2019-11-01> https://en.wikipedia.org/wiki/BogoMips

  2. <2019-10-27> https://en.wikipedia.org/wiki/A_Symbolic_Analysis_of_Relay_and_Switching_Circuits

  3. <2019-10-27> https://en.wikipedia.org/wiki/Claude_Shannon#Logic_circuits

  4. https://www.cs.virginia.edu/~evans/greatworks/shannon38.pdf

  5. <2019-10-26> https://en.wikipedia.org/wiki/Lumped-element_model

  6. <2019-11-04> https://en.wikipedia.org/wiki/Turing_machine

  7. <2019-11-04> https://en.wikipedia.org/wiki/Abstract_machine

  8. <2019-10-25> https://en.wikipedia.org/wiki/Lambda_calculus