1Note: This page was automatically converted from markdown to org; there may be some LaTeX formatting breakage

2from math.md

3Category theory

  • What is the history of category theory? Who invented it? Why?

  • What is a graph?

    • A graph has vertices and edges.
  • What is an undirected graph? What is a directed graph?

    • A undirected graph is a graph whose edges are unordered pairs.
    • A directed graph is a graph whose edges are ordered pairs.
  • Which one is the default?

    • In this document, when we say just "graph", we mean "directed graph".
  • What is a pre-category?
  • What is a category?

    • A category has objects and morphisms, and satisfies the category axioms.

      • What is an object?

        • Anything. It's undefined, like "point" in geometry.
      • What is a morphism?
      • What are the category axioms?

    • A category can be thought of as a graph whose edge relation is reflexive and transitive.

  • What is a functor?

    • A functor maps a category to another category.
    • The endofunctors on a category \(C\) form a category called the functor category of \(C\).

      • Category theory is recursive.
  • What is a natural transformation?
  • What is an adjunction?
  • What is a monad?

    • A monad is a triple \((T,\mu,\eta)\).

3.1Undigested

4Solving the Clay millennium prize problems

4.2The Riemann hypothesis

4.3Birch and Swinnerton-Dyer conjecture

5Function

5.1Prerequisites

5.2Relation

An example ordered pair is \((a,b)\). It is not the same as \((b,a)\).

A relation \(r\) is a triple \((A,B,R)\) where \(A\) is a set called the domain, \(B\) is a set called the codomain, and \(R \subseteq A \times B\) is a set that is the relation's mapping. Such mapping is a set of ordered pairs.

Iff \((a,b) \in R\), then we say that \(r\) relates \(a\) to \(b\). The word "to" implies that direction is important.

A relation is not just a subset of a Cartesian product. The domains and codomains matter.

5.3Function

A function is a relation in which each element of the domain is related to exactly one thing.

Let \(f = (A,B,F)\) be a function.

The notation of "applying \(f\) to \(x\)", written \(f(x)\), means the \(y\) such that \((x,y) \in F\).

5.4Partial functions

A partial function is a relation in which each element of the domain is related to at most one thing. If you replace "at most one" with "exactly one", you get the definition of a function.

If a partial function \(f\) does not relate a domain element \(x\) to anything, then we say that \(f(x)\) is undefined.

The symbol \(\bot\) is called bottom.

We can totalize a partial function \(f = (A,B,F)\) into a total function \(f_\bot = (A,B_\bot,F_\bot)\) where \(B_\bot = B \cup \\{ \bot \\}\), and we require that \(\bot \not\in B\), and, for each \(a \in A\):

  1. if \((a,b) \in F\), then \((a,b) \in F_\bot\);
  2. otherwise, \((a,\bot) \in F_\bot\).

What is the difference between "undefined" and "bottom"? The "bottom" of a set \(B\) is defined as something that is not equal to anything in \(B\). On the other hand, something "undefined" is not defined at all, for example the result of \(1 / 0\). Thus, we should not write \(f(x) = \bot\) to mean that \(f(x)\) is undefined.

Agda Wiki: Totality

ncatlab: partial function

The function \(f : A \to B\) has input type \(A\) and output type \(B\).

5.5Function equality

  • Two relations are equal iff

    • their domains are equal,
    • their codomains are equal, and
    • their mappings are equal.
  • Formally, \((A_1,B_1,R_1) = (A_2,B_2,R_2)\) iff

    • \(A_1 = A_2\),
    • \(B_1 = B_2\), and
    • \(R_1 = R_2\).
  • Functions are relations.

    • Equality of functions is equality of relations.
  • Intension vs extension

    • Consider:

      • \(f : \Nat \to \Nat\), ( f(n) = n + n ),
      • \(g : \Nat \to \Nat\), ( g(n) = 2 ⋅ n ).
    • Observe:

      • They are extensionally equal: \(f(n) = g(n)\) for all \(n\).
      • They are not intensionally equal: \(f \neq g\).
      • Their outputs match, but they are not the same function.
    • Problem:

      • When do we care about intension?
      • Do we ever care at all?

5.6Relationship between functions in functional programming and functions in mathematics

These functions idn and idz are different functions.

\[\begin{align*} [idn] = (\Nat, \Nat , x \to x) \\ [idz] = (\Integers, \Integers, x \to x) \end{align*} \]

The slides the lambda calculus has some explanation of the semanticses of lambda calculus. The slides are a part of Yale CS-430/CS-530 formal semantics course.

6Limit

We motivate the definition of the limit of a sequence.

A sequence \(s\) consists of the elements \(s_0\), \(s_1\), and so on, possibly endlessly. The notation ( s = ( 1/n )/{n=1} ) means the sequence

\[\begin{align*} s_1 &= 1/1, \\ s_2 &= 1/2, \\ s_3 &= 1/3, \end{align*} \]

and so on where \(s_n = 1/n\) for each integer \(n\) greater than or equal to one. The index does not have to begin with one. For example, the notation ( ( n2 )/{n ∈ } ) means the sequence

\[\begin{align*} s_0 = 0^2, \\ s_1 = 1^2, \\ s_2 = 2^2, \end{align*} \]

and so on where \(s_n = n^2\) for each \(n\) in \(\Nat\). Unless indicated otherwise, an index is a natural number, which begins from zero. Instead of using indexes such as \(s_n\), we can also use brackets \(s[n]\), and we can even use parentheses \(s(n)\). We define \(S(A)\) to be the image of \(A\) through \(s\), that is the set of all possible outputs of \(s\), that is the set ( { x | a ∈ A, ~ s(a) = x } ). The set \(S(A)\) is obtained by applying \(s\) to each element of \(A\), that is, if ( A = { a1, a2, a3, … } ), then ( S(A) = { s(a1), s(a2), s(a3), … } ). Let \(A\) be a set from which the sequence elements are taken. Let \(A\) also has a partial order \(\le\). The statement \(\beta \le \alpha\) can also be written \(\alpha \ge \beta\). The notation \(s(k)\) suggests that a sequence can be thought of as a function \(\Nat \to A\). Due to the partial order, we can define an upper bound of $s$ as a \(z\) such that \(s_n \le z\) for all \(n\), and we can define a lower bound of $s$ as an \(a\) such that \(s_n \ge a\) for all \(n\). The supremum of \(s\) is its least upper bound. The infimum of \(s\) is its greatest lower bound. If and only if the infimum \(\inf s\) and the supremum \(\sup s\) have the same value \(L\), then we say that \(s\) converges to \(L\), and we say that the limit of \(s\) is \(L\), and we write \(\lim s = L\).

Note that the bounds do not have to occur in the sequence itself.

If we have a sequence \(s\), then we can apply a function \(f\) to each member of \(s\) to create a new sequence \(t_n = f(s_n)\). Every function can be lifted into a function that works on sequences by applying that original function to each element of the sequence.

Now that we have defined the limit of a sequence, we can now define the limit of a function as its input approaches a value. Let \(f\) be a function. By saying "the input of \(f\) approaches \(a\)", we mean that there exists a sequence \(x\) that converges to \(a\). We define the limit of a function \(f\) as its input \(x\) approaches \(a\) to be \(L\), written \(\lim_{x \to a} f(x) = L\), if and only if for every sequence \(u\) that converges to \(a\), the limit of the sequence ( lim  (f(un))n ∈ = f(a) ).

Another way we can define \(\lim_{x \to a} f(x)\) is by the epsilon-delta definition of limit as done by Cauchy, Bolzano, and Weierstrass.

A generalized sequence is a function from an index set \(I\) to an element set \(A\), where the index has an ordering \(\le\).

7Teaching mathematics

8Learning mathematics

  • Setting the attitude

    • "Young man, in mathematics you don't understand things. You just get used to them." John von Neumann (1903–1957) (source)
    • Mathematics is English and some rules (or whatever your language is). People who can understand English can get used to mathematics (with effort).
    • Mathematics is a tower of concepts.
    • Mathematics is a chain of concepts.
    • Mathematics is a tree of concepts.
  • Definitions

    • Definitions are important. Your first reaction when you encounter an unknown word should be "What's its definition?" For example, a group is a set and a binary operation satisfying the group axioms. If you're new, your questions should be "What is a set?", "What is a binary operation?", and "What are the group axioms?", and then you should search them on the Internet. (Don't do it right now; it's just an example.)
    • Read the definition. Try to use the definition. Find some examples. You will forget. It's OK. Read them again later after you forgot. It takes several times of forgetting to get used to it, and eventually you'll get a feel of it. You must expect to forget, because forgetting is part of learning.
    • Mathematicians invent new words to do more with less words. It's more efficient to say "\(G\) is a group" than write the group axioms in full every time a group needs to be declared.
  • Notation

    • A mathematical notation, like the Latin alphabet, is a way of writing English (or whatever your language is). When you read "math", you are really reading the same language that you speak everyday.
    • When you encounter symbols in a sentence, think about how they should read in English to make the whole sentence grammatically correct.
    • Mathematical notations also have slangs and inconsistencies, just like English. We're human after all.
    • An equation \(A=B\) means you can change every \(A\) with \(B\), and every \(B\) with \(A\), everywhere you find them, as long as the context is still the same.
    • Don't hate the notation. Used properly, it saves your time and others' time.
    • Note: Please memorize the 48 symbols in the Greek alphabet (24 capital letters and 24 small letters).

9Probability and statistics

"Probstat" is "probability and statistics".

  • Basic probability modeling

    • Model coins and dices, both fair and biased, and combination
    • Multiple coin tosses
    • Conditional probability
    • Bayes rule

      • Sensitivity and specificity
      • False positive and false negative
      • Testing the test
    • Counting exhaustively

    • Basic combinatorics (counting combinations)
    • Randomness

  • Basic statistics

    • Collection? Dataset? Sample?
    • Variable vs parameter
    • Collect data
    • Design a survey

      • Avoid leading questions (WP)
      • Avoid loaded questions (WP)
    • WP: Statistical data type
    • Calculate descriptive statistics

  • More rigor

    • Measure theory
    • Continuous probability
  • Where should these be? How should we organize this?

    • Determining coin fairness
    • Random variable

      • Not a variable, and not random?
    • Inferential statistics

      • Infer something about the population from samples.
      • How do we sample the population correctly?
      • Case study: election "quick count" or "exit poll"
    • Estimation

      • Bias
      • Point estimation
      • Likelihood

        • Maximum likelihood estimation?
    • Fitting

      • Consider replacing "regression" with "fitting".
      • Line fitting with one independent variable
    • Random process
    • Random walk
    • Stochastic integral
    • Markov process
    • How do we interpret very small probabilities?
    • Expectation, expected value
    • Distribution
    • Normal distribution
    • Central limit theorem
    • Renewal theory
    • Decision theory
    • WP: Optimal stopping

    • Betting
    • Bookmaking

9.1Meta-research

9.2Education

9.4Monty Hall problem

There are three doors. Let the set of doors be ( D = { 1, 2, 3 } ). One door \(p\) has the prize. The other two doors \(q, r\) have nothing.

The host knows which is which. The contestant doesn't.

The host asks the contestant to pick a door.

The contestant chooses a door \(c\).

The host opens another door \(e\) (empty) that is neither \(c\) (your choice) nor \(p\) (the prize). Formally, ( e ∈ D - { c, p } ). Remember that sets don't care how many times an element occurs: ( { a,a } = { a } ).

The host asks the contestant whether he wants to switch (to the door that is neither \(c\) nor \(e\)).

Should the contestant switch?

Yes.

A calculation is in WP: Monty Hall problem.

9.5Randomness

"Random" means "we don't know why".

We see randomness because we ignore details.

Randomness is due to the details ignored by our models.

Coin tosses are unpredictable, but the statistics of coin tosses is predictable.

9.5.1Example of "random" in English

"The software crashes randomly." means "We don't know why it crashes." There is a cause, but we're ignoring it.

9.5.2Unanswered questions

What is the difference between these words: random, haphazard, chaotic, unpredictable, uncertain, noisy?

10Relation

The arity of a relation is its number of parameters. For example, if \(A\) has arity 2, then \(A(x,y)\) is a formula.

Relations of arity 0, 1, and 2 are also called nullary, unary, and binary, respectively.

A constant is a nullary relation.

A set is a unary relation. We can write either \(A(x)\) or \(x \in A\) to mean that \(x\) is an element of the set \(A\).

What is the difference between a relation and a predicate?

A pedantic note: Theoretically, a formula is not a truth value, and it is the interpretation that maps formulas to truth values. For example, if \(A\) is a unary relation, then \(A(x)\) is a formula, not the truth value, and therefore it does not make sense to say "\(A(x)\) is true". Practically, the interpretation is implicit, and we say "\(A(x)\) is true" to mean "the implied interpretation maps \(A(x)\) to truth value 1".

10.1Binary relations

These are isomorphic:

Every binary relation \(A\) is also a directed graph where \((x,y)\) is an edge iff \(A(x,y)\). We will mix terms. For example, a relation is acyclic iff its graph is acyclic.

Let \(A\) be a binary relation.

Iff \(A(x,y)\), then \(x\) is in the domain of \(A\).

Iff \(A(x,y)\), then \(y\) is in the range of \(A\).

\(x\) reaches \(y\) (\(y\) is reachable from \(x\)) iff there is a path from \(x\) to \(y\).

\(x\) is initial iff its in-degree is zero (no \(y\) satisfying \(A(y,x)\)).

\(x\) is terminal iff its out-degree is zero (no \(y\) satisfying \(A(x,y)\)).

The composition of \(A\) and \(B\) is \(A \circ B\) where ( (A ∘ B)(x,y) = ∃ m ( B(x,m) ∧ A(m,y) ) ). Note the swap: first \(B\) maps \(x\) to \(m\), and then \(A\) maps \(m\) to \(y\). Function composition is special case of relation composition.

The $n$th self-composition of \(A\) is ( An = A ∘ An-1 ).

The infinite self-composition of \(A\) is ( A^∞ ), the smallest relation satisfying \(A \circ A^\infty = A^\infty\). We also say that \(A^\infty\) is the least fixed point of \(F\) where ( F(X) = A ∘ X ).

The transitive closure of \(A\) is the union of all self-compositions of \(A\). Formally, ( tc(A) = ⋃k=1∞ Ak ).

11Unary algebra

A mono-unary algebra \((A,f)\) is a set \(A\) and a unary function \(f : A \to A\).

There is always an injection from a unary algebra \((A, f)\) to the magma \((A, F~f)\) where \(F~f~x~y = f~x\). The binary operation \(F~f\) ignores its second argument. This magma happens to also be a noncommutative semigroup: if we let \(g = F~f\), then \(g~(g~x~y)~z = g~x~(g~y~z)\). Therefore the variety of unary algebras is isomorphic to a subvariety of magmas.

There is also always an injection from a magma \((A, g)\) to the unary algebra \((A^2, G~g)\) where \(G~g~(x,y) = (g~x~y, ~ y)\). The unary operation \(G~g\) passes through the second component.

If \(A\) is infinite, there is always a bijection between \(A\) and \(A^2\).

If \(A\) and \(B\) have the same cardinality, then for each injection \(m : A \to B\) and function \(f : A \to A\), there is always a \(g : B \to B\) such that \(m~(f~x) = g~(m~x)\). More understandably, \[ f~x = y \iff g~(m~x) = m~y \]

If \(A\) and \(B\) have the same cardinality, then for each function \(f : A \to A\), there is always an injection \(m : A \to B\) and a function \(g : B \to B\) such that \(m~(f~x) = g~(m~x)\).

If \(A\) and \(B\) are isomorphic (have the same cardinality), then the variety of \(A\)-unary algebras and the variety of \(B\)-unary algebras are isomorphic.

The variety of \(A\)-unary algebras is the set of all unary algebras whose underlying set is \(A\).

Lemma: There is always an isomorphism between two varieties of unary algebras whose underlying sets have the same cardinality.

Corollary: if \(A\) is infinite, then the variety of \(A\)-unary algebras and the variety of \(A^2\)-unary algebras are isomorphic.

If \(A\) is infinite, there is always a bijection \(m : A \to A^2\) such that \(m~(f~x) = g~(m~x)\).

If \(A\) is infinite, then there is always an injection from a unary algebra \((A^2,f)\) to the unary algebra \((A,g)\).

If \(A\) is infinite, then there is always a bijection between a unary algebra \((A,f)\) and a unary algebra \((A^2,g)\), for every \(f\) and \(g\).

Lemma: If there is a bijection between \(A\) and \(B\), there is also a bijection between \(2^A\) and \(2^B\). (Axiom of choice?)

http://math.stackexchange.com/questions/243590/bijection-from-mathbb-r-to-mathbb-rn

Lemma: If there is a bijection between \(A\) and \(B\), there is also a bijection between \(A \to A\) and \(B \to B\). (Since \((A \to A) \subset 2^A\)).

Conclusion: there is an isomorphism between the set of $(A,f)$s and the set of $(A2,g)$s.

A homomorphism from \((A, f)\) to \((B, g)\) is \(h : A \to B\) such that \(h~(f~x) = g~(h~x)\).

Let there be these structures:

  • The unary system \((A, f)\) where \(f : A \to A\).
  • The fixpointed unar \((A, f, p)\) where \(f~p = p\).
  • The magma \((A, g)\) where \(g : A \to A \to A\).
  • The semigroup \((A, g)\) where \(g\) is associative.
  • The semigroup \((A, g, a)\) with left-absorbing element \(a\).
  • The unar \((A^2, h)\) where \(A^2 = A \times A\).

A fixpoint in the unar becomes a left-absorbing element in the magma.

The semigroup is non-commutative: \(F~f~x~y \neq F~f~y~x\).

Therefore there is a homomorphism from the algebra of unary systems to the algebra of non-commutative semigroups.

A left-absorbing element in the binar becomes the left component of a fixpoint in the unar. \[ (g~p~y, ~y) = (p,y) = h~(p,y) \]

Another way to embed: \[ \begin{align*} (g~x~y, ~ g~y~x) = h~(x,y) \\ (g~p~y, ~ g~y~p) = (p, ~ f~y) = h~(p,y) \\ (g~x~p, ~ g~p~x) = (f~x, ~ p) = h~(x,p) \end{align*} \]

Flip, like negation: \[ m~(x,y) = m~(y,x) \]

Lemma: If \(c\) is a cardinal number, then \(c! = 2^c\). WolframAlpha for the factorial of aleph-0

11.1A unar and its square are isomorphic?

\[ h~(g~x~y) = f~(h~x)? \]

11.2Graph

There is always an injection from a unary algebra \((A, f)\) to the directed graph \((A,E)\) where \(f~x = y\) iff \(E~x~y\).

There is always an injection from a directed graph \((V, E)\) to the unary algebra \((2^V, F)\) where \(F~X = \{ y ~|~ x \in X, ~ E~x~y \}\).

11.3Every magma can be commutative

Every magma \((A,f)\) can be embedded into an \(n\)-anticommutative magma \((A^2,g,n)\) where \(g~a~b = n~(g~b~a)\).

\[\begin{align*} n~(x,y) = (y,x) \\ g~(u,v)~(x,y) = (f~u~v, f~x~y) \end{align*} \]

11.4Conclusion

There is an isomorphism between unary systems and magmas.

12Measure theory 0

12.1Goals

  • Measure
  • Integral

12.2Where to go from here

12.3Measure

A Jordan content is…

A content \(m : 2^A \to \Real\) satisfies:

  1. \(m(x) \ge 0\),
  2. \(m(\emptyset) = 0\), and
  3. if \(X\cup Y = \emptyset\), then \(m(X \cup Y) = m(X) + m(Y)\).

A measure is a content that is countably additive.

A Lebesgue measure for \(\Real\) is \(m((a,b)) = m([a,b)) = m((a,b]) = m([a,b]) = |b-a|\). From a measure \(m : \Real \to \Real\) we can define a measure \(\mu : \Real^n \to \Real\) where \(\mu \left( \prod_{i=1}^n X_i \right) = \prod_{i=1}^n m(X_i)\).

A measure space is a set and a measure on that set.

A measurable function is …

13Writing mathematics

  • 1989, article, "Mathematical writing", Donald E. Knuth, Tracy Larrabee, and Paul M. Roberts, [pdf](http://jmlr.csail.mit.edu/reviewing-papers/knuth_mathematical_writing.pdf)
    • "Many readers will skim over formulas on their first reading of your exposition. Therefore, your sentences should flow smoothly when all but the simplest formulas are replaced by 'blah' or some other grunting noise. (p. 3)

14Which is more primitive: function or relation?

  • Which point of view should we use?
    • We can see a function as a special kind of relation.
    • We can see a relation from \( A \) to \( B \) as a function from \( A \) to \( 2^B \).
      • Every binary relation \((A,B,r)\) is a function \((A,2^B,f)\) where \(f(x) = \{ y ~\|~ r(x,y) \}\).
    • Which is more primitive: function or relation?

15Math, learning, and brain?

https://dyske.com/paper/825 "When I was in 7th grade, my math teacher answered an obvious but difficult question: Why do we have to study advanced math, if we are never going to use it in our adult lives? He said it’s because the same parts of our brains used for math can be used for many other things in life. With this short explanation, he utterly convinced me the importance of studying math, and of any other subjects for that matter."

16Mathematical modeling?

  • WP:Mathematical model
  • WP:Many-body problem
  • WP:Few-body systems
  • WP:N-body problem
  • Articles
  • What is the difference between variable and parameter?
  • How many parameters do we need to model a system?
  • Discrete Newtonian kinematical model
    • A system at time \( t \) is a set of particles ( { 1, …, n } ).
      • Time is a real number: \( t \in \Real \).
      • The number of particle is constant \( n \in \Nat \).
      • For each particle \( k \):
        • It has position \( x_k \in \Real^3 \).
        • Simplifying assumptions
          • particle
            • It is a point.
              • It doesn't occupy any space.
            • Its mass is not modeled.
          • Time is global and absolute (the same everywhere).
  • Discrete Newtonian dynamical model (N-body problem) extends discrete Newtonian kinematical model.
    • A system at time \( t \) is all that above, plus:
      • For each particle \( k \):
        • It has mass \( m_k \in \Real \).
        • It has resultant force \( F_k \) acting on it.
        • Simplifying assumptions about the particle
          • It is rigid.
            • It doesn't deform.
            • It doesn't break.
          • Its mass is constant.
          • It don't interact with other particles.
            • Particles don't merge or collide.
  • WP:Continuum mechanics
  • Skippable philosophical issues?