Notes on well-ordering

(Content warning: maths. This was written as an exercise in approaching the underlying ideas with rigour and understanding. If it's of any help to other people, this is a happy side effect.)

In discrete mathematics, the Well-Ordering Principle for natural numbers states that any non-empty set of natural numbers contains a least element of itself.

Slightly more formally: \[\forall S \subseteq \mathbb{N} : \; S = \emptyset \lor \exists x \in S (\forall y \in S (x \leq y))\]

Example: Let \(S = \{12,3,189,40,25\}\). Then the least element of \(S\) is \(3\).

Example: Let \(S = \{\}\). The Well-Ordering Principle doesn't apply because \(S\) is empty.

Example: Let \(S = \mathbb{N}\). Then the least element of \(S\) is \(0\).

Example: Let \(S = \{n \in \mathbb{N} \;|\; n\textrm{ odd}\} = \{1,3,5,\ldots\}\). Then the least element of \(S\) is \(1\).

The interesting thing here is that this applies to every subset of \(\mathbb{N}\), of which there are uncountably many. Some of them we can't compute membership for, e.g. the image of the busy beaver function. Most of them we can't even give a finite description for.

Another principle, Mathematical Induction, states that for any set of natural numbers, if it contains \(0\), and if it contains the successors of all its elements, then that set is \(\mathbb{N}\).

Slightly more formally: \[\forall S \subseteq \mathbb{N} : \; 0 \in S \land (\forall x \in \mathbb{N} : x \in S \Rightarrow x+1 \in S) \Rightarrow S = \mathbb{N}\]

This is more commonly phrased in terms of predicates: if \(P(0)\), and if \((\forall x) P(x) \Rightarrow P(x+1)\), then \(P\) is true for all natural numbers.

Claim 1: the Well-Ordering Principle is equivalent to the Principle of Mathematical Induction.

(This is assuming we're starting with some basic axioms of Peano Arithmetic: unique successors, \(0\) as the only number without a predecessor, addition. We can define \(\leq\) in terms of addition.)

Proof (\(\Rightarrow\)): Let \(S \subseteq \mathbb{N}\) be given such that it contains 0 and also contains the successors of all its elements. Assume for contradiction's sake that \(S \neq \mathbb{N}\). Then \(\mathbb{N} \setminus S\) is non-empty. Let \(n \not \in S\) be its least element: this exists, by our Well-Ordering Principle assumption, and can't be \(0\), by our base case assumption.

Then \(n\) has a unique predecessor \(n-1\), which, since it is less than \(n\), must be in \(S\). But by our inductive case assumption, \(n-1 \in S \Rightarrow n \in S\). This is a contradiction. Thus \(S = \mathbb{N}\).

Proof (\(\Leftarrow\)): Let \(S \subseteq \mathbb{N}\) be given. Assume that \(S\) has no least element. Then we will prove by induction that \(S\) must be empty. Our inductive hypothesis will be \(P(x) := \forall 0 \leq i \leq x : i \not \in S\).

Base case: As with the inductive case below.

Inductive case: Assume that \(i \not \in S\) for all \(0 \leq i < n\). We need to show that \(n \not \in S\). But this is obvious, since otherwise \(n\) would be the least element of \(S\).

(The above proof hides the innocent-seeming assumption that \(n\) is the least element of \(\mathbb{N} \setminus [0, n-1]\). Fortunately this is easy to show with a separate inductive proof.)

Side note!: The Well-Ordering Principle, along with the Peano axioms, uniquely characterises the natural numbers. (It can't be expressed in first-order logic; that's why it skirts around Gödel's Completeness + Incompleteness Theorems, which, combined, say you can't characterise the natural numbers using only first-order statements.)

Extending the Well-Ordering Principle

The ordinals (or: ordinal numbers) are an extension of the natural numbers which capture the same notion of well-orderedness that allows us to do induction.

Definition (total order): A total order over a set \(S\) is a reflexive, transitive relation \(\preceq\) such that for all \(a, b \in S\), at least one of \(a \preceq b\) or \(a \succeq b\). (A strict total order is one where \(a \preceq b \preceq a \Rightarrow a = b\).)

Definition (well order): A well order is a strict total order such that every subset of \(S\) contains a least element of itself.

Example: \(\langle \mathbb{N}, \leq \rangle\), i.e. the natural numbers with the usual ordering, is a well-ordered set (by the Well-Ordering Principle).

Example: \(\langle \mathbb{Z}, \leq \rangle\), i.e. the integers with the usual ordering, is not a well-ordered set, since the set has no least element.

Example: \(\langle \mathbb{R}^{\geq 0}, \leq \rangle\), i.e. the non-negative reals with the usual ordering, is not a well-ordered set. For instance, the subset \((0,1]\) does not contain a least element of itself.

Example: \(\mathbb{N}^2\), ordered lexicographically as follows: \[ (x,y) \preceq (x', y') \Leftrightarrow \left\{\begin{array}{ll} x < x', &\textrm{ or} \\ x = x' \land y \leq y' \end{array}\right. \] ...is a well-ordered set. (Every subset has a least \(x\) value, and a least \(y\) value in that row.)

Well-ordered sets allow us to perform an analogue of induction:

Theorem 2 (transfinite induction): Given a well-ordered set \(\langle S, \leq \rangle\) and a predicate \(P\) over that set, then: \[ \left(\forall s \in S : (\forall s' < s : P(s')) \Rightarrow P(s)\right) \Rightarrow \forall s \in S : P(s)\]

(I.e. if if \(P\) is true everywhere before an element, then \(P\) is true at that element, then \(P\) is true everywhere.)

Proof (sketch): As with normal induction: pick the least element where \(P\) is false, and thereby obtain a contradiction.

Example: Setting \(S = \mathbb{N}\) gives us general mathematical induction.

Example: Let \(S = \mathbb{N}^2\) ordered lexicographically. Then if:

\(P(0,0)\), and
\(\forall x, y: P(x,y) \Rightarrow P(x, y+1)\), and
\(\forall x : (\forall y : P(x, y)) \Rightarrow P(x+1, 0) \)

if \(P\) is true on all of the previous row, \(P\) is true at the start of this row

...then \(P\) is true on all of \(\mathbb{N}^2\).

In the above example we have broken down true everywhere before an element into three cases specific to the structure of \(\mathbb{N}^2\).

Analogously to the natural numbers, this principle of transfinite induction is equivalent to well-orderedness. We can see it break down for the following set which isn't well-ordered:

Example: Set \(S = [0,1]\) with the usual ordering on the reals. Let the predicate \(P(x)\) denote \(x = 0\).

\(P(0)\) is true.
Let \(x \in (0,1]\) be given, and assume \(P(x')\) for all \(0 \leq x' < x\). Then:\[\begin{align*} & 0 < \frac{x}{2} < x \\ \Rightarrow & P\left(\frac{x}{2}\right) \\ \Rightarrow & \frac{x}{2} = 0 \\ \Rightarrow & x = 0 \end{align*}\] ...and hence \(P(x)\).

If transfinite induction was valid on \(\langle [0,1], \leq \rangle\) then this would constitute a proof that \([0,1] = \{0\}\), which is clearly not true.

Characterising well-ordered sets

We set out to prove that all well-ordered sets have similar structure: in fact, given any two well-ordered sets, (at least) one of them is isomorphic to a subset of the other.

Definition (isomorphism, well-ordered sets): Two well-ordered sets, \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), are isomorphic if there exists a bijection \(f: S \leftrightarrow T\) such that \(x \leq_S y \Leftrightarrow f(x) \leq_T f(y)\).

Example: \(\langle \{0,1,\ldots,25\}, \leq \rangle\) is isomorphic to the set of upper case letters A-Z, ordered lexicographically.

Example: \(\langle \mathbb{N}, \leq \rangle\) is isomorphic to \(\langle \mathbb{Z}^+, \leq \rangle\) (the positive integers). We map \(n\) to \(n+1\).

Example: \(\langle \mathbb{N}, \leq \rangle\) is not isomorphic to \(\langle \{0,1,\ldots,25\}, \leq \rangle\), since the sets have different sizes.

Example: \(\langle \mathbb{N}, \leq \rangle\) is not isomorphic to \(\langle \mathbb{N}^2, \leq \rangle\), even though both sets have the same size (i.e. countably infinite). Notice that \((1, 0) > (0, x)\) for infinitely many values of \(x\), but if we map \(n \in \mathbb{N}\) to \((1,0)\), then it is only greater than \(n\) elements — a contradiction. Hence there is no mapping that preserves the well ordering.

Theorem 3 (unique isomorphisms): If the well-ordered sets \(S\) and \(T\) are isomorphic, there is exactly one such bijection \(f: S \leftrightarrow T\) between them.

Proof: Let isomorphic well-ordered sets \(S,T\) be given, with bijections \(f,g: S \leftrightarrow T\). Assume \(f \neq g\), and consider the least element \(x \in S\) such that \(f(x) \neq g(x)\).

We have \(g^{-1}(f(x)) > x\), and thus \(f(x) > g(x)\). But similarly \(g(x) > f(x)\), a contradiction.

Definition (initial segments): Given a well-ordered set, \(\langle S, \leq \rangle\), an initial segment of \(S\) is a well-ordered set \(\langle S', \leq \rangle\), where \(S' \subseteq S\) and for all \(s \in S, s' \in S'\), if \(s \leq s'\) then \(s \in s'\). (Or: an initial segment is a subset of a well-ordered set which is closed under picking a smaller element.)

Example: \(\langle \{0,1,\ldots,25\}, \leq \rangle\) is an initial segment of \(\langle \mathbb{N}, \leq \rangle\).

Example: \(\langle \{0,1,\ldots,22,23,25\}, \leq \rangle\) is not an initial segment of \(\langle \mathbb{N}, \leq \rangle\), because it does not contain the smaller element \(24 < 25\).

Example: \(\langle \{(0,n) \;|\; n \in \mathbb{N}\}, \leq \rangle\) is an initial segment of \(\langle \mathbb{N}^2, \leq \rangle\).

Theorem 4 (initial segments are well-ordered): The initial segments of \(S\) are well-ordered under subset inclusion.

Proof: It is easy to show that they are totally ordered (namely, two initial segments cannot be incomparable). The crux of this proof, then, is showing that a collection of initial segments has a least element.

Let \(\{S_i\}\) be any collection of initial segments of \(S\). The only plausible candidate for least element of \(\{S_i\}\) is: \[S' := \bigcup S_i\]

We must show that \(S' \in \{S_i\}\). Assume otherwise. Then for each \(S_i\) define \(x_i\) to be the least element of \(S_i \setminus S'\). By well-orderedness of \(S\), \(\{x_i\}\) has a least element \(x\) such that \(\forall S_i : x \in S_i\). But then this implies \(x \in S'\), a contradiction!

This result gives us the ability to characterise well-ordered sets themselves by doing transfinite induction of their initial segments! For instance:

Theorem 5 (well-ordered sets are comparable): Given two well ordered sets \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), at least one of the following holds:

(a) \(S\) is isomorphic to an initial segment of \(T\), and/or
(b) \(T\) is isomorphic to an initial segment of \(S\).

Proof: We proceed by induction on \(S\).

Base case: \(S = \emptyset\). This trivially gives case (a).
Successor case: \(S = S' \cup \{x\}\), for some initial segment \(S'\) and some maximal element \(x\). If any initial segment of \(S'\) is isomorphic to \(T\), we have case (b).
Otherwise, by our inductive hypothesis, \(S'\) is isomorphic to a strict initial segment of \(T\): call it \(T' \subset T\). Let \(f: S' \leftrightarrow T'\) be the isomorphism, and \(y\) be the least element of \(T \setminus T'\). We augment \(f\) with \(f(x) := y\), giving case (a).
Limit case: \(S\) has no maximal element. If any strict initial segment \(S' \subset S\) is isomorphic to \(T\), we have case (b).
Otherwise, let \(\{S_i\}\) be the set of all strict initial segments of \(S\). \(S' = \bigcup S_i\) by our no maximal element assumption.
By our inductive hypothesis, each \(S_i\) is isomorphic to some initial segment of \(T\): call it \(T_i\).
Let \(T' = \bigcup T_i\). This is trivially an initial segment of \(T\). We construct \(f: S' \leftrightarrow T'\) as per case (a) by taking the union of all \(f_i: S_i \leftrightarrow T_i\). (I.e. let \(f(x) = f_i(x)\) for any \(f_i\) with \(x\) in its domain.)
This is well-defined as a consequence of Theorem 3.

Corollary 6 (well-ordered sets are totally ordered): Given two well ordered sets \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), exactly one of the following holds:

(a) \(S\) is isomorphic to an initial segment of \(T\) (which is not \(T\)), or
(b) \(T\) is isomorphic to an initial segment of \(S\) (which is not \(S\)), or
(c) \(S\) is isomorphic to \(T\).

Informally, the ordinals can be thought of as equivalence classes of well-ordered sets. Examples of ordinals include:

\(0\), which corresponds to the structure of the empty set.
\(1\), which corresponds to the structure of a singleton set.
\(26\), which corresponds to the structure of \(\{0, 1, \ldots, 25\}\), and also the structure of \(\{A, B, C, \ldots, Z\}\). (As it so happens, this is isomorphic to all well-ordered sets with that many elements.)
\(\omega\), which corresponds to the structure of \(\mathbb{N}\) and \(\mathbb{Z}^+\) (...and any other well-ordered set which has the structure of the natural numbers).
\(\omega^2\), which corresponds to the structure of \(\langle \mathbb{N}^2, \leq \rangle\).

(We can't actually define such equivalence classes per se in Zermelo–Fraenkel set theory, since they're too big: e.g. were we able to define the equivalence class of \(1\), we would be able to transform it into a set containing all sets, thus hitting Russell's Paradox. We'll look at a feasible definition of ordinals later.)

By Corollary 6, the ordinals are totally ordered. Furthermore, with an argument similar to that in Theorem 4, we can show the ordinals are well-ordered!

References:
Allard, W., notes for Basic Analysis I, retrieved from here 25th January 2014.
...many other sources besides.

Open Endings

Sunday, January 26, 2014