(Content warning: maths. This was written as an exercise in approaching the underlying ideas with rigour and understanding. If it's of any help to other people, this is a happy side effect.)
In discrete mathematics, the WellOrdering Principle for natural numbers states that any nonempty set of natural
numbers contains a least element of itself
.
Slightly more formally: \[\forall S \subseteq \mathbb{N} : \; S = \emptyset \lor \exists x \in S (\forall y \in S (x \leq y))\]
Example: Let \(S = \{12,3,189,40,25\}\). Then the least element of \(S\) is \(3\).
Example: Let \(S = \{\}\). The WellOrdering Principle doesn't apply because \(S\) is empty.
Example: Let \(S = \mathbb{N}\). Then the least element of \(S\) is \(0\).
Example: Let \(S = \{n \in \mathbb{N} \;\; n\textrm{ odd}\} = \{1,3,5,\ldots\}\). Then the least element of \(S\) is \(1\).
The interesting thing here is that this applies to every subset of \(\mathbb{N}\), of which there are uncountably many. Some of them we can't compute membership for, e.g. the image of the busy beaver function. Most of them we can't even give a finite description for.
Another principle, Mathematical Induction, states that for any set of natural numbers, if it contains \(0\), and if it contains the successors of all its elements, then that set is \(\mathbb{N}\)
.
Slightly more formally: \[\forall S \subseteq \mathbb{N} : \; 0 \in S \land (\forall x \in \mathbb{N} : x \in S \Rightarrow x+1 \in S) \Rightarrow S = \mathbb{N}\]
This is more commonly phrased in terms of predicates: if \(P(0)\), and if \((\forall x) P(x) \Rightarrow P(x+1)\), then \(P\) is true for all natural numbers.
Claim 1: the WellOrdering Principle is equivalent to the Principle of Mathematical Induction.
(This is assuming we're starting with some basic axioms of Peano Arithmetic: unique successors, \(0\) as the only number without a predecessor, addition. We can define \(\leq\) in terms of addition.)
Proof (\(\Rightarrow\)):
Let \(S \subseteq \mathbb{N}\) be given such that it contains 0 and also contains the successors of all its elements. Assume for contradiction's sake that \(S \neq \mathbb{N}\). Then \(\mathbb{N} \setminus S\) is nonempty. Let \(n \not \in S\) be its least element: this exists, by our WellOrdering Principle assumption, and can't be \(0\), by our base case
assumption.
Then \(n\) has a unique predecessor \(n1\), which, since it is less than \(n\), must be in \(S\). But by our inductive case
assumption, \(n1 \in S \Rightarrow n \in S\). This is a contradiction. Thus \(S = \mathbb{N}\).
Proof (\(\Leftarrow\)): Let \(S \subseteq \mathbb{N}\) be given. Assume that \(S\) has no least element. Then we will prove by induction that \(S\) must be empty. Our inductive hypothesis will be \(P(x) := \forall 0 \leq i \leq x : i \not \in S\).
Base case: As with the inductive case below.
Inductive case: Assume that \(i \not \in S\) for all \(0 \leq i < n\). We need to show that \(n \not \in S\). But this is obvious, since otherwise \(n\) would be the least element of \(S\).
(The above proof hides the innocentseeming assumption that \(n\) is the least element of \(\mathbb{N} \setminus [0, n1]\). Fortunately this is easy to show with a separate inductive proof.)
Side note!: The WellOrdering Principle, along with the Peano axioms, uniquely characterises the natural numbers. (It can't be expressed in firstorder logic; that's why it skirts around Gödel's Completeness + Incompleteness Theorems, which, combined, say you can't characterise the natural numbers using only firstorder statements.)
Extending the WellOrdering Principle
The ordinals (or: ordinal numbers) are an extension of the natural numbers which capture the same notion of wellorderedness that allows us to do induction.
Definition (total order): A total order over a set \(S\) is a reflexive, transitive relation \(\preceq\) such that for all \(a, b \in S\), at least one of \(a \preceq b\) or \(a \succeq b\). (A strict total order is one where \(a \preceq b \preceq a \Rightarrow a = b\).)
Definition (well order): A well order is a strict total order such that every subset of \(S\) contains a least element of itself.
Example: \(\langle \mathbb{N}, \leq \rangle\), i.e. the natural numbers with the usual ordering, is a wellordered set (by the WellOrdering Principle).
Example: \(\langle \mathbb{Z}, \leq \rangle\), i.e. the integers with the usual ordering, is not a wellordered set, since the set has no least element.
Example: \(\langle \mathbb{R}^{\geq 0}, \leq \rangle\), i.e. the nonnegative reals with the usual ordering, is not a wellordered set. For instance, the subset \((0,1]\) does not contain a least element of itself.
Example: \(\mathbb{N}^2\), ordered lexicographically
as follows:
\[
(x,y) \preceq (x', y') \Leftrightarrow \left\{\begin{array}{ll} x < x', &\textrm{ or} \\ x = x' \land y \leq y' \end{array}\right.
\]
...is a wellordered set. (Every subset has a least \(x\) value, and a least \(y\) value in that row.)
Wellordered sets allow us to perform an analogue of induction:
Theorem 2 (transfinite induction): Given a wellordered set \(\langle S, \leq \rangle\) and a predicate \(P\) over that set, then: \[ \left(\forall s \in S : (\forall s' < s : P(s')) \Rightarrow P(s)\right) \Rightarrow \forall s \in S : P(s)\]
(I.e. if if \(P\) is true everywhere before an element, then \(P\) is true at that element
, then \(P\) is true everywhere.)
Proof (sketch): As with normal induction: pick the least element where \(P\) is false, and thereby obtain a contradiction.
Example: Setting \(S = \mathbb{N}\) gives us general mathematical induction.
Example: Let \(S = \mathbb{N}^2\) ordered lexicographically. Then if:
 \(P(0,0)\), and
 \(\forall x, y: P(x,y) \Rightarrow P(x, y+1)\), and
 \(\forall x : (\forall y : P(x, y)) \Rightarrow P(x+1, 0) \) (i.e.
if \(P\) is true on all of the previous row, \(P\) is true at the start of this row)
In the above example we have broken down true everywhere before an element
into three cases specific to the structure of \(\mathbb{N}^2\).
Analogously to the natural numbers, this principle of transfinite induction is equivalent to wellorderedness. We can see it break down for the following set which isn't wellordered:
Example: Set \(S = [0,1]\) with the usual ordering on the reals. Let the predicate \(P(x)\) denote \(x = 0\)
.
 \(P(0)\) is true.
 Let \(x \in (0,1]\) be given, and assume \(P(x')\) for all \(0 \leq x' < x\). Then:\[\begin{align*} & 0 < \frac{x}{2} < x \\ \Rightarrow & P\left(\frac{x}{2}\right) \\ \Rightarrow & \frac{x}{2} = 0 \\ \Rightarrow & x = 0 \end{align*}\] ...and hence \(P(x)\).
If transfinite induction was valid on \(\langle [0,1], \leq \rangle\) then this would constitute a proof that \([0,1] = \{0\}\), which is clearly not true.
Characterising wellordered sets
We set out to prove that all wellordered sets have similar structure
: in fact, given any two wellordered sets, (at least) one of them is isomorphic to a subset of the other.
Definition (isomorphism, wellordered sets): Two wellordered sets, \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), are isomorphic if there exists a bijection \(f: S \leftrightarrow T\) such that \(x \leq_S y \Leftrightarrow f(x) \leq_T f(y)\).
Example: \(\langle \{0,1,\ldots,25\}, \leq \rangle\) is isomorphic to the set of upper case letters A
Z
, ordered lexicographically.
Example: \(\langle \mathbb{N}, \leq \rangle\) is isomorphic to \(\langle \mathbb{Z}^+, \leq \rangle\) (the positive integers). We map \(n\) to \(n+1\).
Example: \(\langle \mathbb{N}, \leq \rangle\) is not isomorphic to \(\langle \{0,1,\ldots,25\}, \leq \rangle\), since the sets have different sizes.
Example: \(\langle \mathbb{N}, \leq \rangle\) is not isomorphic to \(\langle \mathbb{N}^2, \leq \rangle\), even though both sets have the same size (i.e. countably infinite). Notice that \((1, 0) > (0, x)\) for infinitely many values of \(x\), but if we map \(n \in \mathbb{N}\) to \((1,0)\), then it is only greater than \(n\) elements — a contradiction. Hence there is no mapping that preserves the well ordering.
Theorem 3 (unique isomorphisms): If the wellordered sets \(S\) and \(T\) are isomorphic, there is exactly one such bijection \(f: S \leftrightarrow T\) between them.
Proof: Let isomorphic wellordered sets \(S,T\) be given, with bijections \(f,g: S \leftrightarrow T\). Assume \(f \neq g\), and consider the least element \(x \in S\) such that \(f(x) \neq g(x)\).
We have \(g^{1}(f(x)) > x\), and thus \(f(x) > g(x)\). But similarly \(g(x) > f(x)\), a contradiction.
Definition (initial segments): Given a wellordered set, \(\langle S, \leq \rangle\), an initial segment of \(S\) is a wellordered set \(\langle S', \leq \rangle\), where \(S' \subseteq S\) and for all \(s \in S, s' \in S'\), if \(s \leq s'\) then \(s \in s'\). (Or: an initial segment is a subset of a wellordered set which is closed under picking a smaller element
.)
Example: \(\langle \{0,1,\ldots,25\}, \leq \rangle\) is an initial segment of \(\langle \mathbb{N}, \leq \rangle\).
Example: \(\langle \{0,1,\ldots,22,23,25\}, \leq \rangle\) is not an initial segment of \(\langle \mathbb{N}, \leq \rangle\), because it does not contain the smaller
element \(24 < 25\).
Example: \(\langle \{(0,n) \;\; n \in \mathbb{N}\}, \leq \rangle\) is an initial segment of \(\langle \mathbb{N}^2, \leq \rangle\).
Theorem 4 (initial segments are wellordered): The initial segments of \(S\) are wellordered under subset inclusion.
Proof: It is easy to show that they are totally ordered (namely, two initial segments cannot be incomparable
). The crux of this proof, then, is showing that a collection of initial segments has a least element.
Let \(\{S_i\}\) be any collection of initial segments of \(S\). The only plausible candidate for least element of \(\{S_i\}\) is: \[S' := \bigcup S_i\]
We must show that \(S' \in \{S_i\}\). Assume otherwise. Then for each \(S_i\) define \(x_i\) to be the least element of \(S_i \setminus S'\). By wellorderedness of \(S\), \(\{x_i\}\) has a least element \(x\) such that \(\forall S_i : x \in S_i\). But then this implies \(x \in S'\), a contradiction!
This result gives us the ability to characterise wellordered sets themselves by doing transfinite induction of their initial segments! For instance:
Theorem 5 (wellordered sets are comparable): Given two well ordered sets \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), at least one of the following holds:
 (a) \(S\) is isomorphic to an initial segment of \(T\), and/or
 (b) \(T\) is isomorphic to an initial segment of \(S\).
Proof: We proceed by induction on \(S\).
 Base case: \(S = \emptyset\). This trivially gives case (a).

Successor case: \(S = S' \cup \{x\}\), for some initial segment \(S'\) and some
maximal element
\(x\). If any initial segment of \(S'\) is isomorphic to \(T\), we have case (b).Otherwise, by our inductive hypothesis, \(S'\) is isomorphic to a strict initial segment of \(T\): call it \(T' \subset T\). Let \(f: S' \leftrightarrow T'\) be the isomorphism, and \(y\) be the least element of \(T \setminus T'\). We augment \(f\) with \(f(x) := y\), giving case (a).

Limit case: \(S\) has no
maximal element
. If any strict initial segment \(S' \subset S\) is isomorphic to \(T\), we have case (b).Otherwise, let \(\{S_i\}\) be the set of all strict initial segments of \(S\). \(S' = \bigcup S_i\) by our
no maximal element
assumption.By our inductive hypothesis, each \(S_i\) is isomorphic to some initial segment of \(T\): call it \(T_i\).
Let \(T' = \bigcup T_i\). This is trivially an initial segment of \(T\). We construct \(f: S' \leftrightarrow T'\) as per case (a) by taking the
union
of all \(f_i: S_i \leftrightarrow T_i\). (I.e. let \(f(x) = f_i(x)\) for any \(f_i\) with \(x\) in its domain.)This is welldefined as a consequence of Theorem 3.
Corollary 6 (wellordered sets are totally ordered): Given two well ordered sets \(\langle S, \leq_S \rangle\) and \(\langle T, \leq_T \rangle\), exactly one of the following holds:
 (a) \(S\) is isomorphic to an initial segment of \(T\) (which is not \(T\)), or
 (b) \(T\) is isomorphic to an initial segment of \(S\) (which is not \(S\)), or
 (c) \(S\) is isomorphic to \(T\).
Informally, the ordinals can be thought of as equivalence classes of wellordered sets. Examples of ordinals include:
\(0\)
, which corresponds to the structure of the empty set.\(1\)
, which corresponds to the structure of a singleton set.\(26\)
, which corresponds to the structure of \(\{0, 1, \ldots, 25\}\), and also the structure of \(\{A, B, C, \ldots, Z\}\). (As it so happens, this is isomorphic to all wellordered sets with that many elements.)\(\omega\)
, which corresponds to the structure of \(\mathbb{N}\) and \(\mathbb{Z}^+\) (...and any other wellordered set which has the structure of the natural numbers).\(\omega^2\)
, which corresponds to the structure of \(\langle \mathbb{N}^2, \leq \rangle\).
(We can't actually define such equivalence classes per se in Zermelo–Fraenkel set theory, since they're too big
: e.g. were we able to define the equivalence class of \(1\), we would be able to transform it into a set containing all sets, thus hitting Russell's Paradox. We'll look at a feasible
definition of ordinals later.)
By Corollary 6, the ordinals are totally ordered. Furthermore, with an argument similar to that in Theorem 4, we can show the ordinals are wellordered!
Allard, W., notes for
Basic Analysis I, retrieved from here 25th January 2014.
...many other sources besides.