Over the last couple of years I've become increasingly interested in various foundational approaches to mathematics -- logic, set theory, category theory, lambda calculus, and so on. It's fascinating to see how these different schools of thought provide accounts for each other, and themselves.

Classes, simply put, are an elegant dodge to get around Russell's Paradox while retaining some of the expressive power of universal quantification.

To elaborate: if we allow a set theory to express "the set of all sets", or "the set of all cardinal numbers", or even just "the set of all singleton sets", we quickly run into the ground. Contemporary set theories like ZF give us restricted quantification, allowing us to say "\(\forall x \in S \cdot P(x)\)" (given a set \(S\) and a predicate \(P\)) but not "\(\forall x \cdot P(x)\)". But it's still useful to be able to quantify over collections of things that are too large to be sets. E.g. "all sets either contain an element or are empty"; "all sets admit a well-ordering".

Classes (effectively) correspond to *predicates* in language. For example, "is a singleton set" might be described by "\(\textrm{Singleton}(x) := \exists y \forall z \cdot (z \in x \Leftrightarrow y = z) \)". In the usual first-order presentations of ZF, "is a set" is a predicate that always returns true. And "represents a well-ordering of \(S\)" might be shorthand for "\(x \subseteq S^2 \land \textrm{IsWellOrdering}(x)\)", where \(\textrm{IsWellOrdering}\) itself is shorthand for a more complicated predicate that checks whether something encodes a reflexive antisymmetric transitive relation without infinite descending chains.

Claims that all members of a class satisfy some additional property can be encoded as universal quantifiers in the language, "\(\forall x \cdot P(x) \Rightarrow Q(x)\)". (The logic *itself* is still allowed universal quantifiers: it's the set building axioms which must be denied access to these.)

Classes typically exist in the *metalanguage* (i.e. are not first-class citizens of the theory itself). Classes can correspond to sets (e.g. the class of all empty sets, the class of natural numbers) but this is not guaranteed. This distinction is enough to get us past the classical presentation of Russell's paradox:

"Does the set of all sets that don't contain themselves contain itself?"

This question is ill-formed! Our set theory's language doesn't allow us to express the concept, "set of all sets". The meta-language lets us talk about "all sets" as a *class*.

"Well, does the class of all sets that don't contain themselves contain itself?"

Still slightly ill-formed! Sets can't contain classes. Classes aren't *objects* in our language the same way sets or natural numbers are, whereas the containment relation, "\(\in\)", exists within the language and is only defined on pairs of objects within the language.

"Does the class of all sets that don't contain themselves contain the predicate corresponding to itself?"

That really depends on how we encode predicates within the language. Presumably, yes, we need some kind of Godelian encoding of predicates into objects, since predicates are prima facie not objects in our language.

If we encode predicates as specially structured sets, then the answer is trivially yes: the set doesn't contain itself (otherwise we have unsoundness). If we have a multi kind language with predicates encoded as tuples or other specially marked constructs, the answer is trivially no: the predicate for the class is not a set so it's not part of the class.

Either way, we've avoided paradox, largely because there's a complete disconnect between collections' encodings and the collections themselves.

We might try to restore the paradox by moving away from sets altogether and asking only about classes and their representations:

"Does the class of all predicates (encodings) that don't apply to their own encodings contain the predicate (encoding) corresponding to itself?"

This is thorny! Except it's ill-formed in a far more subtle way than before.

The catch is, we need to capture the concept of “this encoding of a predicate applies to that” within the object language. Which is to say, we need an *interpreter* for encodings. But to implement that, we’d effectively be recreating the preconditions for Godel’s incompleteness theorem. That is, an interpreter could be used to give a *truth predicate*, which is thus vulnerable to Lob’s theorem.