“Not everything that can be counted counts”
—William Bruce Cameron
It’s clear that a set containing the Seven Dwarfs has the same size as a set containing the Days of the Week, and that both sets are smaller then the set of all college students in the US. But what does size mean when we have infinite sets? How does the infinite set of WFFs of Predicate Logic compare (in size) with the infinite set of all Natural Deduction proofs in Propositional Logic?
Georg Cantor invented a very simple and general criterion to decide whether one set has more elements than another. For familiar finite sets we get exactly the results we would expect. For infinite sets, there are surprises.
Most famously, there are more real numbers than there are natural numbers (not all infinities are the same size), but there as many natural numbers as there are rational numbers (even though there are infinitely many rational numbers in-between each pair of consecutive natural numbers).
For this section, we will just take the natural numbers to be the familiar set of non-negative integers.
It is common to use the notation to denote the size of set .
However, that immediately raises the question of what sizes are! For example, the size of a set can’t always be a natural number, because the set of integers is infinite, and we already said that infinity is not a natural number.
So instead, we will use a clever idea of Cantor to directly define whether one set is no larger than another; it sidesteps the need to calculate sizes and compare them.
Recall that function is injective when that implies .
Equivalently, when is injective we know that guarantees .
Set is infinite if , and finite otherwise.
A set is infinite if there is an injection from to the set, and it’s usually easy to see if a set is infinite. A more interesting question is whether there is an injection from the set to , a concept known as countability.
All finite sets are countable, but deciding whether an infinite set is countable requires a bit more work. There are several approaches we can take.
One way to show a set is countable is to find a suitable injective function.
is countable.
Proof: It suffices to find a suitable injection, such as
is countably infinite.
Proof:
We’ve already proved that is infinite. It is countable () because the function defined by is injective.
Here’s another way to prove an infinite set is countably infinite.
An infinite set is countable if we can create an infinite list containing all the elements of (the first element of , the second element of , the third element, etc.).
It doesn’t matter what order we put the elements of as long as every element of must appear at some finite index in the infinite list. (Given any element of with enough time and effort we should be able to figure out its position in the list.) It is even OK if the list contains duplicates, as long as every element of appears somewhere.
Proof: If we can create such a list, then the function that maps each element of to its position in the list is an injection from to , and proves that .
(If the list has duplicates, we can get an injection from to by mapping each element of to the first index where that element occurs in the list.)
The following sets are clearly infinite, but they are also countable.
The set of all primes is countable.
If we list the primes in increasing order
then every prime is at some finite index in the list (and hence there is an injection from the set of primes to ).
The set of all integers is countable:
If we list the integers in order of increasing absolute values
then every integer occurs at some finite index in the list (and hence there is an injection from the set of integers to ).
The set of all pairs of natural numbers is countable: .
If we list all pairs (n,m) in order of their sum (and by increasing for pairs of the same sum) then every pair of natural numbers occurs at some finite index in the list:
(It’s true that rational numbers like 0 show up many times in this list, but as mentioned earlier, duplicates are fine as long as every element in the set eventually appears somewhere.)
The set of finite binary sequences is countable: .
If we list all sequences by length, and then alphabetically (or equivalently, in numerical order in binary) within sequences of the same length, then every finite sequence will appear at some finite position on this list:
Usually, there are many choices for the order of our list. For example, in Part (5), we could have chosen a different order for the strings of each length, and everything would still work.
However, we have to be careful that our list puts every element of the set at some finite index, and it is easy to get that wrong.
Here are some flawed proofs that sets are countable. (These sets are countable; it’s just that these aren’t valid proofs.)
Fatal flaw: There are infinitely many sequences of the form before we ever get to , so the pair does not have a finite index in this list.
The set of all finite sequences of ’s and ’s is countable; just list all sequences in alphabetical order.
Fatal flaw: If we list the sequences in pure alphabetical order, the the list starts out
and we have to get past infinitely many sequences in this list before we ever get to sequences containing a .
Or more simply, since all sequences starting with come alphabetically before all the sequences starting with , there’s no finite index for any of the sequences starting with .
Here is another useful fact:
The union of a countable number of countable sets is countable. In particular:
There are many other clever ways to prove a set is countable, but there is one that is particularly useful in Computer Science:
If each single element of could be represented using a computer file (given a sufficiently large, but finite, disk), then is countable.
Proof: On the disk, a computer file is nothing but a finite sequence of bits, i.e., finite sequences of 0’s and 1’s. If each element of can be stored in a computer file, then each element of corresponds to a (distinct) sequences of 0’s and 1’s. Since there are only countably many such sequences, there can only be countably many elements of .
In other words, the number of possible computer files is countable, so if each element of a set can be stored in a computer file, we can instantly conclude there are only countably many elements in that set.
(Note that we are only asking whether any one single element of a set could be stored in a computer file of finite size, not whether all the elements of a set can be stored in a single file at the same time!)
The infinite set is countable. Given any specific , we can imagine storing in a computer file.
For example, we could put into a one-line text file consisting
of a sign (+ or - ), the numerator in decimal (a finite natural
number), a slash, and the denominator (a finite natural number).
So there can’t be more rational numbers than computer files; since there are only countably many possible computer files, is countable.
Any finite set of rational numbers is countable. Given any particular set (with size 3 or 10 or one billion or …), we could list the members of the set in a text file, with one integer per line. Since the set contains finitely many rational numbers, and each rational number has a finite numerator and denominator, each such set corresponds to a text file with finitely many characters.
The infinite set of all finite binary trees labeled by integers is
countable. Given any one such tree, we can draw that tree and store
it in a (potentially large, but finite) .jpg or .png
image.
The set of all WFFs of Predicate Logic is countable. Each such WFF
can be written in LaTeX and saved in a .tex file.
The set of all possible essays on computer ethics is countable. Given any one such essay, we could store it in a Microsoft Word file.
The set of all possible books on Renaissance Art (with pictures) is countable. Given any book, we could scan it (e.g., to a resolution of 1 million dots per inch) and save the scan as a PDF. Since books only have finitely many pages, each PDF will be finite as well.
The set of all syntactially correct Java 8 programs is countable.
Each such program could be packaged into a single (finite)
.zip file containing all the code.
The set of all real numbers that a Python 3.12 program could
print is countable. Each Python program that prints digits could be
stored as a single .py file.
Of course, if you are talking to a mathematician rather than a computer scientist, instead of talking about computer files you could say something like, “Since every element of this set can be encoded into a distinct finite binary sequence, the set must be countable.”