I’ve been thinking about how to teach measure theory starting from building a probability measure on the sequence of infinite tosses of a fair coin instead of the standard treatment which starts with Lebesgue measure. This post is an outline of how to translate Vitali’s proof of the existence of nonmeasurable sets for Lebesgue measures to this setup.
The Problem
Our sample space consists of infinite strings from the alphabet . We want to prove that there is no probability measure defined over all subsets of such that the different tosses are independent events with the probability of heads and tails on each toss equal to .
The strategy
Our strategy is to closely follow Vitali’s classic proof by contradiction in the case of Lebesgue measure.
Suppose we have a probability measure defined over all subsets of .
We find a countable series of transformations which are measure preserving, i.e. for all and all .
We then construct a set with the property that the sets form a partition of .
Now we have a contradiction. Since each has the same measure as , if has a positive measure, would end up having infinite measure. If has measure zero, would end up having measure zero. Either possibility contradicts the claim that is a probability measure.
While Vitali’s original proof used the translation independence of the Lebesgue measure, our proof uses the symmetry arising from the assumption that the coin tosses are fair and independent.
The transformations
Consider a finite set of indices . Then for any infinite sequence of outcomes we define to be the sequence of outcomes obtained by taking and flipping (i.e. changing heads to tails and vice versa) the outcome at each of the indices contained in .
So, for example, would flip the outcomes at the 5-th and 6-th positions in and leave the rest of the sequence unchanged.
We first note two elementary properties of these transformations. First, because flipping the same indices twice restores them to their original state, each is its own inverse. Second, for two sets and the composition flips twice (and hence restores) the indices belonging to both and and flips once the indices belonging to exactly one of them. Thus it equals ( being the symmetric difference of and ) and hence belongs to the class of transformations we are considering.
Next, we will show that it follows from independence and fairness that the transformations are measure preserving. The logic is simple. Independence lets us delink what happens at a given finite set of indices from what happens at the rest of the indices. Fairness implies that any sequence of outcomes of a given finite length has the probability . So a sequence like and its flipped version both have the probability . So a flip over a finite index set should not change the probability of any event. Now we put in the grunt work to show this formally.
Let the operation of flipping every outcome in a sequence, to be the outcomes in at indices in in and to be the the outcomes in at indices in the complement of . Let the set of all finite sequences of of length equal to .
Then for any event and any finite set of indices of size , we have by countable additivity, [Here’s why we have to make finite. If were infinite the set of sequences of length equal to the length of would be uncountable and we would not be able to use countable additivity.]
Using independence, we have
Using fairness to calculate
Using the definition of (flip all outcomes) and (flip outcomes in ) we have
Now contains all finite sequence of length . So as ranges over , also ranges over . Therefore,
Once again using fairness to calculate
But by independence, and our proof that is measure-preserving is done.
The partition
Now we construct a set whose images under the different will be a partition of . To do so we first define an equivalence relation on elements of . We say that if there is a with . We check that this is really an equivalence relation.
- [Since is the identity.]
- . [If then because is its own inverse it follows that .]
- and implies . [If and then .]
Let be a set formed by taking one element each from each of the equivalence classes formed by the equivalence relation . Since this selection of an element from each class is arbitrary we have to rely on the Axiom of Choice to ensure that the set exists.
Now we claim that as ranges over all finite sets of indices is a partition of .
First we see that every belongs to some . This is because belongs to one of the equivalence classes of . If is the representative of this class that has been included in then by the definition of there is some such that and hence .
Next we see that if then . Suppose on the contrary that for some . Using the properties of we have By the definition of this means But since contains exactly one element from each equivalence class this means that (say), so that But since no sequence can remain unchanged if some outcomes are flipped it must be the case that which implies and contradicts our assumption that .
Thus we have established that the form a partition of .
The contradiction
We have from the previous section where the union is disjoint and ranges over all finite sets of indices. The collection of finite sets of indices is countable (check). Hence we are justified in using countable additivity to get
which because the ’s are measure-preserving gives us
Now there are two possibilities. in which case we get , or that in which case we get . In either case we have a contradition with the requirement from a probability measure that .
References
Other proofs of this result can be found in Blackwell and Diaconis (1996) and in Chapter 5 of Oxtoby’s Measure and Category.