Most of us who’ve studied probability theory at university level will have learned that it is formalised using the Kolmogorov axioms. However, there is an interesting alternative way to approach the formalisation of probability theory, due to R. T. Cox. You can get a quick overview from this Wikipedia page, although it doesn’t really motivate it very well. If you’re interested, you’re much better off downloading the first couple of chapters of *Probability Theory: The Logic of Science* by Edwin Jaynes. It’s an excellent book (although sadly an incomplete one, because Jaynes died before he could write the second volume) and should be read by all scientists, preferably while they’re still impressionable undergraduates.

For Cox, probability theory is nothing less than the extension of logic to deal with uncertainty. Probabilities, in Cox’s approach, apply not to “events” but to *statements of propositional logic.* To say p(A)=1 is the same as saying “A is true”, and saying p(A)=0.5 means “I really have no idea whether A is true or not”. A conditional probability p(A|B) can be thought of as the extent to which B implies A.

There are a couple of interesting differences between Cox’s probabilities and Kolmogorov’s. Cox’s is more general, but also less formal (people are still working on getting it properly axiomatised). One important difference is that in Cox’s approach a conditional probability p(A|B) can have a definite value even when p(B)=0 (this can’t happen in Kolmogorov’s formalisation because, for Kolmogorov, p(A|B) is defined as p(AB)/p(B)). This means that, unlike the logical statement B ⇒ A, which counts as true whenever B is false, the probabilistic statement p(A|B)=1 doesn’t become true just because B is false. So conditional probabilities are like logical implications, only better, since they don’t suffer from that little weirdness.
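To make that weirdness concrete, here is a minimal Python sketch. The joint distribution and the helper names (`implies`, `prob`, `ratio_conditional`) are made up for illustration and assume a toy world in which B is never true. It only shows the classical side of the comparison: the implication comes out certain, while the ratio definition p(AB)/p(B) has nothing to say. It does not implement Cox’s approach.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B) truth values.
# B has probability zero in this toy world.
joint = {
    (True, True): Fraction(0),
    (False, True): Fraction(0),
    (True, False): Fraction(1, 3),
    (False, False): Fraction(2, 3),
}


def implies(b, a):
    """Material implication B => A: false only when B is true and A is false."""
    return (not b) or a


def prob(dist, event):
    """Total probability of the (a, b) outcomes for which event(a, b) holds."""
    return sum(p for (a, b), p in dist.items() if event(a, b))


def ratio_conditional(dist):
    """p(A|B) via the ratio definition p(AB)/p(B); None when p(B) = 0."""
    p_b = prob(dist, lambda a, b: b)
    if p_b == 0:
        return None  # the ratio is undefined
    return prob(dist, lambda a, b: a and b) / p_b


# The implication holds in every outcome where B is false, so it is certain here.
p_implication = prob(joint, lambda a, b: implies(b, a))

# The ratio definition gives no value at all, because p(B) = 0.
p_a_given_b = ratio_conditional(joint)

print(p_implication, p_a_given_b)
```

In this toy world “B implies A” gets probability 1 simply because B never happens, even though A itself is more likely false than true.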

Anyway, that’s cool but what I really wanted to write about was this: in Cox’s version of probability theory, it’s meaningful to talk about *the probability of a probability.* That is, you can write stuff like p(p(A|B)=1/2)=5/6 and have it make sense. I’ll get to an example of this in a bit.