A while ago I wrote a little rant on the (mis)interpretation of P-values. I’d like to return to this subject having investigated a little more. First, this post, I’m going to point to an interesting little subtlety pointed out by Fisher that I hadn’t thought about before, in the second post, I will argue why P-values aren’t as bad as they are sometimes made out to be.
So, last time, I stressed the point that you can’t interpret a P-value as a probability or frequency of anything, unless you say “given that the null hypothesis is true”. Most misinterpretations, e.g. “the probability that you would accept the null hypothesis if you tried the experiment again”, make this error. But there is one common interpretation that is less obviously false: “A P-value is the probability that the data would deviate as or more strongly from the null hypothesis in another experiment, than they did in the current experiment, given that the null hypothesis is true”. This is something that you might think is a more careful statement, but the problem is that in fact when we calculate P values we take into account aspects of the data not necessarily related to how strongly they deviate from the prediction of the null hypothesis. This could be misleading, so we’ll build it up more precisely in this post.