## A brief case study in causal inference: age and falling grades

An interesting claim I found in the press: there is some concern because GCSE grades in England this year were lower than last year. What caused this?

One reason for the weaker than expected results was the higher number of younger students taking GCSE papers. The JCQ figures showed a 39% increase in the number of GCSE exams taken by those aged 15 or younger, for a total of 806,000.

This effect could be seen in English, with a 42% increase in entries from those younger than 16. While results for 16-year-olds remained stable, JCQ said the decline in top grades “can, therefore, be explained by younger students not performing as strongly as 16-year-olds”.

Whenever educational results come out, newspapers seem to worry that some dreadful societal decline is under way, and that any change in educational outcomes might be a predictor of the impending collapse of civilisation. This alternative explanation of reduced age is therefore quite interesting, and I thought it would be worth trying to analyse it formally to see if it stands up.

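First, I think we can draw a graph, essentially a causal Bayes net, that looks like this:

[Figure: causal graph with two arrows, “age” → “GCSE grades” and “decline of civilisation” → “GCSE grades”]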

The “decline of civilisation” is perhaps a bit hyperbolic; a less exaggerated alternative cause might be “falling educational standards” or something like that. Whatever it is, it is something that we aren’t sure of. The point is to try to use the GCSE results to work out whether it is happening or not.

On the other hand, the ages of people taking exams are clearly known quite precisely, as is the actual outcome measure, so these aren’t controversial. The other thing that isn’t particularly controversial is the arrow from “decline of civilisation” to “GCSE grades”: while we don’t know if the decline itself is happening or not, we can be fairly sure (I think) that if it were, it would tend to produce worse GCSE grades. That is to say, all else equal, falling standards increase the probability that students will fail their GCSEs.

The first interesting question is whether the arrow from age to GCSE grades should be there. I.e., can we also say that, all else equal, younger students get worse GCSE grades?* It seems plausible, but is there any empirical evidence? The quote above tells us, in a roundabout way, that age and GCSE grades were correlated in this year’s data, because it said that the average results go down if you include all ages rather than looking only at a particular age (namely 16). In symbols:

$P(\mathrm{GCSE\ grade|Age}) \neq P(\mathrm{GCSE\ grade})$

This, according to Reichenbach’s principle of common cause, tells us one of two things:

• Age affects GCSE grades; or
• There is a common cause of age and GCSE grades.

(There is a third theoretical possibility that the GCSE grades someone gets causally affect their age when they take them, but I’m going to exclude this on temporal grounds — your age is determined before you get your GCSE results).

So in order to be sure that Age causes GCSE grades, we need to exclude the second possibility, that there is a common cause.
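To make the dependence between age and grades concrete, here is a minimal sketch of the composition effect the JCQ quote describes. All the numbers are hypothetical (they are not JCQ figures), and the function name `overall_top_rate` is my own. The per-age rates are held fixed across both years; only the share of younger entrants changes.

```python
# Hypothetical illustration: every number below is made up, not JCQ data.

def overall_top_rate(share_young, rate_young, rate_16):
    """P(top grade) for the whole cohort, via the law of total probability."""
    return share_young * rate_young + (1.0 - share_young) * rate_16

RATE_16 = 0.25     # assumed P(top grade | age 16), identical in both years
RATE_YOUNG = 0.15  # assumed P(top grade | age 15 or younger), identical in both years

# Only the share of younger entrants differs between the two years.
year1 = overall_top_rate(share_young=0.10, rate_young=RATE_YOUNG, rate_16=RATE_16)
year2 = overall_top_rate(share_young=0.14, rate_young=RATE_YOUNG, rate_16=RATE_16)

print(f"P(top | age 16), both years: {RATE_16:.3f}")
print(f"P(top) overall, year 1:      {year1:.3f}")
print(f"P(top) overall, year 2:      {year2:.3f}")
```

Under these assumptions the overall rate differs from the rate for 16-year-olds, and it falls when more younger students enter, even though nothing changes within either age group. This only shows that the pattern is arithmetically possible; it says nothing yet about whether age is the cause.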

It might be worth pointing out here that when we say “age” as a variable, we mean “the age at which people take GCSEs”, not people’s age in general, which is presumably “caused” by their birth date and the passage of time. The age at which they take GCSEs, however, is caused by factors such as their own decisions, educational policy, school policies, etc. This means that it may even (we don’t know) be caused by the very same “decline in standards” we have been talking about, but I’ll leave out that possibility for now.

So the alternative “common cause” graph (that we want to rule out) looks like this:

A possible alternative where there is a common cause of both age and grades which would explain the correlation. There are also presumably some other unmeasured causes of changing age. We can’t measure the common causes of age and grades directly either.

I think we can rule this out by the following logic. We know not only that age is correlated with GCSE grades, but that the statistical dependence seems to have remained the same over the two years: the article said that conditioning on entrants’ age affected the overall GCSE grades this year, and also that the grades for 16-year-olds were the same year on year (“stable”, in its words). However, we also know that the age distribution of entrants did change year on year. Since everything is caused by something, this, I think, strongly implies that the causes of the age distribution changed year on year. It isn’t a stretch to think that the hypothesised common causes therefore also changed year on year.

But if the age-grade correlation depended on a common cause, we would not expect the statistical dependence to remain the same when the common cause changed. This is because age isn’t dependent only on the common cause, but presumably also has some exogenous causes of its own (far left of the diagram), which would mean that it isn’t perfectly correlated with the common cause.

Here is another way to put it: we haven’t measured the common causes, but if the above graph is right, then if we did know the common causes C, they would “screen off” the age-grade (A-G) correlation. I.e.

$P(G|A,C) = P(G|C)$

As I say, we haven’t measured the common causes, but we do have good reason to believe that they would be different for the two years of data: in year 1, some unknown value C1; in year 2, some other value C2. The statement from the article is then effectively saying that what we have determined empirically is this:

$P(G|A,C=C1) = P(G|A,C=C2)$

If the screening-off relationship is true, then this would mean

$P(G|C=C1)=P(G|C=C2)$

But if C1 and C2 are different, and they really are causes of G, then this should not happen: our definition of a cause is that altering the causal variable changes the probability of its effect. It makes far more sense if it is A that screens off C from G (i.e. A is the direct cause of G, not simply correlated with it by virtue of a common cause).
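The same point can be checked numerically. The sketch below is a toy version of the common-cause graph, under assumptions of my own: C is binary, the year changes the distribution of C rather than fixing a single value, and all probabilities are invented. G depends only on C, so C screens off A from G by construction.

```python
# Hypothetical common-cause model: C -> A and C -> G, with binary C.
# All probabilities are made-up illustrations, not estimates from exam data.

def share_young(p_c1, p_young_given_c):
    """P(A=young), marginalising over the unobserved common cause C."""
    return (1.0 - p_c1) * p_young_given_c[0] + p_c1 * p_young_given_c[1]

def p_top_given_age(p_c1, p_young_given_c, p_top_given_c, young):
    """P(G=top | A), obtained by summing over the unobserved common cause C."""
    numerator = 0.0    # accumulates P(G=top, A)
    denominator = 0.0  # accumulates P(A)
    for c, p_c in ((0, 1.0 - p_c1), (1, p_c1)):
        p_age = p_young_given_c[c] if young else 1.0 - p_young_given_c[c]
        numerator += p_c * p_age * p_top_given_c[c]
        denominator += p_c * p_age
    return numerator / denominator

P_YOUNG_GIVEN_C = {0: 0.05, 1: 0.30}  # assumed P(A=young | C=c)
P_TOP_GIVEN_C = {0: 0.30, 1: 0.10}    # assumed P(G=top | C=c); G ignores A given C

results = {}
for year, p_c1 in (("year 1", 0.2), ("year 2", 0.4)):  # assumed P(C=1) per year
    young_share = share_young(p_c1, P_YOUNG_GIVEN_C)
    top_at_16 = p_top_given_age(p_c1, P_YOUNG_GIVEN_C, P_TOP_GIVEN_C, young=False)
    results[year] = (young_share, top_at_16)
    print(f"{year}: P(young) = {young_share:.3f}, P(top | age 16) = {top_at_16:.3f}")
```

With these made-up numbers, the shift in C that raises the share of younger entrants also moves the top-grade rate among 16-year-olds. That is the behaviour the argument says we should expect from a common cause, and it is not what the article reports. Finely tuned parameters could make the two years coincide, so this is an illustration of the generic case rather than a proof.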

I’m not 100% sure about this argument. But let’s say it will do for now. This justifies our original graph. Once we have that, the only thing we don’t know on the original graph is the value of the “decline of civilisation” variable. I think we can rule out any change in the decline of civilisation quite easily if this graph is correct, because here we can see that age A does not (according to the graph) screen off decline D from grades G. I.e. it should generally be the case that

$P(G|A,D) \neq P(G|A)$

But if the decline in year 1 was D1 and in year 2 it was D2, we have seen:

$P(G|A,D=D1) = P(G|A,D=D2)$

Since there is no screening-off relationship, the only obvious way to explain this is that D1 and D2 are the same. Remember that D1 and D2 are some value representing educational standards that we can neither (directly) control nor measure. Thus the “conditioning” on D has been done “naturally” in the data (whereas conditioning on A was done by the choice to split the data by age).

There could be many problems with this line of reasoning. Also, there may be other plausible graphs that would fit with the data and not exclude the possibility of changing standards. So if anyone has any other ideas in the comments…

* EDIT: Just to add to this: it is obvious from the data that younger students did get lower GCSE grades, but this data does not fulfill the all else equal condition. It is pretty certain that all else was not equal (there are all sorts of confounding factors with age: they may have gone to different types of schools, etc.). The point is to work out if the claim that age causes a change in grades is reasonable. Note, though, that we can’t easily do a randomized controlled trial (and if we tried I would bet it would be pretty useless and highly unethical), so the question is whether any causal conclusions can be drawn without trying to randomize the age at which people take GCSEs.

### 2 Comments to “A brief case study in causal inference: age and falling grades”

1. I think I can make the first argument more intelligible.

Proposition Q is that C is the common cause of A and G.

Q => (implies) :

(1) $P(G|A=*,C=*) = P(G|C=*)$

where * means for all possible values (of A, C)

Q also implies

(2) $P(G|C=c_1) \neq P(G|C=c_2)$

for any two different values c1, c2

Substituting (1) into (2) and taking the arbitrary value A=16, Q also implies

(3) $P(G|A=16,C=c_1) \neq P(G|A=16,C=c_2)$

The data showed that $P(G|A=16,C=c_1) = P(G|A=16,C=c_2)$ – which is not-(3) / ¬(3)

So we have the syllogism

Q => (3)
¬(3)
———-
¬Q

Therefore C is not a common cause of A and G.

• Should have written

(1) $P(G|A=*,C=c) = P(G|C=c)$

The c can be anything, but it has to be the same on the LHS and RHS (otherwise this would obviously contradict (2))