The causal Markov confusion

by James Thorniley

Out in the Pacific there are two islands named Foo and Bar. Two ferries, the good ship Fizz and the good ship Buzz, each cross between them once per day (that is, if Fizz starts the day on Foo it will end the day on Bar). For each ship, then, we can write out the list of islands it ends each day on, using the initials F for Foo and B for Bar:

Fizz:    F B F B F B F B F B F B F B F B
Buzz:    B F B F B F B F B F B F B F B F

Meanwhile, on Foo island, Fry is searching for his friend Bender. But Bender is on Bar. Fry learns of this and resolves to hop aboard the good ship Buzz and be reunited with Bender in the evening. But alas! Bender has similarly reasoned that Fry is on Foo and has therefore set sail with the Fizz. Having passed each other during the day, Fry is stranded on Bar and Bender on Foo; they will have to wait until tomorrow before they can do anything about it.

The morning comes, but in a cruel and ironic twist of fate, the two pals make much the same error they did on the first day: Bender hops back on the Fizz to meet Fry on Bar, but Fry has left Bar on the Buzz in search of Bender on Foo! There seems to be no way for the two to meet, and they are doomed to repeat this process over and over, ending their days on opposing islands:

Bender:  F B F B F B F B F B F B F B F B
Fry:     B F B F B F B F B F B F B F B F

You might have noticed that Fry always boards the Buzz and Bender always boards the Fizz. The patterns of their journeys look identical. But the underlying causes are very different: the Fizz and Buzz simply swap islands as a matter of course (it’s what they are there for), whereas Fry and Bender are moving islands in order to find each other.

It’s very depressing thinking of these two searching for each other in vain until the end of time. However, a small misfortune hits the Buzz one day, and it is stranded on Foo for a few nights until it can be repaired. At this point, the strangest thing happens:

Fizz:    F B F B F B F B F
Buzz:    B F F F F B F B F
Bender:  F B F F F F F F F
Fry:     B F F F F F F F F

This intervention – namely waylaying the Buzz on a single island for a few days – makes a drastic change in the overall game. Bender and Fry meet on Foo, and stay there drinking cheap cocktails. Once the Buzz is fixed, it resumes swapping islands, but this time ends each night on the same island as the Fizz.

This highlights that it is important to make an intervention if you want to unambiguously find causal relationships. When you have a pair of time series that looks like the original time series for Fizz and Buzz, it’s tempting to infer that there is no causal link between the two ships because neither path adds anything to predicting the other – you can predict where the Fizz will be tomorrow night from its own history, without knowing anything about the Buzz. In the case of the Fizz and Buzz, this inference would actually be correct.
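To make the "predict the Fizz without knowing anything about the Buzz" claim concrete, here is a minimal Python sketch. It tabulates what follows each observed state in the original series, once using only the Fizz's own position and once using both ships. The helper names (`next_state_counts`, `is_deterministic`) are made up for illustration, and this is plain counting rather than a proper causal test.

```python
from collections import Counter

# The original end-of-day series from the top of the post.
fizz = "FBFBFBFBFBFBFBFB"
buzz = "BFBFBFBFBFBFBFBF"


def next_state_counts(target, *context):
    """Count tomorrow's value of `target` for each combination of
    today's values in the `context` series."""
    counts = {}
    for t in range(len(target) - 1):
        key = tuple(series[t] for series in context)
        counts.setdefault(key, Counter())[target[t + 1]] += 1
    return counts


def is_deterministic(counts):
    """True if every observed context was followed by only one outcome."""
    return all(len(outcomes) == 1 for outcomes in counts.values())


# Predict the Fizz from its own past only, then from both ships.
own_only = next_state_counts(fizz, fizz)
with_buzz = next_state_counts(fizz, fizz, buzz)

print(is_deterministic(own_only))   # True
print(is_deterministic(with_buzz))  # True
```

Both checks come out deterministic, so knowing the Buzz buys no extra predictive power over the Fizz's own history. The pre-breakdown Bender and Fry series are letter-for-letter the same, so they pass the same check, which is exactly the trap.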

Before the intervention, Fry and Bender also appear to swap islands independently, and to an outside observer with only the time series data to look at they might not appear causally linked. But when the perturbation is introduced, Fry is forced to spend an extra night on Foo. This allows Bender to catch up with him, and after that Bender stays on Foo. This gives us a clue that perhaps Bender’s movements are in fact dependent on Fry’s – they are causally linked.

Furthermore, even with the intervention it is clear that the boats aren’t linked to each other – the Fizz keeps on flipping islands even when the Buzz is waylaid, and when the Buzz gets going again it has swapped round its relationship with Fizz, and makes no attempt to correct that.
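The whole story can be replayed in a few lines of Python. This is a sketch under assumed rules that the story only implies: each ferry crosses every day unless it is stranded, and each friend boards a ferry only if the other friend is on a different island and a sailing ferry is docked where he is. The function `simulate` and its `buzz_stranded` parameter are hypothetical names chosen for this example.

```python
FLIP = {"F": "B", "B": "F"}


def simulate(days, buzz_stranded=()):
    """Return end-of-day island series for both ferries and both friends.

    `buzz_stranded` holds the (1-indexed) days on which the Buzz
    cannot sail. The rules are assumptions, not taken verbatim
    from the story.
    """
    # Positions on the morning of day 1: Fry on Foo, Bender on Bar.
    pos = {"Fizz": "B", "Buzz": "F", "Bender": "B", "Fry": "F"}
    history = {name: [] for name in pos}

    for day in range(1, days + 1):
        sailing = {"Fizz": True, "Buzz": day not in buzz_stranded}
        new = dict(pos)

        # Ferries cross as a matter of course, unless stranded.
        for ship, sails in sailing.items():
            if sails:
                new[ship] = FLIP[pos[ship]]

        # Friends only travel in order to find each other.
        together = pos["Bender"] == pos["Fry"]
        for person in ("Bender", "Fry"):
            ferry_here = any(
                sails and pos[ship] == pos[person]
                for ship, sails in sailing.items()
            )
            if not together and ferry_here:
                new[person] = FLIP[pos[person]]

        pos = new
        for name, island in pos.items():
            history[name].append(island)

    return {name: "".join(seq) for name, seq in history.items()}


undisturbed = simulate(9)
disturbed = simulate(9, buzz_stranded={3, 4, 5})
for name, series in disturbed.items():
    print(f"{name + ':':8} {' '.join(series)}")
```

With the Buzz stranded on days 3 to 5, the output matches the four series shown after the breakdown; with no breakdown, the friends keep mirroring the ferries forever. The ferries' update never looks at the other ferry, while each friend's update looks at where the other friend is, and that difference only shows up in the data once something perturbs the system.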

This concept is fairly simple, but in a rush to analyse time series data it’s tempting to forget that you really can’t make unlimited inferences just from data – you have to know something about the story behind the data. Preferably you might intervene in the system yourself, but even without that, a perturbation generated within the system (such as the delay caused by Buzz randomly breaking down) might help.

2 Comments to “The causal Markov confusion”

  1. I don’t disagree with any of this, so the following is just nitpicking: you didn’t necessarily actually intervene in the system at all. It sounded like the Buzz had an engine failure or something all by itself – which means the time series in question are actually something like

    Fizz: F B F B F B F B F B F B F B F B F B F B F B F B F
    Buzz: B F B F B F B F B F B F F F F B F B F B F B F B F
    Bender: F B F B F B F B F B F B F F F F F F F F F F F F F
    Fry: B F B F B F B F B F B F F F F F F F F F F F F F F

    (sadly I can’t make those line up properly but you get the point). I think you can infer a bit more about the causal structure from those time series than from just the part before the Buzz’s mishap. Ultimately I think the only way to test causality for sure is to intervene, but given enough data and a little bit of prior knowledge you can often make a good guess about it without intervening at all.

  2. Yep, agreed. In fact any “randomness” in the system would (in this simple case) probably be enough to clarify things, so long as you know (or explicitly assume) that the randomness is there.
