So it’s quite easy to make hi-res plots of map functions with a GPU. The result is cool and science fictiony:

## More weird properties of chaos: non-mixing

We’ve done “chaos is not randomness” before. Here’s another interesting property to do with mixing.

Mixing is a property of dynamical systems whereby the state of the system in the distant future cannot be predicted from its initial state (or any given state a long way in the past). This is pretty much the same as the kind of mixing you get when you put milk in a cup of tea and swirl it around: obviously when you first put the milk in, it stays roughly where you put it, but after time it spreads out evenly. The even spread of the milk will be the same no matter where you put the milk in originally. More formally, if

$\rho_0$

is a “distribution” or density function of where the “particles” of milk are when you have just put them in the tea, and

$\rho_t$

is the distribution after $t$ seconds, then “mixing” is formally defined as

$$\lim_{t \to \infty} \Pr(x_t \in A \text{ and } x_0 \in B) = \Pr(x_t \in A)\,\Pr(x_0 \in B)$$

for any two regions $A$ and $B$ of the cup, where $x_t$ is the position of a milk “particle” at time $t$ (so $x_0$ is distributed according to $\rho_0$ and $x_t$ according to $\rho_t$).

You don’t have to think about these distributions as probability distributions, but I find it easier if you do. For those who know probability, it is obvious that what the above is saying is that where the milk is after a long time is probabilistically independent of where it was at the start.

In cups of tea, this happens (mostly) because of the “random” Brownian motion of the milk (possibly enhanced by someone swirling it with a spoon).
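The idea is easy to sketch numerically. Here is a minimal sketch (my choice of example system, not one from the tea discussion) using the doubling map $x \mapsto 2x \bmod 1$, a textbook mixing chaotic system: a blob of “milk” that starts concentrated in a tiny interval spreads out essentially evenly after a few iterations.

```python
import random

random.seed(42)

# Doubling map: a textbook mixing chaotic system on the unit interval.
def doubling_map(x):
    return (2.0 * x) % 1.0

# The "milk" starts concentrated in a tiny region of the "cup".
points = [random.uniform(0.0, 0.05) for _ in range(10_000)]

def fraction_in_upper_half(pts):
    """Fraction of particles in [0.5, 1): about 0.5 means evenly spread."""
    return sum(1 for x in pts if x >= 0.5) / len(pts)

frac_before = fraction_in_upper_half(points)  # 0.0: all milk in one spot
for _ in range(20):
    points = [doubling_map(x) for x in points]
frac_after = fraction_in_upper_half(points)   # roughly 0.5: spread out

print(frac_before, frac_after)
```

Note that this map is completely deterministic: no Brownian motion or spoon is needed for it to mix.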

## Small, far away

## Intuitive example of expectation maximization

I’ve been looking at the Expectation-Maximization (EM) algorithm, which is a really interesting tool for estimating parameters in multivariate models with latent (i.e. unobserved) variables. However, I found it quite hard to understand it from the formal definitions and explanations on Wikipedia. Even the more “intuitive” examples I found left me scratching my head a bit, mostly because I think they are a bit too complicated to get an intuition of what’s going on. So here I will run through the simplest possible example I can think of.
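As a taster, here is a minimal sketch of EM in code, with hypothetical data of my own choosing (not a definitive implementation): fitting the two unknown means of a 50/50 mixture of unit-variance Gaussians, where the latent variable is which component each point came from.

```python
import math
import random

random.seed(0)

# Hypothetical data: a 50/50 mixture of N(0, 1) and N(5, 1).
# The latent variable is which component each point came from.
data = [random.gauss(0, 1) if random.random() < 0.5 else random.gauss(5, 1)
        for _ in range(500)]

def normal_pdf(x, mu):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

# EM for the two unknown means; the variances (1) and mixture weights (0.5)
# are held fixed to keep the example as simple as possible.
mu1, mu2 = -1.0, 1.0  # deliberately bad starting guesses
for _ in range(50):
    # E-step: responsibility of component 1 for each data point.
    resp = [normal_pdf(x, mu1) / (normal_pdf(x, mu1) + normal_pdf(x, mu2))
            for x in data]
    # M-step: responsibility-weighted sample means.
    mu1 = sum(r * x for r, x in zip(resp, data)) / sum(resp)
    mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / sum(1 - r for r in resp)

print(mu1, mu2)  # the estimates end up near 0 and 5
```

The E-step guesses how much each point “belongs” to each component; the M-step then re-estimates each mean from its weighted share of the data; repeating the two steps climbs the likelihood.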

## A brief case study in causal inference: age and falling grades

An interesting claim I found in the press: there is some concern because GCSE grades in England this year were lower than last year. What caused this?

> One reason for the weaker than expected results was the higher number of younger students taking GCSE papers. The JCQ figures showed a 39% increase in the number of GCSE exams taken by those aged 15 or younger, for a total of 806,000.
>
> This effect could be seen in English, with a 42% increase in entries from those younger than 16. While results for 16-year-olds remained stable, JCQ said the decline in top grades “can, therefore, be explained by younger students not performing as strongly as 16-year-olds”.

Newspapers seem to get worried, whenever educational results come out, that there might be some dreadful societal decline going on, and that any change in educational outcomes might be a predictor of the impending collapse of civilisation. This alternative explanation, that the cohort was simply younger, is therefore quite interesting, so I thought it would be worth trying to analyse it formally to see if it stands up.
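Before any formal analysis, it is worth seeing that the claimed mechanism is plain arithmetic: if each age group performs exactly as it did last year but the mix shifts towards a lower-scoring group, the overall rate still falls. A sketch with hypothetical grade rates (only the under-16 entry counts echo the article's figures; everything else is made up):

```python
# Hypothetical top-grade rates (NOT real JCQ figures): each age group
# performs identically in both years.
rate_16 = 0.20        # top-grade rate among 16-year-olds
rate_under_16 = 0.10  # top-grade rate among under-16s

def overall_rate(n_16, n_under_16):
    """Cohort-wide top-grade rate given entry counts per age group."""
    return ((n_16 * rate_16 + n_under_16 * rate_under_16)
            / (n_16 + n_under_16))

# Under-16 entries grow ~39% to 806,000 (the article's figures);
# the 1,000,000 16-year-old entries are invented for illustration.
last_year = overall_rate(1_000_000, 580_000)
this_year = overall_rate(1_000_000, 806_000)

print(last_year, this_year)  # the overall rate falls, with no group changing
```

This is the classic composition (Simpson's-paradox-flavoured) effect: every group is stable, yet the aggregate moves.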

## “Algorithm”… You keep using that word…

The Guardian has a feature article entitled How Algorithms Rule the World:

> From dating websites and City trading floors, through to online retailing and internet searches (Google’s search algorithm is now a more closely guarded commercial secret than the recipe for Coca-Cola), algorithms are increasingly determining our collective futures.

The strange thing about this is that the algorithms mentioned are nothing like the algorithms you learn about in computer science. Usually, an algorithm refers to a (generally) deterministic sequence of instructions that allow you to compute a particular mathematical result; a classic example (the first offered by Wikipedia) being Euclid’s algorithm for finding the greatest common divisor (it doesn’t have to strictly involve numbers – anything symbolic will do: one could easily create an algorithm to transliterate this entire post so that it was ALL IN CAPITALS, for example).
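For reference, here is roughly what a computer scientist means by an algorithm, in both the Euclid case and the ALL IN CAPITALS case:

```python
def gcd(a, b):
    """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b);
    when the remainder hits zero, a is the greatest common divisor."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(1071, 462))             # 21
print("all in capitals".upper())  # ALL IN CAPITALS
```

Deterministic, terminating, and with a mathematically specified output: no data, no correlations, and certainly no world domination.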

By contrast the “algorithms” talked about by the Guardian are all about extracting correlations from data: working out what you are going to buy next, if and when you will commit a crime and so on. What they are talking about, I think, is *statistics* or *machine learning*. If you want a more trendy term, perhaps *data science*, but as far as I can tell these are all pretty much the same thing.

To say that the world was ruled by statistics would sound a bit twentieth century perhaps, so the hip and happening Guardian has maybe just found a more exciting term for an old phenomenon. But I think there is something more to their use of the word algorithm: I don’t think it is the right word, but there is something else they are trying to capture, as one of their interviewees says:

> “… The questions being raised about algorithms at the moment are not about algorithms per se, but about the way society is structured with regard to data use and data privacy. It’s also about how models are being used to predict the future. There is currently an awkward marriage between data and algorithms. As technology evolves, there will be mistakes, but it is important to remember they are just a tool. We shouldn’t blame our tools.”

The issue is not the standard use of statistics to find interesting stuff in data. The problem is how the results of this are used in society: applying the results from statistics in an automated way. This automation is the only commonality that I can see with the traditional meaning of an algorithm. In the case of the crime detection, insurance calculations or banking systems the problem is not that there is some data with correlations in it, but that decisions are being at least in part automated, producing either a politically disturbing denial of people’s individual agency or simply some dangerous automatic trades that can crash a stock market.

The term algorithm is being used here to describe something that has a “life of its own” – something Euclid’s algorithm clearly does not have. Euclid’s algorithm couldn’t “rule the world” if it tried (and it can’t *try*, because you have to be a conscious agent to do that). Algorithms are being talked about here as if they have their own agency: they can “identify” patterns (rather than be used by people to identify patterns), they can make trades all by themselves. They are scurrying about behind the scenes doing all sorts of things we don’t know about, being left to their own devices to live (semi) autonomous lives of their own.

I think that’s what scares people. Not algorithms as such but the idea of autonomous computational agents doing stuff without oversight, particularly if that stuff (like stock market trading or making decisions for the police) might later have an impact on real people’s lives.

## More wrong interpretations of P values – “repeated sampling”

A while ago I wrote a little rant on the (mis)interpretation of P-values. I’d like to return to this subject having investigated a little more. First, in this post, I’m going to point to an interesting little subtlety pointed out by Fisher that I hadn’t thought about before; in the second post, I will argue why P-values aren’t as bad as they are sometimes made out to be.

So, last time, I stressed the point that you can’t interpret a P-value as a probability or frequency of anything unless you add “given that the null hypothesis is true”. Most misinterpretations, e.g. “the probability that you would accept the null hypothesis if you tried the experiment again”, make this error. But there is one common interpretation that is less obviously false: “a P-value is the probability that the data would deviate as strongly or more strongly from the null hypothesis in another experiment than they did in the current experiment, given that the null hypothesis is true”. You might think this is a more careful statement, but the problem is that when we calculate P-values we in fact take into account aspects of the data that are not necessarily related to how strongly they deviate from the prediction of the null hypothesis. This can be misleading, so we’ll build it up more precisely in this post.
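To make the “given that the null hypothesis is true” clause concrete: a correctly computed P-value falls below a threshold α exactly a fraction α of the time when the null is true. A quick simulation, using a two-sided one-sample z-test with known unit variance (my example, not one from the original rant):

```python
import math
import random

random.seed(1)

def z_test_p(sample):
    """Two-sided P-value for H0: mean = 0, with known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    # Two-sided standard-normal tail probability.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Many experiments in which the null hypothesis really is true.
pvals = [z_test_p([random.gauss(0, 1) for _ in range(30)])
         for _ in range(2000)]

frac_significant = sum(1 for p in pvals if p < 0.05) / len(pvals)
print(frac_significant)  # close to 0.05
```

Under the null the P-value is uniformly distributed, which is precisely the “frequency” statement that survives once the conditioning clause is included.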

## Randomised controlled trials – the “gold standard”?

The UK government’s (ever so slightly creepily named) “Behavioural Insights Team” released a report [PDF] (relatively) recently called “Test, Learn, Adapt” (the authors include Ben Goldacre, well known for the book “Bad Science”, and the director of the York Trials Unit, David Torgerson) arguing that more policy decisions should be made on the basis of evidence from randomised controlled trials (RCTs). The report is a really good plain-English explanation of what RCTs are and how they work. It also gives examples of how RCTs can perhaps help to inform policies, by testing whether interventions such as back-to-work schemes or educational programs, um, “work”. According to the report’s blurb:

> RCTs are the best way of determining whether a policy or intervention is working.

It’s not hard to find opinion pieces backing up the report’s central idea, and the thesis that RCTs are the best way to “find things out”. Here’s one by Tim Harford, a writer who covers economics; a similar argument made by Paul Johnson, who is the director of an economics research group, the Institute for Fiscal Studies; and Prateek Buch, who is a research scientist. A phrase that keeps popping up is “gold standard”. RCTs are “the gold standard in evidence”, says Johnson, or the “gold-standard for showing that medical interventions are effective” according to Buch. Mark Henderson’s book, “The Geek Manifesto”, says that the RCT is “commonly considered the ‘gold standard’ for medical research because it seeks systematically to minimise potential bias through a series of simple safeguards”. What exactly does all this mean? I think it’s a question worth asking, since not all science involves RCTs. The Higgs boson, for example, was recently “discovered” (if that’s the word) without (as far as I can tell) the need to randomise test subjects. So are RCTs in fact the “gold standard”?
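What randomisation actually buys can be sketched with made-up numbers (mine, not the report's): suppose a back-to-work scheme does nothing at all, but motivated people are both more likely to enrol in it and more likely to find work anyway. Comparing enrolled against non-enrolled people shows a large “effect”; comparing randomly assigned groups does not.

```python
import random

random.seed(3)

def finds_work(motivated):
    """Hypothetical outcome model: the scheme itself does NOTHING;
    only motivation changes the chance of finding work."""
    return random.random() < (0.6 if motivated else 0.3)

people = [random.random() < 0.5 for _ in range(20_000)]  # motivated flags

# Observational "study": motivated people enrol far more often.
obs_scheme, obs_none = [], []
for motivated in people:
    enrols = random.random() < (0.8 if motivated else 0.2)
    (obs_scheme if enrols else obs_none).append(finds_work(motivated))

# RCT: a coin flip decides assignment, independent of motivation.
rct_scheme, rct_none = [], []
for motivated in people:
    assigned = random.random() < 0.5
    (rct_scheme if assigned else rct_none).append(finds_work(motivated))

def rate(outcomes):
    return sum(outcomes) / len(outcomes)

obs_effect = rate(obs_scheme) - rate(obs_none)  # spurious, from confounding
rct_effect = rate(rct_scheme) - rate(rct_none)  # near zero, the truth

print(obs_effect, rct_effect)
```

Randomisation balances the confounder (motivation) between the two arms, which is exactly the “simple safeguard” the gold-standard talk is pointing at.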

## Ironic science, pragmatism, and the “is best viewed as” argument

I’ve read a couple of interesting books recently: one was “The End of Science” by John Horgan, and the other was “Radical Embodied Cognitive Science” by Anthony Chemero. Horgan’s theme was the question of whether the fundamentals of science are now so solid that before long nothing genuinely “new” will be left to find, and science will be reduced either to obsolescence or to puzzle-solving application of existing theories to particular problems. The only other type of science that still exists, according to Horgan, is “ironic” science: a kind of semi-postmodern project to explain or describe what we already know in more “beautiful” or appealing forms, but one which never produces empirically testable hypotheses and for that reason doesn’t *actually* advance knowledge. Horgan is distinctly dismissive of this kind of science as not being “proper” science (he deliberately compares it to postmodern literary criticism, for which he seems to have particular contempt, having once been a student of it himself). Chemero would, I’m sure, be classified by Horgan as an ironic scientist. I don’t think Chemero could deny that, in a sense, his philosophy is empirically untestable, but he certainly argues that it *is* pragmatic in the sense of being useful to scientists engaged in solving real-world problems.

## Are you afraid of equations?

Jellymatter is, we claim, not afraid of equations, but apparently scientists are. A study in PNAS claims to have found that theoretical biology papers are cited less when they are densely packed with mathematical language. The authors argue that this impedes progress, since empirical work needs to be backed up by, and commensurate with, some theory to have deeper scientific meaning.