October 14, 2013

## Small, far away

Can you tell the difference between small and far away?

It’s funny because it seems obvious, but explaining how to distinguish small from far away using visual information alone is surprisingly tricky.

October 1, 2013

## 3D printed bee colour spaces

Think3DPrint3D has generously donated 3D printer time and plastic filament to the unusual task of rendering the usually intangible concept of honeybee colour spaces into real, physical, matter!

### The Spaces

The first two spaces we printed were chosen by me, partly because I think they are theoretically interesting and partly because, unlike some other colour spaces, they are finite-sized 3D objects.

October 1, 2013

## Intuitive example of expectation maximization

I’ve been looking at the Expectation-Maximization (EM) algorithm, which is a really interesting tool for estimating parameters in multivariate models with latent (i.e. unobserved) variables. However, I found it quite hard to understand from the formal definitions and explanations on Wikipedia. Even the more “intuitive” examples I found left me scratching my head a bit, mostly because I think they are a bit too complicated to give an intuition of what’s going on. So here I will run through the simplest possible example I can think of.
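The flavour of such an example can be sketched in a few lines. The following is my own minimal illustration, not the one from the post: a two-component 1D Gaussian mixture with known unit variances and equal mixing weights, where EM alternates between soft-assigning points to components (E-step) and re-estimating the component means (M-step).

```python
import numpy as np

rng = np.random.default_rng(0)
# Data: two clusters with unknown means; the latent variable is
# which cluster each point came from.
data = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])  # initial guesses for the two means
for _ in range(50):
    # E-step: responsibility of each component for each point
    # (unit variances and equal weights assumed, so only the
    # squared distance to each mean matters).
    d = np.exp(-0.5 * (data[:, None] - mu[None, :]) ** 2)
    resp = d / d.sum(axis=1, keepdims=True)
    # M-step: each mean becomes the responsibility-weighted average.
    mu = (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)
```

After a few iterations `mu` settles near the two cluster means, even though no point was ever hard-assigned to a cluster.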

September 25, 2013

## Rotation matrix from one vector to another in n-dimensions

Sometimes you need to find a rotation matrix that rotates one vector to face in the direction of another. Here’s some code to do it for vectors of arbitrary dimension; it’s at the bottom of the post.
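One standard construction (a sketch of my own, not necessarily the post’s code): rotate only within the 2D plane spanned by the two vectors and act as the identity on its orthogonal complement.

```python
import numpy as np

def rotation_matrix(a, b):
    """Return an n-by-n rotation matrix taking unit(a) to unit(b)."""
    a = np.asarray(a, float); a = a / np.linalg.norm(a)
    b = np.asarray(b, float); b = b / np.linalg.norm(b)
    c = a @ b                  # cos(theta)
    v = b - c * a              # component of b orthogonal to a
    s = np.linalg.norm(v)      # sin(theta)
    if s < 1e-12:
        # a and b (anti-)parallel: the plane of rotation is not
        # determined; return identity for the parallel case.
        return np.eye(len(a))
    v = v / s
    # Rotation by theta in the (a, v) plane, identity elsewhere.
    return (np.eye(len(a))
            + s * (np.outer(v, a) - np.outer(a, v))
            + (c - 1.0) * (np.outer(a, a) + np.outer(v, v)))
```

Applying it to `a` gives `cos(theta)*a + sin(theta)*v`, which is exactly the unit vector in the direction of `b`; the anti-parallel case is genuinely ambiguous (any plane containing `a` works) and is left unhandled here.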

August 23, 2013

## Irrationality and Experience

Why should we care if someone else is being irrational? And in what way can we know that this is actually the case?

The idea of rationality is often thought of as consistency within a collection of behaviours or beliefs. For example, if I think it is absolutely wrong to eat meat, I should think it is wrong to eat beef. As beef is a meat, it would be irrational for me to think it was wrong to eat meat but also believe that it is OK to eat beef. Similarly, it would be irrational to think that eating meat was absolutely wrong, and then eat a plate of steak. To be rational, my behaviour should also match what I think.

How do we know when we are being irrational? Asking about the rationality of other people seems easier at first. When we observe others, we can spot things that they ought not be doing if they were rational. For example, they might claim to be vegetarian whilst eating a steak. Most people, I would say, would call what they are doing inconsistent. But this doesn’t mean that they do not have a perfectly consistent way of thinking that accounts for their actions, and that it is us, the majority, who have failed to grasp it. How do you know that someone else is being irrational, and not you yourself?

August 22, 2013

## A brief case study in causal inference: age and falling grades

An interesting claim I found in the press: there is some concern because GCSE grades in England this year were lower than last year. What caused this?

One reason for the weaker than expected results was the higher number of younger students taking GCSE papers. The JCQ figures showed a 39% increase in the number of GCSE exams taken by those aged 15 or younger, for a total of 806,000.

This effect could be seen in English, with a 42% increase in entries from those younger than 16. While results for 16-year-olds remained stable, JCQ said the decline in top grades “can, therefore, be explained by younger students not performing as strongly as 16-year-olds”.

Whenever educational results come out, newspapers seem to worry that some dreadful societal decline is under way, and that any change in outcomes might be a predictor of the impending collapse of civilisation. This alternative explanation of reduced age is therefore quite interesting; I thought it would be worth trying to analyse it formally to see if it stands up.
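The claimed mechanism is a composition effect: every age group can perform exactly as before while the overall rate still falls, because the mix of entrants shifts towards a weaker group. A toy sketch with invented pass rates (only the 806,000 entries and the 39% increase echo the article; the 70%/60% rates are made up for illustration):

```python
# Hypothetical numbers: each group's top-grade rate stays fixed,
# but the proportion of younger (weaker-performing) entrants grows.
def overall_rate(n16, r16, n_young, r_young):
    """Entry-weighted average of the two groups' top-grade rates."""
    return (n16 * r16 + n_young * r_young) / (n16 + n_young)

# ~580,000 younger entries last year grows 39% to ~806,000 this year.
last_year = overall_rate(n16=600_000, r16=0.70, n_young=580_000, r_young=0.60)
this_year = overall_rate(n16=600_000, r16=0.70, n_young=806_000, r_young=0.60)
```

Here `this_year` comes out lower than `last_year` even though neither group's rate moved, which is exactly the JCQ explanation in miniature.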

July 4, 2013

## Every bad model of a system is a bad governor of that system

“The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.” (Ada Lovelace)

James recently posted about a Guardian article on “big data”. The article outlines the many roles which algorithms play in our lives, and some of the concerns their prevalence raises. James’ contention, as far as I understand, was with the focus on algorithms, rather than the deeper issues of control, freedom, and agency. These latter issues are relevant to all types of automation, from assembly lines to artificial intelligences.

Automation flies in the face of any attempt to give some human activity special merit, whether this is our capability to produce, to create, make choices, procreate, socialise or whatever. It relentlessly challenges our existential foundation: I am not made human by what I make, since a robot can replace me there; nor by what I think, since a computer can; and so on. Each new automation requires us to rethink what we are.

July 2, 2013

## “Algorithm”… You keep using that word…

The Guardian has a feature article entitled “How Algorithms Rule the World”:

From dating websites and City trading floors, through to online retailing and internet searches (Google’s search algorithm is now a more closely guarded commercial secret than the recipe for Coca-Cola), algorithms are increasingly determining our collective futures.

The strange thing about this is that the algorithms mentioned are nothing like the algorithms you learn about in computer science. Usually, an algorithm refers to a (generally) deterministic sequence of instructions that allow you to compute a particular mathematical result; a classic example (the first offered by Wikipedia) being Euclid’s algorithm for finding the greatest common divisor (it doesn’t have to strictly involve numbers – anything symbolic will do: one could easily create an algorithm to transliterate this entire post so that it was ALL IN CAPITALS, for example).
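To make the contrast concrete, here is Euclid’s algorithm in a few lines of Python (my own sketch; Wikipedia’s presentation differs in its details):

```python
def gcd(a, b):
    """Euclid's algorithm: the gcd is unchanged by replacing (a, b)
    with (b, a mod b), and the second argument shrinks to zero."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(252, 105))  # 21
```

A fixed, deterministic recipe with a provable result: nothing about it predicts what you will buy next.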

By contrast, the “algorithms” talked about by the Guardian are all about extracting correlations from data: working out what you are going to buy next, if and when you will commit a crime, and so on. What they are talking about, I think, is statistics or machine learning. If you want a more trendy term, perhaps data science, but as far as I can tell these are all pretty much the same thing.

To say that the world was ruled by statistics would sound a bit twentieth century perhaps, so the hip and happening Guardian has maybe just found a more exciting term for an old phenomenon. But I think there is something more to their use of the word algorithm: I don’t think it is the right word, but there is something else they are trying to capture, as one of their interviewees says:

“… The questions being raised about algorithms at the moment are not about algorithms per se, but about the way society is structured with regard to data use and data privacy. It’s also about how models are being used to predict the future. There is currently an awkward marriage between data and algorithms. As technology evolves, there will be mistakes, but it is important to remember they are just a tool. We shouldn’t blame our tools.”

The issue is not the standard use of statistics to find interesting stuff in data. The problem is how the results of this are used in society: applying the results from statistics in an automated way. This automation is the only commonality that I can see with the traditional meaning of an algorithm. In the case of crime detection, insurance calculations or banking systems, the problem is not that there is some data with correlations in it, but that decisions are being at least in part automated, producing either a politically disturbing denial of people’s individual agency or simply some dangerous automatic trades that can crash a stock market.

The term algorithm is being used here to describe something that has a “life of its own” – something Euclid’s algorithm clearly does not have. Euclid’s algorithm couldn’t “rule the world” if it tried (and it can’t try, because you have to be a conscious agent to do that). Algorithms are being talked about here as if they have their own agency: they can “identify” patterns (rather than be used by people to identify patterns), they can make trades all by themselves. They are scurrying about behind the scenes doing all sorts of things we don’t know about, being left to their own devices to live (semi) autonomous lives of their own.

I think that’s what scares people. Not algorithms as such but the idea of autonomous computational agents doing stuff without oversight, particularly if that stuff (like stock market trading or making decisions for the police) might later have an impact on real people’s lives.

June 24, 2013

## Measure Theoretic Probability for Dummies: Part I

Nothing makes me empathise more with those struggling with probability theory than reading things like this on Wikipedia:

Let (Ω, F, P) be a measure space with P(Ω)=1. Then (Ω, F, P) is a probability space, with sample space Ω, event space F and probability measure P.

This is written so that only the people who already know what it is saying can understand it. The only possible value of this sentence would be to someone who managed to study measure theory without being exposed to its most widespread application; in other words: no one! Whilst the attitude that this, and so many other Wikipedia pages, display encourages people to be precise in a way that mathematicians cherish, it also alienates a lot of perfectly capable, intelligent people who simply run out of patience in the face of the relentless influx of oblique statements.
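For what it’s worth, a minimal concrete instance of that definition (my example, not the quoted page’s) is a single fair coin toss:

```latex
\Omega = \{H, T\}, \qquad
\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \Omega\}, \qquad
P(\emptyset) = 0,\; P(\{H\}) = P(\{T\}) = \tfrac{1}{2},\; P(\Omega) = 1.
```

The sample space lists the outcomes, the event space lists every question you can ask about them, and the measure assigns each question a probability, with the whole space getting 1.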

Personally, I think that understanding probability spaces is very important, but, for reasons including those above, most people find the measure theoretic formalisation daunting. Here I have tried to outline the most widely used formalisation, which has turned out to be far more work than I expected…