## A random foray into data visualisation

I was thinking today about the distribution of wealth in the UK.  It’s easier to understand small numbers than large ones, so it occurred to me that you could construct a fictional population of 100 people with the same wealth distribution as the UK, and then visualise that by drawing 100 circles with areas proportional to those people’s wealth.  I’d never seen that done before, so I thought I’d give it a go. It was something of a failed experiment.  This was my first attempt:

For this one I just calculated the figures (using data from HMRC, via Wikipedia) and then drew the circles in a vector art program.  Unfortunately it looks kind of ugly.
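For anyone wanting to try this, the one step that is easy to get wrong is the scaling: it's the area of each circle, not its radius, that has to be proportional to wealth, so the radius goes as the square root. Here's a minimal sketch of that calculation. The function name, the `max_radius` parameter and the example figures are all made up for illustration; they are not the HMRC numbers behind my figures.

```python
import math

def radii_from_wealth(wealths, max_radius=50.0):
    """Return radii such that each circle's AREA is proportional to wealth.

    The richest person gets radius `max_radius`; everyone else is scaled
    by the square root of their wealth relative to the richest.
    """
    biggest = max(wealths)
    return [max_radius * math.sqrt(w / biggest) for w in wealths]

# Invented illustrative figures, NOT the real data.
example_wealths = [1, 4, 16, 100]
radii = radii_from_wealth(example_wealths)
```

Scaling the radius directly instead would make a person with four times the wealth look sixteen times as big, which exaggerates the inequality.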

I’m sure I could make it prettier by positioning the circles more carefully, but the problem is that my choices about where to put the circles probably distort the appearance of the data.  One way to get around that is to position them randomly.  Here’s what that looks like:

I couldn’t be bothered to check whether the circles were overlapping, so I just drew the biggest ones first.  That way a big circle can never end up hiding a smaller one behind it.
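The random version boils down to something like the sketch below: pick a uniformly random centre for each circle, and emit the circles from biggest to smallest so that whatever draws them paints the small ones on top. The canvas size, the seed and the function name are my own assumptions here, and the code assumes every circle is small enough to fit inside the canvas.

```python
import random

def random_layout(radii, width=800.0, height=600.0, seed=0):
    """Return (x, y, r) tuples in drawing order, biggest circle first.

    Centres are uniformly random, kept far enough from the edges that
    each circle stays inside the canvas. Overlaps are NOT checked.
    """
    rng = random.Random(seed)  # fixed seed so the layout is repeatable
    circles = []
    for r in sorted(radii, reverse=True):
        x = rng.uniform(r, width - r)
        y = rng.uniform(r, height - r)
        circles.append((x, y, r))
    return circles

circles = random_layout([5, 40, 10, 20])
```

Drawing in this order doesn't stop circles overlapping; it only guarantees that a small circle is never completely hidden by a larger one drawn after it.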

The problem with this, though, is that it somehow fails to capture the scale of the inequality.  It makes the poorer people look less important – there are fifty of the smallest size of dot, but it’s not immediately clear that they make up half of the dots.  Your eye is more likely to fall on one of the larger dots than one of the small ones – but what I really want is for the viewer to see that (all other things being equal) you’re just as likely to be any of these dots as any other, and the chances of being one of the rich ones are small.  This is something of an unsolvable problem with this idea, but I made a couple of attempts to get around it.

The next thing I tried was getting rid of the randomness, and spacing the circles evenly along the x axis.

This looks kind of pretty but somehow still doesn’t get across what I want it to.  Perhaps the best I could do is somewhere between these last two – have the y axis random, but still space them evenly along the x axis.
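As a rough sketch, that hybrid layout looks like this: each person gets an equal-width slot along the x axis, with the circle centred in its slot, and only the y coordinate is random. The strip dimensions and names are assumptions for illustration, and the circles are placed in whatever order you pass them in.

```python
import random

def strip_layout(radii, width=1000.0, height=200.0, seed=0):
    """Return (x, y, r) tuples: evenly spaced in x, uniformly random in y."""
    rng = random.Random(seed)
    step = width / len(radii)  # every person gets the same share of the x axis
    circles = []
    for i, r in enumerate(radii):
        x = (i + 0.5) * step   # centre of the i-th slot
        y = rng.uniform(0.0, height)
        circles.append((x, y, r))
    return circles

strip = strip_layout([5, 10, 20, 40])
```

Because the slots are equal, the fraction of the x axis a group occupies matches its share of the population, whatever the circle sizes.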

At least now it’s fairly clear that the small dots take up about 50% of the figure’s general area, though the largest circle extending to the left does make it look like slightly less than 50%.  And because the largest circle is so big, it ends up taking up way more than 1% of the space, even though it stands for just one person in a hundred.

I guess the moral of the story is probably that when it comes to data visualisation, you can’t go wrong with a good old graph, like this one:

This is based on the same data as the other figures, and gets across the point I was trying to make very starkly.  It’s immediately clear that there’s a big difference between the rich and the poor, and that the rich are few in number.  Since wealth is still proportional to area in this graph, you can also easily see that the total wealth held by the top percentile is quite a bit bigger than that held by the bottom 50%, something I hadn’t appreciated before I looked at it.
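The area comparison is easy to check numerically. If each of the 100 people is a bar of equal width, the area under any stretch of the graph is just the sum of those people's wealth. The sketch below uses invented numbers purely to show the calculation; they are not the HMRC figures, and the helper name is hypothetical.

```python
def wealth_share(wealths, lo, hi):
    """Fraction of total wealth held by people at sorted positions lo..hi-1.

    People are sorted from poorest to richest, as along the x axis.
    """
    ordered = sorted(wealths)
    return sum(ordered[lo:hi]) / sum(ordered)

# Invented population of 100: NOT the real distribution.
example = [1] * 50 + [5] * 49 + [60]

bottom_half = wealth_share(example, 0, 50)    # area under the poorest 50 bars
top_percent = wealth_share(example, 99, 100)  # area under the richest bar
```

With these made-up numbers the single richest bar holds more than the poorest fifty put together, which is the same kind of comparison the graph makes visible at a glance.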

Finally, if you were teleported into the life of a random person in the UK (in a Rawlsian sort of a way), you’d lie at a uniformly random position on the x axis of this graph, and so it’s obvious that your chances of being poor are quite high and those of being very rich are very low.

Graphs are good.  People should use more graphs.
