Since I’ve been writing about statistics quite a bit (here and here), and since I happened to come across a really cute illustration on a blog in (I think?) Bosnian (yay for Google translate!), this week’s Stuff Explained Sunday will be about mean, median and mode.
So, we can see in this illustration how much each of these guys (individuals in the population) earned last month.
Now, we could say that the arithmetic mean (the plain average) wage in the population is $5,700, with the variant “look, the average worker at my company earned $5,700 last month, can the union get off my back already??”. And that is arithmetically correct.
Except this gives us a very skewed view of how most of the individual guys in the picture are actually doing: only 4 people in the group earn more than $5,700, while 20 earn less, and many of those 20 quite a lot less.
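To make the arithmetic concrete, here is a small Python sketch that computes the mean and counts how many wages fall above and below it. The wage table is an assumption: the illustration itself is not reproduced in the text, so only the $45,000, $15,000, $5,700, $3,000 and $2,000 figures come from the post, and the remaining rows are hypothetical values chosen so that the totals match the numbers quoted here.

```python
from statistics import mean

# Assumed breakdown (wage -> number of people). The $10,000, $5,000 and
# $3,700 rows are hypothetical; they are picked so that the group has
# 25 people and a mean of $5,700, as the post states.
wages_by_count = {
    45000: 1, 15000: 1, 10000: 2, 5700: 1,
    5000: 3, 3700: 4, 3000: 1, 2000: 12,
}

# Expand the table into one entry per person.
wages = [wage for wage, count in wages_by_count.items() for _ in range(count)]

avg = mean(wages)
above = sum(w > avg for w in wages)   # people earning more than the mean
below = sum(w < avg for w in wages)   # people earning less than the mean

print(f"people: {len(wages)}, mean: {avg}, above: {above}, below: {below}")
```

With this assumed table, one person earns exactly the mean, which is why the counts above and below do not add up to the full group.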
This happens because the distribution is skewed.
If we graph it, it looks like this:
And not like this:
The median is the numerical value separating the higher half of our population from the lower half (either the number in the middle, if we have an odd number of values, or the average of the two numbers in the middle, if it’s even). The guy earning $3,000 has just as many people earning less than him as he has people earning more than him: 12 on each side.
The reason why we have an average much higher than the median is that the rich guy at the top, with his $45,000, is pulling the average up. The highest earner in the group, representing 4% of the population, has earned 3 times as much as the guy right behind him, nearly 8 times as much as the average guy, 15 times as much as the median guy and 22.5 times as much as the guys at the very bottom. He earns as much as almost everyone below the average guy combined.
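The definition can be checked with a few lines of Python. As before, the wage table is an assumption (only some of its rows are quoted in the post; the rest are hypothetical filler consistent with the quoted totals). The second call shows the even-sized case on a made-up four-value list.

```python
from statistics import median

# Assumed breakdown (wage -> number of people); the middle rows are hypothetical.
wages_by_count = {
    45000: 1, 15000: 1, 10000: 2, 5700: 1,
    5000: 3, 3700: 4, 3000: 1, 2000: 12,
}
wages = [wage for wage, count in wages_by_count.items() for _ in range(count)]

# Odd number of values (25): the median is the middle one after sorting.
med = median(wages)
earning_less = sum(w < med for w in wages)
earning_more = sum(w > med for w in wages)

# Even number of values: the median is the average of the two middle ones.
even_med = median([2000, 3000, 3700, 5000])   # (3000 + 3700) / 2

print(med, earning_less, earning_more, even_med)
```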
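A quick way to see the pull of the top earner is to recompute the mean without him and to work out the ratios directly. This is only a sketch on an assumed wage table: the $10,000, $5,000 and $3,700 rows are hypothetical values consistent with the totals in the post, so the combined figure for everyone below the mean in particular depends on that assumption.

```python
from statistics import mean, median

# Assumed breakdown (wage -> number of people); the middle rows are hypothetical.
wages_by_count = {
    45000: 1, 15000: 1, 10000: 2, 5700: 1,
    5000: 3, 3700: 4, 3000: 1, 2000: 12,
}
wages = sorted(w for w, count in wages_by_count.items() for _ in range(count))

top = wages[-1]          # the highest earner
rest = wages[:-1]        # everyone else

avg = mean(wages)
avg_without_top = mean(rest)             # mean once the top earner is removed

ratio_to_second = top / rest[-1]         # versus the guy right behind him
ratio_to_mean = top / avg                # versus the mean
ratio_to_median = top / median(wages)    # versus the median
ratio_to_bottom = top / wages[0]         # versus the lowest wage

# Combined wages of everyone who earns less than the mean.
below_avg_total = sum(w for w in wages if w < avg)

print(avg, avg_without_top)
print(ratio_to_second, round(ratio_to_mean, 2), ratio_to_median, ratio_to_bottom)
print(below_avg_total)
```

Dropping a single person moves the mean by more than $1,600 under these assumptions, while the median barely changes, which is the practical meaning of "pulling the average up".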
(altered image; via)
Now, the median may be giving us a more accurate insight into what “middle income” in our population looks like; however, it doesn’t tell the whole story about how most of the people are doing. A better measure of this may be the mode: the value that appears most often in a set of data. In our case that is the income occurring most often in our population, namely $2,000, earned by 12 guys, or 48% of our population.
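The mode is just a frequency count, which the following Python sketch makes explicit. The wage table is assumed: the 12 people at $2,000 come from the post, while several of the other rows are hypothetical values consistent with the quoted totals.

```python
from collections import Counter
from statistics import mode

# Assumed breakdown (wage -> number of people); the middle rows are hypothetical.
wages_by_count = {
    45000: 1, 15000: 1, 10000: 2, 5700: 1,
    5000: 3, 3700: 4, 3000: 1, 2000: 12,
}
wages = [wage for wage, count in wages_by_count.items() for _ in range(count)]

most_common_wage = mode(wages)

# The same result by counting occurrences by hand.
wage, how_many = Counter(wages).most_common(1)[0]
share = how_many / len(wages)

print(most_common_wage, how_many, f"{share:.0%}")
```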
(altered image; via)
So, when looking at income distributions, grade distributions or what have you, should you be looking at the mean, the median or the mode?
The answer is all of them, together. Each of them summarizes the data in a different way, and how they are positioned in relation to each other may also be very relevant.
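As a closing sketch, the three summaries can be computed side by side. The wage table is again an assumption (partly hypothetical rows matching the totals quoted in the post). The ordering check at the end is a rule of thumb rather than a strict law: a mean above the median above the mode often signals a long tail of high values, as in this example.

```python
from statistics import mean, median, mode

# Assumed breakdown (wage -> number of people); the middle rows are hypothetical.
wages_by_count = {
    45000: 1, 15000: 1, 10000: 2, 5700: 1,
    5000: 3, 3700: 4, 3000: 1, 2000: 12,
}
wages = [wage for wage, count in wages_by_count.items() for _ in range(count)]

summary = {
    "mean": mean(wages),
    "median": median(wages),
    "mode": mode(wages),
}

# Rule of thumb: mean > median > mode suggests a tail of high earners.
long_high_tail = summary["mean"] > summary["median"] > summary["mode"]

print(summary, long_high_tail)
```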