Statistics. Everyone loves them, especially in the policy arena. And why not? In the pragmatic world, the measure of any policy is simply “what works”, and statistics are how we identify problem areas and measure improvements. It’d be wholly irresponsible to try to create policy without underlying evidence, and there isn’t much room for argument based on first principles; the person who argues without cold, hard numbers is just an ideologue.
Here are a few statistics:
1. Women earn 77 cents for every dollar men earn;
2. The suicide rate for post-op transgendered people is 2.5 times that of the general population;
3. The United States has 4% of the world’s population, but 25% of the world’s prison population;
and my personal favorite:
4. There are 5 times as many white victims of black violence as black victims of white violence, and twice as many whites killed by police as blacks.
These statistics, such as they are, are true. They’re also profoundly wrong, in the sense that they naturally imply beliefs that are flat-out false. Looking at each in turn:
1. Women earn 77 cents for every dollar men earn. Most people, upon hearing this, react with alarm that a supposedly egalitarian nation like the United States could have such a significant gap in pay between genders. The statistic implies that women are being systematically discriminated against, and being paid substantially less than men for the same work.
The problem here is that this number is constructed by taking the total income earned by each gender, and dividing by the number of workers of the corresponding gender. In other words, it’s aggregating the pay of art historians and engineers within each gender; if there are more female than male art historians, or more male than female engineers, the statistic will show a pay gap even if all professions pay male and female workers equally. When you include relevant factors, like education, profession, age, children, years in the workforce, etc., the gap shrinks closer to zero. Studies that control for those variables tend to find gaps on the order of 3-5%, rather than the stark 23% of the original statistic. And even this remaining gap might simply reflect poor resolution in the underlying data. For example, two people might have equivalent education – say, medical doctorates – but if one person is a pediatrician and another is a neurosurgeon, the fact that both have MDs shouldn’t imply that their pay should be equal.
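The aggregation effect can be sketched with a toy example. All of the numbers below are invented purely for illustration: two professions, each paying men and women identically, still produce an aggregate "gap" once the gender composition of the professions differs.

```python
# Hypothetical illustration with invented numbers: every profession pays
# men and women identically, yet the aggregate comparison shows a "gap"
# because the gender composition of the professions differs.

workers = (
    [("F", "art_historian", 50_000)] * 70
    + [("M", "art_historian", 50_000)] * 30
    + [("F", "engineer", 100_000)] * 30
    + [("M", "engineer", 100_000)] * 70
)

def mean_pay(gender):
    """Average salary across all professions for one gender."""
    pays = [salary for g, _, salary in workers if g == gender]
    return sum(pays) / len(pays)

ratio = mean_pay("F") / mean_pay("M")
print(f"aggregate: women earn {ratio:.0%} of what men earn")  # prints 76%
```

Despite perfectly equal pay within each profession, the aggregate comes out near the famous 77-cent figure; the "gap" here is entirely an artifact of occupational composition.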
Now, there is a possibility that gender discrimination is real and significant. After all, we haven’t really considered how things like promotion or educational/career choices enter the mix. It could be that all managers are paid at the same rate, but women aren’t being promoted because of a “good ol’ boys network”. Or perhaps firms rationally choose not to promote women because higher levels of responsibility demand higher levels of reliability, and it would be a disaster for a business if an executive left the company to raise children (which is out of the company’s hands, and therefore puts a natural market premium on male workers who are less likely to leave for personal reasons). Maybe women face a hostile culture in some professions, and shy away from them. Or perhaps women are hardwired (or culturally pressured) to choose “softer” work like education over “harder” work like manufacturing. It’s possible that women feel pressured (by cultural expectations or their spouses) to raise children and drop out of the workforce. Or, maybe they’re hardwired to prefer raising kids to working outside the home, at least relative to men.

While these are interesting and important questions, they cannot be answered simply by looking at pay gap statistics, even when those are constructed accurately and honestly. Unfair discrimination of the kind we’re looking for simply won’t appear in this kind of data, and even if it did, it’s not particularly actionable: if data indicated cultural biases against women, there’s not much an individual business could really do. If women are conditioned from a young age to self-select away from (e.g.) engineering, a firm would not be able to compensate for that bias, because it can only choose workers from within a pool of qualified candidates – which is to say, (predominantly) male engineers.
2. The suicide rate for post-op transgendered people (around 4%) is 2.5 times that of the general population (around 1.5%). The naive conclusion: gender reassignment leads to severe psychological harm that causes trans-people to commit suicide.
Except that the pre-op suicide rate has been estimated to be around 40%; that 4% rate actually represents a marked improvement for trans-people, at least as far as suicide and self-harm are concerned. The statistic’s framing against a particular baseline (the general-population suicide rate, rather than the pre-op rate) biases the interpretation.
3. The United States has 4% of the world’s population, but 25% of the world’s prison population; we have more prisoners per capita than even repressive regimes like Iran and North Korea. The natural conclusion: the US has a problem of either rampantly high crime or a horrendously overreaching criminal justice system. (The actual overall crime rate in the US turns out to be about the same as in any other developed nation. Some crime [e.g. murder] is more prevalent in the US than in other countries, while some [e.g. “hot” burglary or home invasion] is less prevalent. With that in mind, let’s assume the statistic is intended to point toward overreach in criminal justice.)
There’s no question that this is a problem. Even though crime rates in the US are about the same as in other countries, we impose much longer prison sentences than other countries for the same crimes, compounded by poorly-executed parole systems.
At the same time, one has to wonder if the US is actually worse than repressive regimes, or if those regimes A) tend to use different punishments (e.g. public floggings) in lieu of prison, or B) don’t call it prison (e.g. North Korean slave-labor “education centers”).
4. There are 5 times as many white victims of black violence as black victims of white violence, and twice as many whites killed by police as blacks. Black privilege, amirite?
Except that there are 6 times as many whites as blacks in the general population. If police killed people of each race at the same rate, we’d expect 6 of 7 victims (86%) to be white, not 2 of 3 (67%). What’s important isn’t the raw number of people killed or victimized, it’s the rate – the likelihood that an encounter with police is fatal, not the specific number of fatalities.
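The rate arithmetic can be written out directly. The population and death *ratios* below come from the text; the absolute counts are invented, since only the ratios matter for the calculation.

```python
# The article's ratios (6:1 white:black population, 2:1 white:black police
# killings) with invented absolute counts; only the ratios matter here.
white_pop, black_pop = 6_000_000, 1_000_000   # 6x as many whites
white_killed, black_killed = 200, 100         # 2x as many whites killed

white_rate = white_killed / white_pop          # deaths per capita
black_rate = black_killed / black_pop

expected_white_share = white_pop / (white_pop + black_pop)
actual_white_share = white_killed / (white_killed + black_killed)

print(f"expected white share at equal rates: {expected_white_share:.0%}")  # 86%
print(f"actual white share of deaths:        {actual_white_share:.0%}")    # 67%
print(f"black per-capita rate is {black_rate / white_rate:.0f}x the white rate")  # 3x
```

The raw counts favor the "black privilege" reading; the per-capita rates say the opposite.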
The point of all this is that we should be very careful about using statistics, and know exactly what they mean. Psychologists have long known of “self-serving bias”, the tendency to present and interpret information in ways that tend to favor us personally. Our collective attachment to numbers and (what we think constitutes) evidence makes us extremely susceptible to dishonest statistics. Even – perhaps especially! – when we’re informed and sophisticated, there is a temptation to exploit the natural human credulity toward “objective” evidence, and statistics can seem to be the most objective evidence there is.
Even the scientific literature is not immune. This was made abundantly clear recently, when “Johannes Bohannon” revealed his role in duping the public about weight-loss benefits of chocolate using the time-honored techniques of “p-hacking” and small sample sizes. Or one can look at the work of John Ioannidis, who’s made a career of demonstrating the extent to which scientific literature is unreliable.
There are dozens of ways to construct statistics toward an agenda, without resorting to bald-faced lies or fabrications. There’s p-hacking and small sample sizes, of course. But one can also fail to use good methodology (particularly experimental controls), or neglect base rates (as in the crime and suicide rate examples), or compare unlike things (as in the wage gap and prison population examples). One can use technically true numbers to support specious conclusions. One can frame statistics cleverly (don’t say that a drug kills 20% of patients, say that 80% survive the therapy!). Most people will interpret numbers either as if they mean what they seem to say, or in a way that confirms their pre-existing beliefs, but they’ll rarely take it upon themselves to judge the credibility of the source.
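The multiple-comparisons flavor of p-hacking is easy to demonstrate with a simulation. This sketch is purely illustrative (sample sizes, the number of outcomes, and the permutation-test choice are all my own assumptions): measure 20 unrelated outcomes on two groups drawn from the *same* distribution, and at a 0.05 threshold, roughly one comparison will come out “significant” by luck alone.

```python
# p-hacking sketch: test 20 outcomes on pure noise; at alpha = 0.05 we
# expect about one false "discovery" per 20 comparisons.
import random

random.seed(1)

def permutation_p(a, b, n_perm=2000):
    """Two-sided permutation test for a difference in group means."""
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        x, y = pooled[:len(a)], pooled[len(a):]
        if abs(sum(x) / len(x) - sum(y) / len(y)) >= observed:
            count += 1
    return count / n_perm

hits = 0
for outcome in range(20):
    # Both groups are drawn from the same distribution: any "effect" is noise.
    a = [random.gauss(0, 1) for _ in range(10)]
    b = [random.gauss(0, 1) for _ in range(10)]
    if permutation_p(a, b) < 0.05:
        hits += 1

print(f"{hits} of 20 pure-noise comparisons were 'significant' at p < 0.05")
```

A researcher who runs twenty such comparisons and reports only the winner has manufactured a publishable result out of nothing, without fabricating a single data point.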
This all doesn’t mean that we should ignore or dismiss statistics altogether. Rather, it means that we should be highly skeptical, recognizing that most statistics, even the technically true ones, are highly likely to be misleading. Until you know what a statistic truly means, assume that it’s offered in support of someone’s agenda, and that this agenda is rarely a dispassionate pursuit of Truth. (And, of course, don’t be That Guy – when using statistics, keep ’em as honest as possible.)