Spinning Stats
42.37% of all statistics are made up, including this one. Every news story, video and blog post (guilty as charged) uses numbers to vamp up its credibility, and the onslaught can make your head spin. However, that spinning may not just be in your head: since such figures can be tailored by each source to support its side or, worst case scenario, to mislead, we have to start distinguishing the spin from the scoop.
Robert Ménard, mayor of Béziers, caused an uproar in France with a statistic that he mentioned on a televised debate show in May: that 64.6% of the children in his region’s classrooms were Muslim, indicating an “immigration problem”. The punchline: this data was obtained by counting the names of the children attending schools in Béziers. According to Ménard, “Sorry to say it, but the first names tell us their religion. To say otherwise is to deny the evidence”. Given that collecting this kind of data is forbidden by French law, the Front National politician is now undergoing a legal process that could culminate in a five-year jail sentence or a €300,000 fine.
Now, let’s set aside the outrageous discrimination that the practice entailed and look at the mayor’s idea itself. I’m no expert on statistics, but extrapolating percentages of a religion from first names? The number itself isn’t reliable. Did the idea that a person could be called Mohammed or Fatima without being a practicing Muslim come to mind? Or that the child in question could be a French citizen? In addition, the option of homeschooling or the possibility of immigrants with no children means that Ménard’s statistical universe is off from the start.
“My second name is Moussa, after my grandfather Moussa. If I were a child in your city, would I have been put on file?”
To make matters worse, from this shaky premise came various overarching conclusions: that a) the children are Muslim, b) their French isn’t up to par and c) they indicate an immigration problem in France, as there are “too many” to be assimilated. Since this kind of data collection is illegal in France, the mayor has nothing to compare his new data to. No wonder Ménard says he’s reserving his methodological justifications for the courtroom.
All because Ménard wanted to support a statement on public television. Not a good move, I think we can agree, given its illegality. However, his faux pas isn’t an isolated incident: using numbers to boost the strength of an argument is standard in our Big Data world. This trend is potentially worrying, as the production of these numbers seems to involve fewer checkpoints, more bias and, unfortunately, more acceptance.
On the one hand, there is the danger of data being spun or tweaked to suit the person in question, which isn’t exactly new: in fact, it happens all the time. On the other, there is the possibility of data making your head spin in bemusement. That happened this week in Argentina, as president Cristina Fernández de Kirchner told the Food and Agriculture Organization of the U.N. in Rome that Argentina’s poverty rate is currently below 5%.
This prompted a wave of criticism, leading Aníbal Fernández (Cabinet Chief) to defend her statement by comparing Argentina to Germany, which has a poverty rate of 20%. That is to say that, according to the numbers, Argentina has a lower poverty rate than Germany and Denmark. This was confusing, to say the least, and it sparked ridicule and anger, especially on social media, where a mass “call to solidarity” to help Germany began alongside laments that “our government seems to think we are idiots”.
As in the case of Béziers, however, it could be more helpful to look at the numbers per se. First, it should be acknowledged that the 5% mark is an official number issued by Argentina’s official statistics agency, INDEC. Second, this data is from 2013: the INDEC hasn’t published anything on poverty in the country since. Outdated numbers are a recurring issue in Argentina, and the INDEC itself has lost a lot of credibility over the years. In fact, there’s a motion of censure by the IMF against Argentina due to its hugely inaccurate inflation and economic growth data. In cases like these, it’s always worth having a fact-checker handy.
“Merkel wonders if we have no shame, protesting against abundance while they eat stir fry”. Waves of criticism and sarcasm were ignited after the government’s comments.
The trick is also to look at the way the numbers are presented. For example, as in most of Latin America, Argentina’s poverty rate is defined by measuring household income against the cost of a basic food basket (an obsolete method, by the way). In Europe, however, the poverty line is set at 60% of the median household income. Thus, comparing the poverty rates of Argentina and Germany isn’t just uncomfortable, but impossible. However, logic like this can get lost in a noisy haze: in this case, politics, the hasty defence of an outdated number and the ensuing uproar.
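For the curious, here is a rough sketch in Python of why the two definitions can't be compared. It is only an illustration: the incomes and the basket cost are invented, the function names are my own, and real statistical agencies use far more elaborate methods (household size, regional prices and so on).

```python
from statistics import median


def absolute_poverty_rate(incomes, basket_cost):
    """Share of households whose income is below the cost of a basic basket."""
    poor = sum(1 for income in incomes if income < basket_cost)
    return poor / len(incomes)


def relative_poverty_rate(incomes, fraction=0.6):
    """Share of households whose income is below a fraction of the median."""
    line = fraction * median(incomes)
    poor = sum(1 for income in incomes if income < line)
    return poor / len(incomes)


# Hypothetical monthly household incomes, invented for illustration only.
incomes = [400, 900, 1000, 1500, 2000, 2200, 2500, 3000, 4000, 6000]
basket_cost = 500  # hypothetical cost of the basic basket

print(absolute_poverty_rate(incomes, basket_cost))
print(relative_poverty_rate(incomes))
```

With these made-up figures, the very same ten households come out as 10% poor under the basket method and 30% poor under the 60%-of-median method. Same people, same incomes, two very different headlines.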
In the end, we have to resign ourselves to the spinning nature of Big Data. Politicians (amongst others) aren’t going to stop basing their statements on, or attempting to reinforce their legacy with, shaky percentages and outrageous numbers. Data is still going to appear everywhere. We may not always feel the need to check the facts, but as the elections loom large in Argentina it might be worth trying to look beyond the noise of “lies” or “idiocy” and pay more attention to the numbers: where they came from, what they really indicate and where that should take us.