If you've read Darrell Huff's seminal 1954 book How to Lie with Statistics, you've learned an important rule of thumb: any chart whose Y-axis doesn't start at zero is cause for suspicion, if not alarm.
Matthew Hankins catalogs 500 phrases that researchers use in scientific articles to fig-leaf the fact that their results aren't statistically significant, and to hand-wave away the fact that they're publishing anyway.
Wichita State University's Beth Clarkson (who is also chief statistician of WSU's National Institute for Aviation Research) discovered "odd patterns" in Kansas electoral voting records, so she requested public docs to help her get to the bottom of things -- requests that state officials ignored, dodged, and stalled.
Peer review and replication are critical to the scientific method, but in medical trials, a combination of pharma company intransigence and scientists' fear of being pilloried for human error means that the raw data on which we base life-or-death decisions is routinely withheld, so errors lurk undetected in the data for years -- and sometimes forever.
The police will tell you that the reason they're arming up with surplus military gear and adopting a shoot-first posture on the job is that being a cop is deadly business. As the saying goes, though, you're entitled to your own opinion, but not your own facts.
In 2012, Jim Henley got tongue cancer, but it was the good kind -- his odds are like making a save-against-death throw on a D8 and needing to beat a one.
Patrick Ball and the Human Rights Data Analysis Group applied the same statistical rigor that they use in estimating the scale of atrocities and genocides for Truth and Reconciliation panels in countries like Syria and Guatemala to the problem of estimating killings by US cops, and came up with horrific conclusions.
It is no surprise that critics and viewers alike agree that The Godfather is the "best film" among the ~2600 films considered on Rotten Tomatoes, with a 100% score among professional reviewers and a 98% score from the audience. It is perhaps somewhat more surprising to learn which films divide those two groups; thanks to Benjamin Moore, we can contemplate that...
“Overrated” and “underrated” are slippery terms to try to quantify. An interesting way of looking at this, I thought, would be to compare the reviews of film critics with those of Joe Public, reasoning that a film which is roundly-lauded by the Hollywood press but proved disappointing for the real audience would be “overrated” and vice versa.
To get some data for this I turned to the most prominent review aggregator: Rotten Tomatoes...
On the whole it should be noted that critics and audience agree most of the time, as shown by the Pearson correlation coefficient between the two scores (0.71 across 1200 films). [But] using our earlier definition it’s easy to build a table of those films where the audience ended up really liking a film that was panned by critics:
Here we’re looking at those films which the critics loved, but paying audiences were then less enthused:
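The comparison Moore describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the film titles and scores below are invented placeholders, not Rotten Tomatoes data, and the helper names (`pearson`, `rank_by_gap`) are hypothetical rather than anything from Moore's own analysis. It computes the Pearson correlation between critic and audience scores, then sorts films by the gap between the two so that the "underrated" and "overrated" extremes fall out at either end.

```python
from math import sqrt

# Hypothetical (title, critic_score, audience_score) rows.
# These are made-up placeholders, NOT real Rotten Tomatoes scores.
FILMS = [
    ("Film A", 95, 90),
    ("Film B", 20, 75),
    ("Film C", 88, 45),
    ("Film D", 60, 62),
    ("Film E", 35, 30),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_by_gap(films):
    """Sort by (audience - critic): audience favorites first, critic favorites last."""
    return sorted(films, key=lambda film: film[2] - film[1], reverse=True)


critic_scores = [film[1] for film in FILMS]
audience_scores = [film[2] for film in FILMS]

r = pearson(critic_scores, audience_scores)
ranked = rank_by_gap(FILMS)

print(f"Pearson r between critics and audience: {r:.2f}")
print("Most 'underrated' (audience liked it, critics did not):", ranked[0][0])
print("Most 'overrated' (critics liked it, audience did not):", ranked[-1][0])
```

Ranking on the raw difference between the two scores is the simplest reading of the definition quoted above; a real analysis might also filter out films with very few reviews before ranking.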
The latest installment in Randall Munroe's XKCD "What If?" series is called Paint the Earth and it is amazing. One of Munroe's readers wanted to know "Has humanity produced enough paint to cover the entire land area of the Earth?" and Munroe uses this as a springboard for explaining Fermi estimation, a powerful, counter-intuitive tool that has applications in many fields.
In Frequency, the latest XKCD cartoon, Randall Munroe has assembled a grid of animated GIFs representing various events in the universe, each keyed to blink at the frequency at which it occurs in reality. As with the best of Munroe's work, it's a mix of the trenchant and the silly, and the juxtapositions are smart and provocative. There's real genius in putting "50,000 plastic bottles are produced" and "50,000 plastic bottles are recycled" next to each other, the former blinking much more often than the latter -- but the best part is "A Sagittarius named Amelia drinks a soda," just above them, mixing up the alarming and the humorous.
The other juxtapositions are just as delicious -- one birth/one death; China builds a car/Japan builds a car/Germany builds a car/US builds a car/someone else builds a car; someone buys "To Kill a Mockingbird"/someone's cat kills a mockingbird -- and so on. This being XKCD, you can be sure that Munroe has an absurdly well-thought-through process for establishing and documenting his numbers, too.
If you're the type of person who really needs some good visuals to make a concept stick in your head, this series of YouTube videos made by the British Psychological Society Media Centre will help you remember the meanings behind statistical concepts like "correlation", "frequency distributions", and "sampling error". There are four videos in the series so far, and they do a great job of painting pictures around abstract ideas. Bonus: Soothing music.
Between 1980 and 2000, a complicated war raged in Peru, pitting the country’s government against at least two political guerilla organizations, and forcing average people to band together into armed self-defense committees. The aftermath was a mess of death and confusion, where nobody knew exactly how many people had been murdered, how many had simply vanished, or who was to blame.
“The numbers had floated around between 20,000 and 30,000 people killed and disappeared,” says Daniel Manrique-Vallier. “But nobody knew what the composition was. Non-governmental organizations were estimating that 90% of the deaths were the responsibility of state agents.”
Manrique-Vallier, a post-doc in the Duke University department of statistical science, was part of a team that researched the deaths for Peru’s Truth and Reconciliation Commission. Their results were completely different from those early estimates. Published in 2003, the final report presented evidence for nearly 70,000 deaths, 30% of which could be attributed to the Peruvian government.
How do you find 40,000 extra dead bodies? How do you even start to determine which groups killed which people at a time when everybody with a gun seemed to be shooting civilians? The answers lie in statistics, data analysis, and an ongoing effort to use math to cut through the fog of war.