Last May, Dave at Euri.ca took a crack at expanding Gabriel Rossman's excellent post on spurious correlation in data. It's an important read for anyone wondering whether the core hypothesis of the Big Data movement is that every sufficiently large pile of horseshit must have a pony in it somewhere. As O'Reilly's Nat Torkington says, "Anyone who thinks it’s possible to draw truthful conclusions from data analysis without really learning statistics needs to read this."
* If good looks and smarts are distributed normally, and
* If good looks and smarts have nothing to do with each other, and
* If movie producers want both smarts and looks
* Then, by observing employed actors we’ll assume that looks and smarts have a negative correlation
* Even though we constructed this experiment with no correlation
Here’s a graph of 250 randomly generated points (with no correlation), with the red circles representing “actors who are smart and good looking enough to get a job” (looks+smarts>2), and lighter blue x’s representing “people who wanted to be actors.”
Clearly, if we look only at actors with jobs, we’ll see a negative correlation between smarts and good looks. In fact, some brilliant actors are less attractive than an average person, and some gorgeous actors are dumber than an average person. Even more interesting, though, is that if we try to rule out bias by looking at aspiring but unsuccessful actors as well, we’ll find that they exhibit a similar correlation...
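The thought experiment is easy to try for yourself. Here is a minimal sketch in Python, assuming looks and smarts are independent standard normal scores and that the hiring rule is the looks+smarts>2 cutoff from the graph. The function names (`pearson`, `simulate`), the seed, and the choice of mean 0 and standard deviation 1 are illustrative assumptions, not taken from the original posts.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to compute a correlation")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=250, threshold=2.0, seed=0):
    """Draw n aspiring actors with independent looks and smarts,
    then split them by the hiring rule looks + smarts > threshold."""
    rng = random.Random(seed)
    # Assumption: both traits are standard normal and generated independently.
    people = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    hired = [p for p in people if p[0] + p[1] > threshold]
    rejected = [p for p in people if p[0] + p[1] <= threshold]
    return {
        "everyone": pearson(*zip(*people)),
        "hired": pearson(*zip(*hired)),
        "rejected": pearson(*zip(*rejected)),
    }


if __name__ == "__main__":
    for group, r in simulate().items():
        print(f"{group:>8}: r = {r:+.2f}")
```

With only 250 people, just a couple dozen clear the cutoff, so the exact numbers bounce around from seed to seed; raising `n` makes the pattern steadier. In this setup the rejected group tends to show a negative correlation too, though usually a weaker one than the hired group.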
You’re probably polluting your statistics more than you think
(via O'Reilly Radar)