The mathematics of tabloid news

Leila Schneps and Coralie Colmez have an interesting piece at The New York Times about DNA evidence in murder trials, the mathematics of probability, and the highly publicized case of Amanda Knox. What good is remembering the math you learned in junior high? If you're a judge, it could be the difference between a guilty verdict and an acquittal.

Why "cancer clusters" are so hard to confirm

This excerpt from the new book, Read the rest

Tallest possible Lego tower height calculated

The good folks on the most-excellent BBC Radio/Open University statistical literacy programme More or Less decided to answer a year-old Reddit argument about how many Lego bricks can be vertically stacked before the bottom one collapses.

They got the OU's Dr Ian Johnston to stress-test a 2X2 Lego brick in a hydraulic testing machine, increasing the load to some 4,000 newtons, at which point the brick basically melted. Based on this, they calculated the maximum weight a 2X2 brick could bear, and thus the maximum height of a Lego tower:

The average maximum force the bricks can stand is 4,240N. That's equivalent to a mass of 432kg (950lbs). If you divide that by the mass of a single brick, which is 1.152g, then you get the grand total of bricks a single piece of Lego could support: 375,000.

So, 375,000 bricks towering 3.5km (2.17 miles) high is what it would take to break a Lego brick.
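The arithmetic in that excerpt can be checked with a short Python sketch. The force and brick mass come from the quoted text; the value of g (9.81 m/s^2) and the 9.6 mm brick height are assumptions not stated in the excerpt, so the height is only approximate.

```python
# Figures quoted in the excerpt
MAX_FORCE_N = 4240.0     # average maximum force a 2x2 brick withstood
BRICK_MASS_G = 1.152     # mass of a single 2x2 brick

# Assumptions (not given in the excerpt)
G = 9.81                 # standard gravity, m/s^2
BRICK_HEIGHT_MM = 9.6    # nominal height of one standard brick


def max_supported_mass_kg(force_n, g=G):
    """Mass whose weight equals the given force."""
    return force_n / g


def max_bricks(force_n, brick_mass_g, g=G):
    """How many bricks weigh as much as the bottom brick can bear."""
    return max_supported_mass_kg(force_n, g) * 1000.0 / brick_mass_g


def tower_height_km(n_bricks, brick_height_mm=BRICK_HEIGHT_MM):
    """Height of a single-column stack of n_bricks bricks."""
    return n_bricks * brick_height_mm / 1_000_000.0


mass_kg = max_supported_mass_kg(MAX_FORCE_N)
bricks = max_bricks(MAX_FORCE_N, BRICK_MASS_G)
height_km = tower_height_km(bricks)

print(f"supported mass: {mass_kg:.0f} kg")
print(f"bricks: {bricks:,.0f}")
print(f"tower height: {height_km:.1f} km")
```

Under these assumptions the sketch gives roughly 432 kg, about 375,000 bricks, and a tower of around 3.6 km, close to the rounded 3.5km in the excerpt.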

"That's taller than the highest mountain in Spain. It's significantly higher than Mount Olympus [tallest mountain in Greece], and it's the typical height at which people ski in the Alps," Ian Johnston says.

"So if the Greek gods wanted to build a new temple on Mount Olympus, and Mount Olympus wasn't available, they could just - but no more - do it with Lego bricks. As long as they don't jump up and down too much."

How tall can a Lego tower get?

More or Less: Opinion polling, Kevin Pietersen, and stacking Lego 30 Nov 2012 [MP3]

Everything you eat is associated with cancer, but don't worry about it

Image: Shutterstock. Fried chicken gave the model in this stock photo cancer of the double chin.

Sarah Kliff at the Washington Post digs into new research out today from The American Journal of Clinical Nutrition. She writes about correlation and causality, and how to read statistics more intelligently.


“I was constantly amazed at how often claims about associations of specific foods with cancer were made, so I wanted to examine systematically the phenomenon,” e-mails study author John Ioannidis. “I suspected that much of this literature must be wrong. What we see is that almost everything is claimed to be associated with cancer, and a large portion of these claims seem to be wrong indeed.”

Among the ingredients in question for their purported relation to cancer risk: veal, salt, pepper spice, flour, egg, bread, pork, butter, tomato, lemon, duck, onion, celery, carrot, parsley, mace, sherry, olive, mushroom, tripe, milk, cheese, coffee, bacon, sugar, lobster, potato, beef, lamb, mustard, nuts, wine, peas, corn, cinnamon, cayenne, orange, tea, rum, and raisin.

Now: combine all of them into one recipe and do the study again, I say.


Nate Silver's The Signal and The Noise


Particle physicists not yet willing to call the election for Obama

Sure, there's a 99.2% probability that he will win, but that is several standard deviations away from the 99.99995% confidence that the particle physicists would need to declare the election won. (This is satire, obviously.) Via Jennifer Ouellette.
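For the curious, the gap between the two percentages can be put in standard deviations. This is a minimal sketch that assumes a two-sided normal convention (the share of a bell curve lying within z standard deviations of the mean); physicists sometimes quote one-sided tails instead, which shifts the numbers slightly. The function name is illustrative.

```python
from statistics import NormalDist


def sigma_for_confidence(confidence):
    """Return z such that P(|Z| < z) = confidence for a standard normal Z.

    Assumes the two-sided convention; this is an illustrative choice.
    """
    tail = (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(1.0 - tail)


forecast_sigma = sigma_for_confidence(0.992)       # the 99.2% forecast
physics_sigma = sigma_for_confidence(0.9999995)    # the 99.99995% threshold

print(f"99.2% is about {forecast_sigma:.2f} sigma")
print(f"99.99995% is about {physics_sigma:.2f} sigma")
```

On this convention the forecast sits at a bit under 3 sigma, while the physicists' threshold is about 5 sigma.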

Surviving a plane crash is surprisingly common

Between 1983 and 2000, more than 95% of people involved in plane crashes survived.

Fact-checking the RIAA's claim that the number of working musicians fell by 41%

Matthew Lasar's long Ars Technica feature, "Have we lost 41 percent of our musicians? Depends on how you (the RIAA) count" does an excellent job of digging into RIAA CEO Cary Sherman's claim that the number of working musicians in the USA has declined by 41 percent. After checking the RIAA's math, Lasar finds a gigantic discrepancy between the figures they cite and the conclusions they reach. But then Lasar delves further into the underlying sources, as well as government and industry stats, and finds that basically, the number of musicians working in America may have slightly declined, but is also projected to rise.

It is worth ending this cautionary tale with a review of the BLS's own occupational handbook projection for musician/singer employment in the near future. Note that the handbook cites a much higher employment figure for both trades in 2010 than mentioned in the above tables: about 176,200 musicians and singers. That's because it comes from the Bureau's National Employment Matrix, I was told, which adds additional data sources.

Employment for musicians and singers is expected to grow by ten percent over the decade—"about as fast as the average for all occupations," the government notes:

The number of people attending musical performances, such as orchestra, opera, and rock concerts, is expected to increase from 2010 to 2020. As a result, more musicians and singers will be needed to play at these performances.

There will be additional demand for musicians to serve as session musicians and backup artists for recordings and to go on tour.


What a dead fish can teach you about neuroscience and statistics

The methodology is straightforward. You take your subject and slide them into an fMRI machine, a humongous sleek, white ring, like a donut designed by Apple. Then you show the subject images of people engaging in social activities — shopping, talking, eating dinner. You flash 48 different photos in front of your subject's eyes, and ask them to figure out what emotions the people in the photos were probably feeling. All in all, it's a pretty basic neuroscience/psychology experiment. With one catch. The "subject" is a mature Atlantic salmon.

And it is dead.

What cancer statistics actually mean

Genius science writer Ed Yong used to work for a cancer charity, so he's seen how the cancer research sausages get made. In a new post at Not Exactly Rocket Science, Ed takes you on a brief tour of the factory, explaining why even good data doesn't necessarily mean what you think it means.

The post is based around a new study that says 16.1% of all cancers worldwide are caused by infections. This statistic is talking about stuff like HPV—viruses and other infections that can prompt mutations in the cells they infect. Sometimes, those mutations propagate and become a tumor.

That statistic tells us that infections play a role in more cancers than most laypeople probably think, Ed says. It gives us an idea of the scale of the problem. But you have to be careful not to read too much into that 16.1%.

The latest paper tells us that 16.1% of cancers are attributable to infections. In 2006, a similar analysis concluded that 17.8% of cancers are attributable to infections. And in 1997, yet another study put the figure at 15.6%. If you didn’t know how the numbers were derived, you might think: Aha! A trend! The number of infection-related cancers was on the rise but then it went down again.

That’s wrong. All these studies relied on slightly different methods and different sets of data. The fact that the numbers vary tells us nothing about whether the problem of infection-related cancers has got ‘better’ or ‘worse’. (In this case, the estimates are actually pretty close, which is reassuring.)


Why water supply affects your computer

Between now and 2020, the greatest increases in population growth in the United States are projected to happen in the places that have the biggest problems with fresh water availability. This isn't just a drinking water problem, or even an agriculture problem. It's an energy issue, too. Most of our electricity is made by finding various ways to boil water, producing steam that turns a turbine in an electric generator. In 2000, we used as much fresh water to produce electricity as we used for irrigation; each sector represented 39% of our total water use. (From a poster at Lawrence Berkeley National Laboratory.)

Cybercrime sucks (for criminals)

Bruce Schneier comments on an NYT report on cybercrime that shows that there's just not much money to be had in being a ripoff artist. Dinei Florêncio and Cormac Herley wrote:

A cybercrime where profits are slim and competition is ruthless also offers simple explanations of facts that are otherwise puzzling. Credentials and stolen credit-card numbers are offered for sale at pennies on the dollar for the simple reason that they are hard to monetize. Cybercrime billionaires are hard to locate because there aren’t any. Few people know anyone who has lost substantial money because victims are far rarer than the exaggerated estimates would imply.

The authors frame cybercrime as a "tragedy of the commons," where the overfishing (overphishing) by crooks has reduced everyone's margins to nothing, making it hard graft indeed. Meanwhile, cybercrime estimates are subject to the same lobbynomics used to calculate losses from music downloading and profits from drug seizures:

Suppose we asked 5,000 people to report their cybercrime losses, which we will then extrapolate over a population of 200 million. Every dollar claimed gets multiplied by 40,000. A single individual who falsely claims $25,000 in losses adds a spurious $1 billion to the estimate. And since no one can claim negative losses, the error can't be canceled.
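The extrapolation in that passage is easy to reproduce. The sketch below uses only the numbers from the quote (5,000 respondents, a population of 200 million, one false $25,000 claim); the function names are illustrative.

```python
def extrapolation_factor(population, sample_size):
    """Each respondent stands in for this many people."""
    return population / sample_size


def inflated_estimate(claimed_loss, population, sample_size):
    """Contribution of one respondent's claim to the population-wide total."""
    return claimed_loss * extrapolation_factor(population, sample_size)


factor = extrapolation_factor(200_000_000, 5_000)
spurious = inflated_estimate(25_000, 200_000_000, 5_000)

print(f"multiplier per surveyed dollar: {factor:,.0f}")
print(f"spurious losses from one false claim: ${spurious:,.0f}")
```

Because reported losses are never negative, there is no offsetting error on the other side to pull the total back down.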

Cybercrime as a Tragedy of the Commons

Why the DHS's pre-crime biometric profiling is doomed to fail, and will doom passengers with its failures

In The Atlantic, Alexander Furnas debunks the DHS's proposal for a "precrime" screening system that will attempt to predict which passengers are likely to commit crimes, and single those people out for additional screening. FAST (Future Attribute Screening Technology) "will remotely monitor physiological and behavioral cues, like elevated heart rate, eye movement, body temperature, facial patterns, and body language, and analyze these cues algorithmically for statistical aberrance in an attempt to identify people with nefarious intentions." They'll build the biometric "bad intentions" profile by asking experimental subjects to carry out bad deeds and monitoring their vital signs. It's a mess, scientifically, and it will falsely accuse millions of innocent people of planning terrorist attacks.

First, predictive software of this kind is undermined by a simple statistical problem known as the false-positive paradox. Any system designed to spot terrorists before they commit an act of terrorism is, necessarily, looking for a needle in a haystack. As the adage would suggest, it turns out that this is an incredibly difficult thing to do. Here is why: let's assume for a moment that 1 in 1,000,000 people is a terrorist about to commit a crime. Terrorists are actually probably much much more rare, or we would have a whole lot more acts of terrorism, given the daily throughput of the global transportation system. Now let's imagine the FAST algorithm correctly classifies 99.99 percent of observations -- an incredibly high rate of accuracy for any big data-based predictive model. Even with this unbelievable level of accuracy, the system would still falsely accuse 99 people of being terrorists for every one terrorist it finds.
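The false-positive paradox described there can be worked through in a few lines of Python. This sketch uses the quote's figures (1 terrorist per 1,000,000 people, 99.99 percent accuracy) and assumes that the same accuracy applies both to spotting real terrorists and to clearing innocent people; the function name is illustrative.

```python
def screening_outcomes(population, prevalence, accuracy):
    """Expected true and false positives for a screening system.

    Assumes `accuracy` is both the detection rate for real targets and
    the rate at which innocent people are correctly cleared.
    """
    targets = population * prevalence
    innocents = population - targets
    true_positives = targets * accuracy
    false_positives = innocents * (1.0 - accuracy)
    return true_positives, false_positives


tp, fp = screening_outcomes(1_000_000, 1 / 1_000_000, 0.9999)
false_per_true = fp / tp
share_correct = tp / (tp + fp)

print(f"false accusations per real terrorist: {false_per_true:.0f}")
print(f"share of flagged people who are terrorists: {share_correct:.2%}")
```

Under these assumptions the system flags roughly 100 innocent people for every real terrorist, in line with the figure in the quote, so about 99 percent of the people it singles out are innocent.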


Danish trade minister and ACTA booster apologise for bogus piracy numbers

Here's a clip of a Danish TV show discussing ACTA, which Denmark has fiercely advocated. It starts with the head of a rightsholder society and the Danish trade minister quoting dodgy statistics about the extent and cost of piracy, then demonstrates that these statistics are patently false, and finally brings out those responsible for quoting them and gets them to admit their errors. Priceless.

You can see both the Danish Trade Minister and the head of a Danish music rights organization (and famous Danish musician) Ivan Pedersen appear on a TV show below (with English subtitles). On the show, a well-informed presenter focuses on how both of these ACTA defenders claimed that 95% of music downloaded in Denmark was unauthorized, and carefully shows how that's simply false -- and then gets both of the ACTA defenders to admit that the numbers were wrong.

Danish Trade Minister Apologizes For Using Bogus Industry Numbers To Support Pro-ACTA Argument

Facebook's funny accounting has "active users" who never use Facebook

The NYT's Andrew Ross Sorkin quotes Barry Ritholtz's digging into how Facebook's IPO documents define "active" users and finds that many of them may never visit the site. Facebook counts you as "active" if your only involvement with the service is setting it up to republish your Twitter feed, or if you click "Like" buttons but never log in to the actual service. This should matter to investors, since Facebook earns no advertising revenue from those users, though it may earn some other income by reselling the private details of their browsing habits as gleaned from its tracking cookies.

In other words, every time you press the “Like” button on, for example, you’re an “active user” of Facebook. Perhaps you share a Twitter message on your Facebook account? That would make you an active Facebook user, too. Have you ever shared music on Spotify with a friend? You’re an active Facebook user. If you’ve logged into Huffington Post using your Facebook account and left a comment on the site — and your comment was automatically shared on Facebook — you, too, are an “active user” even though you’ve never actually spent any time on

“Think of what this means in terms of monetizing their ‘daily users,’ ” Barry Ritholtz, the chief executive and director for equity research for Fusion IQ, wrote on his blog. “If they click a ‘like’ button but do not go to Facebook that day, they cannot be marketed to, they do not see any advertising, they cannot be sold any goods or services.”

