Stats-based response to UK Tories' call for social media terrorism policing

David Cameron wants social media companies to invent a terrorism-detection algorithm and send all the "bad guys" it detects to the police — but this will fall prey to the well-known (to statisticians) "paradox of the false positive," producing tens of thousands of false leads that will drown the cops.

And that's just the tip of the iceberg — the problems implicit in using Kafkaesque algorithms to determine guilt are myriad (and well-explained in this Guardian piece by James Ball).

Data strategist Duncan Ross set out what would happen if someone could create an algorithm that correctly identified a terrorist from their communications 99.9% of the time — far, far more accurate than any real algorithm — and that wrongly flagged non-terrorists only 0.1% of the time, with the assumption that there were 100 terrorists in the UK.

The algorithm would correctly identify the 100 terrorists. But it would also misidentify 0.1% of the UK's non-terrorists as terrorists: that's a further 60,000 people, roughly 600 innocent people flagged for every real terrorist, leaving the authorities with a still-huge problem on their hands. Given that Facebook is not merely dealing with the UK's 60 million population, but rather a billion users sending 1.4bn messages, that's an Everest-sized haystack for security services to trawl.
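Ross's arithmetic is a textbook instance of the base-rate fallacy, and it's easy to check. A minimal sketch (the variable names are mine; the 60 million population, 100 terrorists, and 99.9% accuracy figures are from the piece):

```python
# Base-rate fallacy: even a 99.9%-accurate detector is swamped by
# false positives when the condition being detected is very rare.

population = 60_000_000   # UK population (approximate)
terrorists = 100          # assumed number of terrorists
sensitivity = 0.999       # chance a real terrorist is flagged
false_positive_rate = 0.001  # chance an innocent person is flagged

true_positives = terrorists * sensitivity
false_positives = (population - terrorists) * false_positive_rate

# Precision: of everyone flagged, what fraction are actual terrorists?
precision = true_positives / (true_positives + false_positives)

print(f"Innocent people flagged: {false_positives:,.0f}")
print(f"Chance a flagged person is a terrorist: {precision:.2%}")
```

Under these assumptions the detector flags roughly 60,000 innocent people, and a flagged individual has well under a 1% chance of actually being a terrorist, which is the "paradox of the false positive" in a nutshell.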

'You're the bomb!' Are you at risk from the anti-terrorism algorithms? [James Ball/The Guardian]

(Image: Haystacks, John Pavelka, CC-BY)