Big Data's religious faith denies the reality of failed promises, privacy Chernobyls

Cory Doctorow 7:10 am Wed Oct 7, 2015

Maciej Ceglowski (previously) spoke at O'Reilly's Strata Big Data conference this month about the toxicity of data: the fact that data collected is likely to leak, and that data leaks resemble nuclear leaks in that even the "dilute" data (metadata, or lightly contaminated boiler suits and tools) are still deadly when enough of them leak out (I've been using this metaphor since 2008).

Ceglowski also raises a critical point: Big Data has not lived up to its promises, especially in life sciences, where we were promised that deep analysis of data would yield new science, a promise that has spectacularly failed to materialise. What's more, the factors that confound Big Data in life science are also at play in other domains, including the business domains where so much energy has been expended.

The key point is that people react to manipulation through Big Data: when you optimize a system to get people to behave in ways they don't want to (to spend more money, to click links they aren't interested in, etc.), people adapt to your interventions and regress to the mean.

Big Data's advocates believe that all this can be solved with more Big Data. This requires them to deny the privacy harms from collecting (and, inevitably, leaking) our personal information, and to assert without evidence that they can massage the data so that it can't be associated with the humans from whom it was extracted.

As Ceglowski puts it, 'people speak of the "data driven organization" with the same religious fervor as a "Christ-centered life".'

This has been a bitter pill to swallow for the pharmacological industry. They bought in to the idea of big data very early on.

The growing fear is that the data-driven approach is inherently a poor fit for life science. In the world of computers, we learn to avoid certain kinds of complexity, because they make our systems impossible to reason about.

But Nature is full of self-modifying, interlocking systems, with interdependent variables you can't isolate. In these vast data spaces, directed iterative search performs better than any amount of data mining.

My contention is that many of you doing data analysis on the real world will run into similar obstacles, hopefully not at the same cost as pharmacology.

The ultimate self-modifying, adaptive system is any system that involves people. In other words, the kind of thing most of you are trying to model. Once you're dealing with human behavior, models go out the window, because people will react to what you do.

In Soviet times, there was the old anecdote about a nail factory. In the first year of the Five-Year Plan, they were evaluated by how many nails they could produce, so they made hundreds of millions of uselessly tiny nails.

Haunted By Data [Maciej Ceglowski/Idle Words]

(via O'Reilly Radar)