You searched for re-identification

Cory Doctorow 1:13 pm Wed Oct 2, 2019

Researchers think that adversarial examples could help us maintain privacy from machine learning systems

Machine learning systems are pretty good at finding hidden correlations in data and using them to infer potentially compromising information about the people who generate that data: for example, researchers fed an ML system a bunch of Google Play reviews by reviewers whose locations were explicitly given in their Google Plus profiles; based on this, the model was able to predict the locations of other Google Play reviewers with about 44% accuracy.

Cory Doctorow 12:21 pm Thu Sep 5, 2019

Google releases a free/open differential privacy library

"Differential privacy" is a promising, complicated statistical method for analyzing data while preventing reidentification attacks that de-anonymize people in aggregated data-sets.
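To give a feel for the idea, here is a minimal sketch of one textbook differential privacy technique, the Laplace mechanism, applied to a counting query. It is not Google's library or its API; the function names, the sample ages and the choice of epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # is Laplace-distributed with location 0 and that same scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is the
    standard calibration for epsilon-differential privacy.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 62, 57, 38]  # made-up sample data
    # How many people are 40 or older? The answer is perturbed on every call.
    print(private_count(ages, lambda a: a >= 40, epsilon=1.0))
```

A smaller epsilon means more noise and stronger privacy; a larger one means answers closer to the truth. Every repeated query spends more of the privacy budget, which is part of what makes real deployments complicated.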

Cory Doctorow 8:23 am Wed Jul 24, 2019

A generalized method for re-identifying people in "anonymized" data-sets

"Anonymized data" is one of those holy grails, like "healthy ice-cream" or "selectively breakable crypto" — if "anonymized data" is a thing, then companies can monetize their surveillance dossiers on us by selling them to all comers, without putting us at risk or putting themselves in legal jeopardy (to say nothing of the benefits to science and research of being able to do large-scale data analyses and then publish them along with the underlying data for peer review without posing a risk to the people in the data-set, AKA "release and forget").
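For a sense of why "release and forget" is so hard, here is a minimal sketch of a simple linkage attack, the most basic kind of re-identification. It is not the generalized method from the paper; the column names (`zip`, `birth_year`, `sex`, `name`) and the records are hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifiers: fields that are not names, but that
# together can single out one person.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(row):
    return tuple(row[c] for c in QUASI_IDENTIFIERS)


def reidentify(anonymized, public):
    """Link named public records to "anonymized" rows.

    A match is reported only when the quasi-identifier combination is
    unique in both data-sets, so each link is unambiguous.
    """
    anon_counts = Counter(key(r) for r in anonymized)
    public_counts = Counter(key(p) for p in public)
    anon_by_key = {key(r): r for r in anonymized}
    matches = {}
    for person in public:
        k = key(person)
        if anon_counts[k] == 1 and public_counts[k] == 1:
            matches[person["name"]] = anon_by_key[k]
    return matches


if __name__ == "__main__":
    anonymized = [
        {"zip": "02138", "birth_year": 1945, "sex": "M", "diagnosis": "A"},
        {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "B"},
        {"zip": "02139", "birth_year": 1980, "sex": "F", "diagnosis": "C"},
    ]
    public = [
        {"name": "Person One", "zip": "02138", "birth_year": 1945, "sex": "M"},
        {"name": "Person Two", "zip": "02139", "birth_year": 1980, "sex": "F"},
    ]
    print(reidentify(anonymized, public))
```

Only the first person is linked here, because two anonymized rows share the second person's attributes. The more columns a release contains, the fewer people get that kind of cover.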

Cory Doctorow 10:18 am Fri Jun 14, 2019

Hong Kong's #612strike uprising is alive to surveillance threats, but its countermeasures are woefully inadequate

The millions of Hong Kong people participating in the #612strike uprising are justifiably worried about state retaliation, given the violent crackdowns on earlier uprisings like the Umbrella Revolution and Occupy Central; they're also justifiably worried that they will be punished after the fact.

Cory Doctorow 9:27 am Fri Jun 1, 2018

The most interesting thing about the "Thanksgiving Effect" study is what it tells us about the limits of data anonymization

Late last year, a pair of economists released an interesting paper that used mobile location data to estimate the likelihood that political polarization had shortened family Thanksgiving dinners in 2016.

Cory Doctorow 8:59 am Thu Feb 1, 2018

An incredibly important paper on whether data can ever be "anonymized" and how we should handle release of large data-sets

Even the most stringent privacy rules have massive loopholes: they all allow for free distribution of "de-identified" or "anonymized" data that is deemed to be harmless because it has been subjected to some process.

Cory Doctorow 8:35 am Mon Jan 29, 2018

Fitness app releases data-set that reveals the location of sensitive military bases, patrol routes, aircrew flightpaths, and individual soldiers' jogging routes

Strava is a popular fitness route-tracker focused on sharing the maps of your workouts with others; last November, the company released an "anonymized" data-set of over 3 trillion GPS points, and over the weekend, Institute for United Conflict Analysts co-founder Nathan Ruser started a Twitter thread pointing out the sensitive locations and details revealed by the release.

Cory Doctorow 6:01 am Thu Dec 21, 2017

The Australian health authority believed it had "anonymised" a data-set of patient histories, but academics were easily able to unscramble it

The Australian government's open data initiative is in the laudable business of publishing publicly accessible data about the government's actions and spending, in order to help scholars, businesses and officials understand and improve its processes.

Cory Doctorow 8:18 am Wed Aug 2, 2017

Reidentification attack reveals German judge's porn-browsing habits

In their Defcon 25 presentation, "Dark Data", journalist Svea Eckert and data scientist Andreas Dewes described how easy it was to get a massive trove of "anonymized" browsing habits (collected by browser plugins) and then re-identify the people in the data-set, discovering (among other things) the porn-browsing habits of a German judge and the medication regime of a German MP.

Cory Doctorow 8:47 am Fri May 27, 2016

Study shows detailed, compromising inferences can be readily made with metadata

In "Evaluating the privacy properties of telephone metadata", a paper by researchers from Stanford's departments of Law and Computer Science published in Proceedings of the National Academy of Sciences, the authors analyzed metadata from six months' worth of volunteers' phone logs to see what kind of compromising information they could extract from them.

Cory Doctorow 9:14 am Wed Oct 28, 2015

Mobile carriers make $24B/year selling your secrets

The largest carriers in the world partner with companies like SAP to package up data on your movements, social graph and wake/sleep patterns and sell it to marketing firms.

Cory Doctorow 8:32 am Tue Sep 15, 2015

Postcapitalism: A Guide to Our Future

Economist Paul Mason's blockbuster manifesto Postcapitalism suggests that markets just can't organize products whose major input isn't labor or material, but information, and that means that, for the first time in history, it's conceivable that we can have a society based on abundance.

Cory Doctorow 10:49 am Wed Jul 9, 2014

Big Data should not be a faith-based initiative

Cory Doctorow summarizes the problem with the idea that sensitive personal information can be removed responsibly from big data: computer scientists are pretty sure that's impossible.