An incredibly important paper on whether data can ever be "anonymized" and how we should handle release of large data-sets

Even the most stringent privacy rules have massive loopholes: they all allow for free distribution of "de-identified" or "anonymized" data that is deemed to be harmless because it has been subjected to some process.

Fitness app releases data-set that reveals the location of sensitive military bases, patrol routes, aircrew flightpaths, and individual soldiers' jogging routes

Strava is a popular fitness route-tracker focused on sharing the maps of your workouts with others; last November, the company released an "anonymized" data-set of over 3 trillion GPS points, and over the weekend, Institute for United Conflict Analysts co-founder Nathan Ruser started a Twitter thread pointing out the sensitive locations and details revealed by the release.

The Australian health authority believed it had "anonymised" a data-set of patient histories, but academics were easily able to unscramble it

The Australian government's open data initiative is in the laudable business of publishing publicly accessible data about the government's actions and spending, in order to help scholars, businesses and officials understand and improve its processes.

Case study of LAPD and Palantir's predictive policing tool: same corruption; new, empirical respectability

UT Austin sociologist Sarah Brayne spent 2.5 years conducting field research with the LAPD as they rolled out Predpol, a software tool that is supposed to direct police to places where crime is likely to occur, but which has been shown to send cops out to overpolice brown and poor people at the expense of actual crimefighting.

A (flawed) troll-detection tool maps America's most and least toxic places

The Perspective API is a tool from Google spinoff Jigsaw that automatically rates comments for their "toxicity" -- a fraught business that catches a lot of dolphins in its tuna net.

Big data + private health insurance = game over

Once big data systems agglomerate enough data about you to predict whether you are likely to get sick or badly injured, insurers will be able to deny coverage (or charge so much for it that it amounts to the same thing) to anyone who is likely to get sick, forcing everyone who might ever need insurance into medical bankruptcy, and turning Medicaid into a giant "high-risk pool" that taxpayers foot the bill for.

Algorithms try to channel us into repeating our lives

Molly Sauter describes in gorgeous, evocative terms how the algorithms in our life try to funnel us into acting the way we always have, or, failing that, like everyone else does.

Business is booming for the surveillance state

Surveillance companies like Axon hope to turn every law enforcement officer into a data-gathering drone for a bodycam surveillance database they privately control. Now ShotSpotter, a listening technology that triangulates gunfire in "urban, high-crime areas," announced a planned IPO.

The next iteration of Alexa is designed to watch you while you get dressed

The Echo Look is the next version of the Alexa appliance: it has a camera hooked up to a computer vision system, along with its always-on mic, and the first application for it is to watch you as you dress and give you fashion advice (that is, recommend clothes you can order from Amazon).

"One price to all" has been the default since 1840, but online retail is sneakily killing it off

Since the earliest days of ecommerce, analysts have predicted that retailers would use their estimations of their customers' willingness to pay to invisibly, instantaneously reprice their goods, offering different prices to each customer.

Ethics and AI: all models are wrong, some are useful, and some of those are good

The old stats adage goes: "All models are wrong, but some models are useful." In this 35-minute presentation from the O'Reilly Open Data Science Conference, data ethicist Abe Gong from Aspire Health provides a nuanced, meaningful, accessible and eminently actionable overview of the ways that ethical considerations can be incorporated into the design of powerful algorithms.

Breitbart was a unique driver of hyper-partisan, trumpist news that shifted the 2016 election

A team of esteemed scholars including Yochai "Wealth of Networks" Benkler and Ethan Zuckerman (co-founder of Global Voices) analyzed 1.25 million media stories published between April 1, 2015 and election day, finding "a right-wing media network anchored around Breitbart developed as a distinct and insulated media system, using social media as a backbone to transmit a hyper-partisan perspective to the world."

Wearing an activity tracker gives insurance companies the data they need to discriminate against people like you

Many insurers offer breaks to people who wear activity trackers that gather data on them; as Cathy "Mathbabe" O'Neil points out, the allegedly "anonymized" data-collection is trivial to re-identify (so this data might be used against you), and, more broadly, the real business model for this data isn't improving your health outcomes -- it's dividing the world into high-risk and low-risk people, so insurers can charge people more.

How to find out what Trump's favorite big data machine knows about you

Cambridge Analytica is a dirty, Dementor-focused big data research outfit that provided the analysis and psych profiles that the Trump campaign used in its electioneering; because its parent company is in the UK, it is required (under EU law) to send you its dossier on you for £10.

Trump's big data "secret sauce" sorcery - a much-needed reality check

An article that went viral last week attributed Trump's Electoral College victory to the dark big data sorcery of Cambridge Analytica, a dirty, dementor-focused big data company that specializes in political campaigns.

Automated book-culling software drives librarians to create fake patrons to "check out" endangered titles

Two employees at the East Lake County Library created a fictional patron called Chuck Finley -- entering fake driver's license and address details into the library system -- and then used the account to check out 2,361 books over nine months in 2016, in order to trick the system into believing that the books they loved were being circulated to the library's patrons, thus rescuing the books from automated purges of low-popularity titles.

Why the FBI would be nuts to try to use chatbots to flush out terrorists online

Social scientist/cybersecurity expert Susan Landau and Cathy "Weapons of Math Destruction" O'Neil take to Lawfare to explain why it would be a dangerous mistake for the FBI to use machine learning-based chatbots to flush out potential terrorists online.
