Beyond GIGO: how "predictive policing" launders racism, corruption and bias to make them seem empirical

"Predictive policing" is the idea that you can feed crime stats to a machine-learning system and it will produce a model that can predict crime. It is garbage.

It's garbage for a lot of reasons. For one thing, you only find crime where you look for it: if you send out the cops to frisk all the Black people in a city, they will produce statistics that suggest that all concealed weapons and drugs are carried by Black people, in Black neighborhoods.

But once you feed that biased data to an algorithm, the predictions it reaches acquire a veneer of empirical respectability, as the mathematical oracle tells you where the crime will be based on calculations that can never be fully understood and so cannot be interrogated, let alone objected to.

Some of the dirtiest police forces in America have bought predictive policing tools, often in secret.

The iron law of computing states: "Garbage in, garbage out," but predictive policing is worse than mere GIGO: it produces computer-generated marching orders requiring cops to continue the bad practices that produced the bad data used to create the bad model.
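To make that loop concrete, here is a toy sketch in Python. It is not drawn from any real predictive policing product or from the paper discussed below, and every number and name in it is an assumption chosen for illustration: two hypothetical neighborhoods, A and B, with identical true offense rates, a "model" that simply flags whichever one has more recorded incidents, and a department that sends most of its patrols to the flagged area. The only thing that differs between the neighborhoods is the skewed starting data.

```python
# Toy feedback-loop sketch. All constants are illustrative assumptions,
# not measurements of any real city or system.
PATROLS_TOTAL = 100      # patrols available each round
HOTSPOT_PATROLS = 80     # patrols sent to whichever area the "model" flags
PATROLS_PER_FIND = 10    # same hit rate everywhere: 1 recorded incident per 10 patrols


def simulate_feedback(rounds, start_a=60, start_b=40):
    """Return the recorded-incident counts (a, b) after each round.

    Both neighborhoods have the same true offense rate; only the
    starting records (produced by earlier, biased policing) differ.
    """
    a, b = start_a, start_b
    history = [(a, b)]
    for _ in range(rounds):
        # The "model": flag the area with more recorded incidents so far.
        # A tie goes to A, which is itself an arbitrary choice.
        if a >= b:
            patrols_a = HOTSPOT_PATROLS
        else:
            patrols_a = PATROLS_TOTAL - HOTSPOT_PATROLS
        patrols_b = PATROLS_TOTAL - patrols_a
        # Police record incidents only where they patrol.
        a += patrols_a // PATROLS_PER_FIND
        b += patrols_b // PATROLS_PER_FIND
        history.append((a, b))
    return history


if __name__ == "__main__":
    for round_no, (a, b) in enumerate(simulate_feedback(5)):
        print(f"round {round_no}: A={a} B={b} share_A={a / (a + b):.1%}")
```

Under these assumptions, five rounds take the records from 60 and 40 to 100 and 50: the gap widens every round even though nothing about the neighborhoods themselves differs. Swap the starting numbers and B becomes the "high-crime" area instead.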

In a forthcoming paper for The New York University Law Review, Rashida Richardson (AI Now Institute), Jason Schultz (NYU Law) and Kate Crawford (AI Now) describe at least thirteen cities whose police departments experimented with predictive policing after being placed under federal investigations or consent decrees for "corrupt, racially biased, or otherwise illegal police practices."

That means that in at least 13 cities, the data that cops were feeding to the predictive policing systems had been generated by practices known to be biased.

In our research, we examine the implications of using dirty data with predictive policing, and look at jurisdictions that (1) have utilized predictive policing systems and (2) have done so while under government commission investigations or federal court monitored settlements, consent decrees, or memoranda of agreement stemming from corrupt, racially biased, or otherwise illegal policing practices. In particular, we examine the link between unlawful and biased police practices and the data used to train or implement these systems across thirteen case studies. We highlight three of these: (1) Chicago, an example of where dirty data was ingested directly into the city’s predictive system; (2) New Orleans, an example where the extensive evidence of dirty policing practices suggests an extremely high risk that dirty data was or will be used in any predictive policing application, and (3) Maricopa County where despite extensive evidence of dirty policing practices, lack of transparency and public accountability surrounding predictive policing inhibits the public from assessing the risks of dirty data within such systems. The implications of these findings have widespread ramifications for predictive policing writ large. Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed, biased, and unlawful predictions which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system. Thus, for any jurisdiction where police have been found to engage in such practices, the use of predictive policing in any context must be treated with skepticism and mechanisms for the public to examine and reject such systems are imperative.

Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice [Rashida Richardson, Jason Schultz and Kate Crawford/The New York University Law Review]

(Thanks, Jason!)

(Images: Scrapyard, CC-0; HAL 9000, Cryteria, CC-BY; Processor, 502designs, CC-0)