Predpol is a "predictive policing" company that sells police forces predictive analytics tools that take in police data about crimes and arrests and spit out guesses about where the police should go to find future crimes.
Predpol has drawn sharp criticism for algorithmic discrimination, in which data from racist policing practices are laundered through an algorithm that gives them the veneer of empirical impartiality: feeding faulty data to a predictive algorithm produces faulty analysis. "Garbage in, garbage out" is an iron law of computing that has not been repealed by machine learning techniques.
Even as Predpol and its competitors, like Palantir, have expanded their operations, concerned citizens have successfully pushed for local laws requiring cities to engage in public consultation before procuring services that feed private corporations policing and surveillance data in order to direct policing operations. However, in most markets, Predpol and its competitors operate in obscurity. The cop who stopped you this week (or who didn't come to your neighborhood at all) might have been acting on orders from an AI oracle that Predpol provided to your local police, and Predpol's tax-funded revenues are a closely guarded secret.
An anonymous security researcher recently contacted me with what may be a list of Predpol's customers. This researcher had seen that Predpol assigns easy-to-guess subdomains to each Predpol customer, in the form of CITYNAME.predpol.com, for example, baltimore.predpol.com.
This researcher wrote a script that combined the name of every US city and town with ".predpol.com" and checked to see whether each resulting domain existed. The full list of cities that had Predpol domains is both short and confusing:
Many of these cities have already publicly disclosed that they are using Predpol's services (Baltimore, MD; Pleasanton, CA; Modesto, CA; Tacoma, WA; El Monte, CA; Elgin, IL; Livermore, CA; Reading, PA; Merced, CA and Haverhill, MA).
Two of the remaining domains are easy to understand: berkeley.predpol.com refers to the UC Berkeley campus police, who have purchased Predpol's services (Predpol's Board of Directors includes Tom Jorde, Professor Emeritus of Law at UC Berkeley). Frederick, MD cheerfully admits that it is a Predpol customer.
The remainder are a mystery, though. None of the police departments for any of the US cities called Long Beach or Albany (there are several!) admit to using Predpol's services. The press officer for the Indianapolis, IN police department was definitive that his department wasn't a Predpol customer. I left several messages for the press officer for the South Jordan, UT police, but never heard back. The "hollywood.predpol.com" domain seems to refer to Hollywood, CA, which is under LAPD jurisdiction (the LAPD has a publicly disclosed relationship with Predpol).
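For readers curious about the mechanics, the enumeration the researcher described (gluing city names onto ".predpol.com" and checking whether each name exists) might look something like the following Python sketch. This is not the researcher's actual script, which I have not reproduced here. The function names, the rule for turning a city name into a subdomain label, and the use of a plain DNS lookup as the "does it exist" test are all assumptions made for illustration.

```python
import socket
from typing import Callable, Iterable, List


def candidate_hostname(city: str, base_domain: str = "predpol.com") -> str:
    """Build a guessed hostname from a city name.

    Assumption: the label is the city name lowercased with everything
    except letters and digits removed. The real naming rule is unknown.
    """
    label = "".join(ch for ch in city.lower() if ch.isalnum())
    return f"{label}.{base_domain}"


def resolves(hostname: str) -> bool:
    """Return True if a DNS lookup for the hostname succeeds."""
    try:
        socket.gethostbyname(hostname)
        return True
    except (OSError, UnicodeError):
        # OSError covers failed lookups; UnicodeError covers malformed labels.
        return False


def find_existing(
    cities: Iterable[str],
    exists: Callable[[str], bool] = resolves,
) -> List[str]:
    """Return the guessed hostnames for which the `exists` check passes."""
    return [host for host in map(candidate_hostname, cities) if exists(host)]
```

Calling `find_existing(["Baltimore", "Springfield"])` would perform one DNS lookup per city. A name that resolves is only weak evidence: a wildcard DNS record would make every guess appear to exist, and a subdomain can exist without any contract behind it, which may be part of why the list is so confusing.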
Predpol itself was tight-lipped in the extreme: they initially ignored all press requests, then sent a terse "neither confirm nor deny" response to my questions about this list. I warned them repeatedly that I would be making these domains public and asked them to ensure that the login forms require strong logins and passwords, to avoid exposing sensitive policing data. They wouldn't even confirm whether the forms were secure.
The list raises more questions than it answers. Does Predpol really have fewer than two dozen customers in the USA? What are we to make of the cities that have subdomains but have never procured services from Predpol?
Predpol sells publicly funded policing organizations services that make predictions about where crime will occur. Everything about this process is a secret: which police departments procure Predpol's services, what data they provide Predpol with, how Predpol arrives at its oracular pronouncements, and how much public money Predpol receives for this service.
What we do know is that policing predictions are a self-fulfilling prophecy: if the police ask everyone on a given corner to turn out their pockets, or stop and search every car going down a certain road, they will, eventually, find crime. What we don't know is whether Predpol's predictions are better at finding crime than random chance.
We also know that machine learning predictions are no better than the data used to generate their models. If the data used to train Predpol's models come from biased policing, then Predpol's predictions will be biased, too, but their machine-learning pedigree is a kind of empirical facewash that makes them seem "scientific."
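To make the self-fulfilling prophecy concrete, here is a deliberately crude toy model in Python. It is not Predpol's algorithm, whose workings are secret. It assumes two hypothetical neighborhoods with identical underlying crime, a "prediction" that simply sends every patrol to whichever neighborhood has the most recorded crime, and a rule that crime is only recorded where police are looking. Every name and number in it is invented for illustration.

```python
from typing import Dict


def simulate(
    recorded: Dict[str, int],
    finds_per_round: Dict[str, int],
    rounds: int,
) -> Dict[str, int]:
    """Run a toy feedback loop between recorded crime and patrol placement.

    recorded: crimes on record per area before the loop starts.
    finds_per_round: crimes police would find in an area if they patrol it
        for one round (the hypothetical "true" level of findable crime).
    """
    recorded = dict(recorded)  # copy so the caller's data is untouched
    for _ in range(rounds):
        # "Prediction": patrol the area with the most recorded crime.
        hotspot = max(recorded, key=recorded.get)
        # Crime is only added to the record where the patrols went.
        recorded[hotspot] += finds_per_round[hotspot]
    return recorded


# Identical findable crime, but area A starts with one extra record.
history = {"A": 11, "B": 10}
same_crime = {"A": 10, "B": 10}
print(simulate(history, same_crime, rounds=5))  # {'A': 61, 'B': 10}
```

In this sketch, a one-record head start is enough for area A to absorb every patrol and every new record, even though both areas have the same amount of crime to find. The output looks like confirmation of the prediction, but it mostly reflects where the police were sent.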
(Image: Wapcaplet, CC-BY-SA)