Algorithmic risk-assessment: hiding racism behind "empirical" black boxes

Courts around America and the world increasingly rely on risk-assessment software in determining bail and sentencing. The systems require the accused to answer more than a hundred questions, which are fed into a secret model that spits out a single "risk score" between one and ten; courts use that number to decide who to lock up, and for how long.

The data used to train these models are a trade secret, closely held by the highly profitable companies that sell this service to courts. Unlike the rigorous machine learning deployed by companies like Amazon, which retrain their models every time they fail (if Amazon shows you an ad it hopes will entice you to buy, and you don't, it adjusts its model to account for that fact), these risk models have no formal system for gathering data about their predictions, checking whether those predictions came true, and refining the model accordingly.
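
To make the contrast concrete, here is a minimal Python sketch of the feedback loop the Amazon example describes: log every prediction, then nudge the model once the real outcome arrives. Everything in it is hypothetical (the class, the method names, the toy logistic-regression update); the point is simply that closing the loop requires recording outcomes, which is the step these vendors apparently skip.

```python
import math

class FeedbackLoopModel:
    """Toy online logistic-regression model that, unlike the closed risk-score
    systems described above, records every prediction and corrects itself once
    the real outcome is known. Purely illustrative."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.lr = learning_rate
        self.pending = {}  # prediction_id -> (features, predicted probability)

    def predict(self, prediction_id, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        p = 1.0 / (1.0 + math.exp(-z))
        self.pending[prediction_id] = (features, p)  # log the prediction for later audit
        return p

    def record_outcome(self, prediction_id, outcome):
        """When the real outcome (0 or 1) arrives, take one gradient step of
        log-loss so future predictions account for this miss or hit."""
        features, p = self.pending.pop(prediction_id)
        error = p - outcome  # gradient of log-loss with respect to the logit
        self.bias -= self.lr * error
        self.weights = [w - self.lr * error * x for w, x in zip(self.weights, features)]


# Usage: predict, observe, update; the loop repeats for every case.
model = FeedbackLoopModel(n_features=2)
model.predict("ad-123", [1.0, 0.0])  # e.g. "will this user buy?"
model.record_outcome("ad-123", 0)    # they didn't, so the model adjusts
```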

The predictions themselves can only be described as racist. Though the questions given to subjects don't directly deal with race, they do address many of the correlates of race, such as poverty and acquaintance with, or relation to, people who have police records. The scores that the systems spit out — a single number between one and ten — recommend harsher sentences for brown people than for white people.
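
A toy simulation shows how that works in practice. Everything below is invented (the question names, the weights, the population probabilities); the only point is that a score computed purely from correlates of race and poverty reproduces a racial gap even though race is never asked.

```python
import random

random.seed(0)

def questionnaire_score(answers):
    """Hypothetical stand-in for a COMPAS-style score: race is never an input,
    but every question is a correlate of race and poverty."""
    score = 1
    score += 2 * answers["family_member_arrested"]  # yes/no answers, 0 or 1
    score += 2 * answers["friends_with_records"]
    score += 2 * answers["unstable_housing"]
    score += 3 * answers["unemployed"]
    return min(score, 10)  # clamp to the 1-10 scale

def synthetic_defendant(group):
    """Invented population in which over-policing and poverty make the proxy
    answers more common in one group than the other."""
    base_rate = 0.6 if group == "brown" else 0.2
    return {
        "family_member_arrested": int(random.random() < base_rate),
        "friends_with_records":   int(random.random() < base_rate),
        "unstable_housing":       int(random.random() < base_rate),
        "unemployed":             int(random.random() < base_rate),
    }

for group in ("brown", "white"):
    scores = [questionnaire_score(synthetic_defendant(group)) for _ in range(10_000)]
    print(group, round(sum(scores) / len(scores), 2))
# Race never appears in the scoring function, yet one group's average score is far higher.
```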

ProPublica investigated the predictions of these systems, particularly those of Northpointe (acquired in 2011 by Canadian company Constellation Software). They confirmed that the software's predictions are racially biased and incorrect: it scores brown people as higher risks for reoffending than their actual reoffense rates bear out, and scores white people as lower risks than they turn out to be.
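
ProPublica's test amounts to comparing the scores against what defendants actually did in the following two years, broken out by race. The sketch below, with made-up records and an assumed cutoff of five-and-above standing in for "higher risk", shows the kind of per-group false-positive and false-negative comparison that finding rests on.

```python
from collections import defaultdict

def error_rates_by_group(records, high_risk_threshold=5):
    """records: an iterable of (group, score_1_to_10, reoffended) tuples.
    Returns each group's false-positive rate (labelled high risk but did not
    reoffend) and false-negative rate (labelled low risk but did reoffend)."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, score, reoffended in records:
        c = counts[group]
        predicted_high = score >= high_risk_threshold
        if reoffended:
            c["pos"] += 1
            if not predicted_high:
                c["fn"] += 1
        else:
            c["neg"] += 1
            if predicted_high:
                c["fp"] += 1
    return {
        group: {
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else 0.0,
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else 0.0,
        }
        for group, c in counts.items()
    }

# Usage with made-up records: (group, decile score, reoffended within two years)
sample = [("brown", 8, False), ("brown", 7, True), ("white", 3, True), ("white", 2, False)]
print(error_rates_by_group(sample))
```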

The problem goes deeper than racism, though. Because the training data and models used by Northpointe's system are not subject to peer review or even basic scrutiny, no one can challenge their assumptions — meanwhile, the veneer of data-driven empiricism makes their numeric scores feel objective and evidence-based.

It's a deadly combination: the kind of system mathematician and former hedge-fund quant Cathy O'Neil calls a Weapon of Math Destruction. Her forthcoming book on the subject devotes a whole chapter to algorithmic risk-assessment.

But judges have cited scores in their sentencing decisions. In August 2013, Judge Scott Horne in La Crosse County, Wisconsin, declared that defendant Eric Loomis had been "identified, through the COMPAS assessment, as an individual who is at high risk to the community." The judge then imposed a sentence of eight years and six months in prison.

Loomis, who was charged with driving a stolen vehicle and fleeing from police, is challenging the use of the score at sentencing as a violation of his due process rights. The state has defended Horne's use of the score with the argument that judges can consider the score in addition to other factors. It has also stopped including scores in presentencing reports until the state Supreme Court decides the case.
"The risk score alone should not determine the sentence of an offender," Wisconsin Assistant Attorney General Christine Remington said last month during state Supreme Court arguments in the Loomis case. "We don't want courts to say, this person in front of me is a 10 on COMPAS as far as risk, and therefore I'm going to give him the maximum sentence."

That is almost exactly what happened to Paul Zilly, a 48-year-old construction worker sent to prison for stealing a push lawnmower and some tools he intended to sell for parts. Zilly has long struggled with a meth habit. In 2012, he had been working toward recovery with the help of a Christian pastor when he relapsed and committed the thefts.

After Zilly was scored as a high risk for violent recidivism and sent to prison, a public defender appealed the sentence and called the score's creator, Northpointe founder Tim Brennan, as a witness.

Brennan testified that he didn't design his software to be used in sentencing. "I wanted to stay away from the courts," Brennan said, explaining that his focus was on reducing crime rather than punishment. "But as time went on I started realizing that so many decisions are made, you know, in the courts. So I gradually softened on whether this could be used in the courts or not."

Machine Bias
[Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner/ProPublica]