The police in Durham, England, bought a license to the credit bureau Experian's "Mosaic" dataset, which includes data on 50,000,000 Britons, in order to train a machine learning system called HART ("Harm Assessment Risk Tool") that tries to predict whether someone will reoffend.
The Mosaic dataset attempts to group people based on their demographic characteristics, creating marketing categories with names like 'Disconnected Youth,' 'Asian Heritage' and 'Dependent Greys.' People are sorted into these categories based on a suite of criteria that includes their first names (people named "Chelsea" and "Liam" are likely to be classed as "Disconnected Youth"), their exam marks, the size of their gardens, and messages they've posted to pregnancy advice websites.
HART was co-developed with Cambridge University. Durham Police say they use its assessments to decide whom to offer additional support in order to prevent reoffending.
Experian's Mosaic code includes the 'demographic characteristics' of each stereotype – characterising 'Asian Heritage' as 'extended families' living in 'inexpensive, close-packed Victorian terraces', adding that 'when people do have jobs, they are generally in low paid routine occupations in transport or food service'.
'Disconnected Youth' are characterised as 'avid texters' whose 'wages are often low' – with first names like 'Liam' and 'Chelsea'.…
Experian's 'Mosaic' links names to stereotypes: for example, people called 'Stacey' are likely to fall under 'Families with Needs' who receive 'a range of benefits'; 'Abdi' and 'Asha' are 'Crowded Kaleidoscope' described as 'multi-cultural' families likely to live in 'cramped' and 'overcrowded flats'; whilst 'Terrence' and 'Denise' are 'Low Income Workers' who have 'few qualifications' and are 'heavy TV viewers'.
Police use Experian Marketing Data for AI Custody Decisions [Big Brother Watch]
(Images: Warner Brothers (fair use); Durham Constabulary (fair dealing); Cryteria, CC-BY)