Jeff Brantingham is as close as it gets to putting a face on the controversial practice of “predictive policing.” Over the past decade, the University of California, Los Angeles, anthropology professor adapted his Pentagon-funded research in forecasting battlefield casualties in Iraq to predicting crime for American police departments, patenting his research and founding a for-profit company named PredPol, LLC.
PredPol quickly became one of the market leaders in the nascent field of crime prediction around 2012, but also came under fire from activists and civil libertarians who argued the firm provided a sort of “tech-washing” for racially biased, ineffective policing methods.
Now, Brantingham is using military research funding for another tech and policing collaboration with potentially damaging repercussions: using machine learning, the Los Angeles Police Department’s criminal data, and an outdated gang territory map to automate the classification of “gang-related” crimes.
Being classified as a gang member or linked to a gang crime can result in additional criminal charges, heavier prison sentences, or inclusion in a civil gang injunction that restricts a person’s movements and ability to associate with other people. Generally, law enforcement determines gang links through a highly subjective, individualized assessment of criminal histories, arrests, interviews, and other intelligence. In recent years, activists in California, Illinois, and other states have pushed back against gang policing measures such as databases and gang injunctions, and in the case of California, succeeded in winning residents the right to review and appeal their gang classification.
But in a paper on “Partially Generative Neural Networks for Gang Crime Classification” presented in February at the inaugural Artificial Intelligence, Ethics, and Society (AIES) conference, Brantingham and his co-authors propose automating this complex and subjective assessment.
The paper attempts to predict whether crimes are gang-related using a neural network, a computational system loosely modeled on the human brain that “learns” to classify or identify items by ingesting a training dataset. Working from 2014–16 LAPD data, the authors selected what they determined to be the four most important features for identifying a gang-related crime (number of suspects, primary weapon used, the type of premises where the crime took place, and the narrative description of the crime) and cross-referenced the crime incidents with a 2009 LAPD map of gang territory to create a training dataset for their neural network.
Researchers tested the accuracy of the network’s predictions by seeing how well it classified crime data without one key feature: the narrative text description of the crime, the most time-consuming data for police to collect. This is where the “partially generative” aspect in the title comes in. In the absence of a written description, the neural network generates new text — effectively, an algorithmically written crime report based on the three other features used in the training model. The generated text isn’t actually read by anyone, nor is it presumed to provide meaningful narrative context replacing a police report, but it is turned into a mathematical vector and incorporated into a final prediction of whether a crime is gang-related.
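The data flow described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the numeric feature encodings, the two-element vector, and the hand-picked weights are hypothetical and are not taken from the paper, whose network learns its parameters from training data.

```python
import math

def generate_text_vector(num_suspects, weapon_code, premise_code):
    """Hypothetical stand-in for the generative step: maps the three
    non-text features to a fixed-length vector that takes the place of
    the missing narrative description. Weights are made up."""
    return [
        math.tanh(0.5 * num_suspects - 0.2 * premise_code),
        math.tanh(0.3 * weapon_code + 0.1 * num_suspects),
    ]

def classify(num_suspects, weapon_code, premise_code, text_vector=None):
    """Return a score between 0 and 1 for 'gang-related'.

    If no narrative vector is supplied, one is generated from the other
    three features, mirroring the 'partially generative' idea."""
    if text_vector is None:
        text_vector = generate_text_vector(num_suspects, weapon_code, premise_code)
    features = [num_suspects, weapon_code, premise_code] + list(text_vector)
    weights = [0.4, 0.3, -0.1, 0.8, 0.6]  # hypothetical, not learned
    bias = -1.0                           # hypothetical, not learned
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing to (0, 1)
```

The point of the sketch is only that the substitute "text" never has to be readable: it exists as a vector that feeds the final score.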
This paper is the first to be published by a research team co-led by Brantingham studying “Spatio-Temporal Game Theory & Real-Time Machine Learning for Adversarial Groups” at the University of Southern California’s Center for Artificial Intelligence and Society (CAIS). CAIS’ mission states a goal of “[sharing] our ideas about how AI can be used to tackle the most difficult societal problems.”
Funding for the USC research team that includes Brantingham’s project comes from the Minerva Initiative, a Pentagon research program intended to improve the military’s understanding of social, political, and behavioral drivers of conflict. According to the Minerva Initiative website, funding is provided to projects that address “specific topic areas determined by the Secretary of Defense.” Via email, CAIS co-founder and paper co-author Milind Tambe said that the Minerva grant for this project is “roughly” $1.2 million, to be distributed over three years.
The website for the research team’s efforts, including the gang classification paper, opens with references to ISIS and Jabhat al-Nusra before shifting to the terrain of Los Angeles street gangs, a conflation that echoes Brantingham’s earlier DOD-funded work that led him to co-found PredPol. PredPol has sold its services to police everywhere from California to Georgia, as well as the United Kingdom. In 2015, PredPol unsuccessfully lobbied the Arizona legislature to approve a $2 million appropriation bill to use the firm’s forecasting technology to predict gang activity.
First reported in Science, the paper met with significant concern over its ethical implications. However, reporting on the paper and its fallout made no mention of Brantingham’s business connections to PredPol or the military funding of his past and present research.
When asked in a phone interview about whether this research might inform future business endeavors, Brantingham said, “This is a separate project, and that’s how we’re thinking about it.” Pointing out that it took a decade for his previous military-funded research to become PredPol, Brantingham emphasized that the paper reflected very preliminary work. “It’s our job to do careful basic research and make sure we understand how and why things are the way they are, long before any thoughts of use in the field might be contemplated.”
However preliminary the research might be and however good its authors’ intentions are, the paper and Brantingham’s involvement raise eyebrows with critics of increasingly automated data-driven policing tech.
Aaron Harvey, a San Diego resident, successfully fended off charges of gang conspiracy from the local prosecutor that could have landed him in prison for over a decade. He has since become a prominent California activist pushing back against the state’s gang laws, which are the oldest and most severe in the United States.
“Any time you take out the human perspective or interaction, I don’t believe there’s any positives,” Harvey said of Brantingham’s research. Aside from removing human discretion from the process, Harvey believed that automating such decisions based on historical criminal data from police departments alone would only reinforce past allegations of gang involvement, whether they were true or not. “You’re making algorithms off a false narrative that’s been created for people — the gang documentation thing is the state defining people according to what they believe,” Harvey said. “When you plug this into the computer, every crime is gonna be gang-related.”
Christo Wilson, assistant professor in computer and information science at Northeastern University and a co-organizer of the Fairness, Accountability, and Transparency in Machine Learning conference, also has concerns about the model’s potential to reinforce errors and biases. “If I train a model to predict people’s height, we know how to interpret the output and gauge its accuracy.” But, Wilson noted, “gang-related” is a complex, subjective determination. “So the algorithm is accurate at predicting what? Whether LAPD officers would label a crime as gang-related. Now, maybe the LAPD is 100 percent objective in their determinations of what is and is not gang-related. But if they are not, then the algorithm is going to reproduce their errors and biases.”
There is ample evidence in the public record of widespread inaccuracies in gang data — a 2016 state audit of California’s CalGang database found rampant errors, files that should have been purged years earlier, and unsubstantiated claims of gang involvement.
Micha Gorelick, senior research engineer at machine intelligence research company Cloudera Fast Forward Labs, adds a further objection: the training data assumes gang territories haven’t shifted in at least five years. When asked about the use of the 2009 map with 2014–16 crime data, Brantingham said that it was the most recent one available to him and that “There is some movement of territories over time but not as much as you would think, actually.”
Harvey, who grew up in the Blood-affiliated neighborhood of Lincoln Park in southeast San Diego, pointed out that gang territories and allegiances are highly fluid, and five years is an eternity in street life. “You’re able to come up with a conclusion of something and never have that on-the-ground interaction with the community,” Harvey said of Brantingham’s research approach.
Gorelick says many of the technical decisions in the paper are overly simplistic and technically rudimentary, but he believes the gang territory map is “the most nefarious of the features used.” Evaluating the likelihood a crime is gang-related based on a blanket labeling of a neighborhood as a gang territory “is encoding geographic bias, which, especially in a place like LA, is encoding racial bias.”
Wilson also pointed out that the paper fails to incorporate documented approaches to evaluating biased outcomes in machine learning: “The authors could have [looked at] whether their algorithm achieves statistical parity across races and ethnicities … They also could have looked for so-called disparate mistreatment by looking to see if the classification errors are evenly spread across these groups. But they did none of this, even though the methods to do so are well-known in the fair algorithms and even the predictive policing literature.”
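The two checks Wilson describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any particular study: the function names and the `(group, true_label, predicted_label)` record layout are hypothetical, and labels are assumed to be 0 or 1. Statistical parity compares how often each group is flagged; disparate mistreatment compares false positive and false negative rates across groups.

```python
from collections import defaultdict

def group_rates(records):
    """records: iterable of (group, true_label, predicted_label), labels 0/1.
    Returns per-group flag rate, false positive rate, false negative rate."""
    counts = defaultdict(lambda: {"n": 0, "flagged": 0, "pos": 0,
                                  "neg": 0, "fp": 0, "fn": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["flagged"] += y_pred
        if y_true == 1:
            c["pos"] += 1
            c["fn"] += int(y_pred == 0)  # missed a true positive
        else:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)  # wrongly flagged a negative
    return {
        group: {
            "positive_rate": c["flagged"] / c["n"],
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
        }
        for group, c in counts.items()
    }

def max_gap(rates, key):
    """Largest difference in one rate between any two groups (0 = parity)."""
    values = [r[key] for r in rates.values() if r[key] is not None]
    return max(values) - min(values)
```

A large `max_gap` on `positive_rate` would signal a statistical parity problem, while large gaps on the two error rates would signal disparate mistreatment. Both checks still inherit whatever bias is baked into the "true" labels.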
As for Brantingham’s and his co-authors’ insistence on how preliminary this research is, Wilson noted a similar defense was used for controversial research using AI to identify sexual orientation or criminality. Like Brantingham’s paper, “both of these studies also had fundamental methodological problems. But that doesn’t obviate the essential ethics of the research itself: should we be doing this research at all?”
Milind Tambe of USC’s Center for Artificial Intelligence and Society emphasized that his research center, which houses Brantingham’s new research and collaborates with the university’s school of social work, “focuses on AI for Social Good,” and that this preliminary research contributes to improving domain understanding to facilitate said social good. But the number of criticisms of the technical and ethical shortcomings of the paper raises questions about whose version of social good is being served by this research.
For years, PredPol has been plagued by criticism over the lack of depth, richness, and rigor the software brings to policing. This new line of research suggests that Brantingham has not taken critiques of his research methodology to heart and is pressing forward with a project that is founded on incomplete data, dubious methods, and a premise that, if applied in the field, could result in more people of color behind bars.
Correction: An earlier version of this report quoted Hau Chan, one of the researchers involved in the 2018 AIES paper, as responding to ethical concerns by saying “I’m just an engineer,” which he did not say. The erroneous quote was based on a transcription error in the Science report that originally reported the remarks. Chan’s remarks are more accurately quoted as “as a researcher I don’t know what’s the appropriate answer for that question.” The erroneous quote has been removed, and copy has been updated with accurate context. The Verge regrets the error.