Weapons of Math Destruction

Read an excerpt from Cathy O’Neil’s book on how Big Data increases inequality and threatens democracy



In the age of the algorithm, mathematical models are everywhere—on Facebook, on journalism sites like FiveThirtyEight.com, on your dating profile, and even in your doctor’s office. But Big Data isn’t as benign as you might think. As Cathy O’Neil writes in her new book, Weapons of Math Destruction, many of these models encode “human prejudice, misunderstanding, and bias into the software systems that increasingly [manage] our lives.”

And these models can be a matter of life and death. Take the criminal justice system. Courts in twenty-four states currently use recidivism models to determine the danger, and thus the sentencing, of convicted criminals. These models have kept sentences more consistent (and often shorter), but as O’Neil points out in this excerpt, assumptions and biases can be camouflaged by technology.

One of the more popular models, known as LSI–R, or Level of Service Inventory–Revised, includes a lengthy questionnaire for the prisoner to fill out. One of the questions (“How many prior convictions have you had?”) is highly relevant to the risk of recidivism. Others are also clearly related: “What part did others play in the offense? What part did drugs and alcohol play?”

But as the questions continue, delving deeper into the person’s life, it’s easy to imagine how inmates from a privileged background would answer one way and those from tough inner-city streets another. Ask a criminal who grew up in comfortable suburbs about “the first time you were ever involved with the police,” and he might not have a single incident to report other than the one that brought him to prison. Young black males, by contrast, are likely to have been stopped by police dozens of times, even when they’ve done nothing wrong. A 2013 study by the New York Civil Liberties Union found that while black and Latino males between the ages of fourteen and twenty-four made up only 4.7 percent of the city’s population, they accounted for 40.6 percent of the stop-and-frisk checks by police. More than 90 percent of those stopped were innocent. Some of the others might have been drinking underage or carrying a joint. And unlike most rich kids, they got in trouble for it. So if early “involvement” with the police signals recidivism, poor people and racial minorities look far riskier.

The questions hardly stop there. Prisoners are also asked about whether their friends and relatives have criminal records. Again, ask that question to a convicted criminal raised in a middle-class neighborhood, and the chances are much greater that the answer will be no. The questionnaire does avoid asking about race, which is illegal. But with the wealth of detail each prisoner provides, that single illegal question is almost superfluous.

The LSI–R questionnaire has been given to thousands of inmates since its invention in 1995. Statisticians have used those results to devise a system in which answers highly correlated to recidivism weigh more heavily and count for more points. After answering the questionnaire, convicts are categorized as high, medium, and low risk on the basis of the number of points they accumulate. In some states, such as Rhode Island, these tests are used only to target those with high-risk scores for antirecidivism programs while incarcerated. But in others, including Idaho and Colorado, judges use the scores to guide their sentencing.
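The point-based scoring described above can be sketched in a few lines of Python. This is only an illustration of the general idea: the item names, weights, and cutoffs are all invented for the example, since the actual LSI–R weights are not given here.

```python
# Illustrative sketch only. Every item name, weight, and cutoff below is
# hypothetical; none of them comes from the real LSI-R instrument.
HYPOTHETICAL_WEIGHTS = {
    "prior_convictions": 3,
    "early_police_contact": 2,
    "friends_or_family_with_records": 2,
    "unemployed": 1,
}

# Hypothetical cutoffs: (minimum points, label), checked from highest to lowest.
HYPOTHETICAL_BANDS = [(5, "high"), (3, "medium"), (0, "low")]


def score(answers):
    """Add up the weight of every item answered 'yes' (True)."""
    return sum(
        weight
        for item, weight in HYPOTHETICAL_WEIGHTS.items()
        if answers.get(item, False)
    )


def risk_band(points):
    """Map a point total to the first band whose minimum it meets."""
    for minimum, label in HYPOTHETICAL_BANDS:
        if points >= minimum:
            return label
    return "low"


# Two respondents with the same conviction history but different backgrounds.
suburban = {"prior_convictions": True}
inner_city = {
    "prior_convictions": True,
    "early_police_contact": True,
    "friends_or_family_with_records": True,
}

print(risk_band(score(suburban)))
print(risk_band(score(inner_city)))
```

Under these made-up numbers, the two respondents share the same conviction history but land in different risk bands, because the background items carry nonzero weight.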

This is unjust. The questionnaire includes circumstances of a criminal’s birth and upbringing, including his or her family, neighborhood, and friends. These details should not be relevant to a criminal case or to the sentencing. Indeed, if a prosecutor attempted to tar a defendant by mentioning his brother’s criminal record or the high crime rate in his neighborhood, a decent defense attorney would roar, “Objection, Your Honor!” And a serious judge would sustain it. This is the basis of our legal system. We are judged by what we do, not by who we are. And although we don’t know the exact weights that are attached to these parts of the test, any weight above zero is unreasonable.

Many would point out that statistical systems like the LSI–R are effective in gauging recidivism risk, or at least more accurate than a judge’s random guess. But even if we put aside, ever so briefly, the crucial issue of fairness, we find ourselves descending into a pernicious WMD feedback loop. A person who scores as “high risk” is likely to be unemployed and to come from a neighborhood where many of his friends and family have had run-ins with the law. Thanks in part to the resulting high score on the evaluation, he gets a longer sentence, locking him away for more years in a prison where he’s surrounded by fellow criminals, which raises the likelihood that he’ll return to prison. He is finally released into the same poor neighborhood, this time with a criminal record, which makes it that much harder to find a job. If he commits another crime, the recidivism model can claim another success. But in fact the model itself contributes to a toxic cycle and helps to sustain it. That’s a signature quality of a WMD.

Copyright © 2016 by Cathy O’Neil. Published by Crown Publishing Group, an imprint of Penguin Random House LLC.

Permission required for reprinting, reproducing, or other uses.

Stephanie Bastek is the senior editor of the Scholar and the producer/host of the Smarty Pants podcast.

