to Prevent Child Abuse,” declared a headline on a story about that favorite child welfare fad, predictive analytics.
The story quotes Paula Bennett, New Zealand’s former minister of social development, declaring at a conference: “We now have a golden opportunity in the social sector to use data analytics to transform the lives of vulnerable children.”
If implemented, the story enthuses, it would be a “world first.”
All this apparently was based on two studies that, it now turns out, used methodology so flawed that it’s depressing to think things ever got this far. That’s revealed in a new analysis by Emily Keddell.
The studies that supposedly proved the value of predictive analytics attempted to predict which children would turn out to be the subjects of “substantiated” reports of child maltreatment.
Among the children identified by the software as being at the very highest risk, between 32 and 48 percent were, in fact, “substantiated” victims of child abuse. But that means more than half to more than two-thirds were false positives.
Think about that for a moment. A computer tells a caseworker that he or she is about to investigate a case in which the children are at the very highest level of risk. What caseworker is going to defy the computer and leave these children in their homes, even though the computer is wrong more than half the time?
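The false-positive arithmetic above can be restated in a few lines of Python. The 32 and 48 percent figures are the ones reported for the studies; the function name here is ours, for illustration only:

```python
# Illustrative arithmetic only: restates the studies' reported precision
# (share of highest-risk flags that were "substantiated") as false-positive rates.
def false_positive_rate(precision: float) -> float:
    """Share of flagged families that were NOT substantiated."""
    return 1.0 - precision

low, high = 0.32, 0.48  # reported "substantiated" shares among highest-risk flags
print(f"False positives: {false_positive_rate(high):.0%} to {false_positive_rate(low):.0%}")
# → False positives: 52% to 68%
```

In other words, even taking "substantiation" at face value, the tool is wrong more often than it is right for the very families it flags as highest risk.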
But there’s an even bigger problem. Keddell concludes that “child abuse” is so ill-defined and so subjective, and caseworker decisions are so subject to bias, that “substantiation” is an unreliable measure of the predictive power of an algorithm. It turns out substantiation is not consistent, it does not represent incidence, and the picture it gives is skewed.
That problem may be compounded, Keddell says, by racial and class bias, by whether a poor neighborhood is surrounded by wealthier ones (substantiation is more likely in such neighborhoods), and even by the culture of a given child protective services office.
Algorithms don’t counter these biases; they magnify them.
Having a previous report of maltreatment typically increases the risk score. If it’s “substantiated,” the risk score is likely to be even higher. So then, when another report comes in, the caseworker, not about to overrule the computer, substantiates it again, making this family an even higher “risk” the next time. At that point, it doesn’t take a computer to tell you the children are almost certainly headed to foster care.
So the predictive analytics become a self-fulfilling prophecy.
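The ratchet described above can be sketched as a toy simulation. Every number here is invented for illustration; nothing models any real risk tool. The mechanism is just the one in the text: each report raises the risk score, a substantiated report raises it more, and a higher score makes the caseworker likelier to defer to the computer and substantiate the next report.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def substantiation_prob(risk_score: int) -> float:
    # Hypothetical: caseworkers defer to the computer, so a higher risk
    # score makes "substantiation" of the next report more likely.
    return min(0.2 + 0.15 * risk_score, 0.95)

risk_score = 0
for report in range(1, 6):
    substantiated = random.random() < substantiation_prob(risk_score)
    # Any report raises the score; a substantiated one raises it more.
    risk_score += 2 if substantiated else 1
    print(f"report {report}: substantiated={substantiated}, risk score={risk_score}")
```

Run it and the score only ever climbs: each substantiation makes the next one more probable, which is exactly the self-fulfilling prophecy.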
Keddell also highlights the problems that arise when even accurate data are misused by fallible human beings.
But it turns out there may be one area where predictive analytics can be helpful. Keddell cites two studies in which variations on analytics were used to detect caseworker bias. In one, the researchers could predict which workers were more likely to recommend removing children based on questionnaires assessing the caseworkers’ personal values.
In another, the decisions could be predicted by which income level was described in hypothetical scenarios. At least one other study has used similar methodology.
So how about channeling all that energy now going into new ways to data-nuke the poor into something much more useful: algorithms to detect the racial and class biases among child welfare staff? Then we can protect children from biased decisions by “high risk” caseworkers.