Monday, November 14, 2016

Big data loses. Bigly.


For those interested in knowing more about the problems with predictive analytics in child welfare, NCCPR’s full analysis is available in our report Big Data Is Watching You.


I will leave it to others to guess what the election of Donald Trump means for child welfare policy. I'll point out only that, on top of all the other reasons to worry, the only one of his close advisers who, as far as I know, has ever thought about the topic – Newt Gingrich – has suggested throwing poor people’s children into orphanages.

But we shouldn’t let the postmortems go by without child welfare learning from one of the biggest losers on election night: predictive analytics.

This was the year when an amazing number of organizations, from FiveThirtyEight to The New York Times to the Princeton Election Consortium, were all crunching numbers, assessing data points and putting it all together into algorithms that were going to tell us who was going to win the presidential election. The overwhelming consensus: Hillary Clinton.

The same predictive analytics number crunchers also had algorithms to tell us which party would wind up in control of the Senate. The not-quite-as-overwhelming consensus: the Democrats.

Speaking just for myself, I wish all those predictions had been correct. Instead, we are left with two YUGE “false positives.”

“American voters just tossed an ice cold bucket of reality on those who argue that Big Data is here and now, and ready to run everything,” writes Forbes columnist John Carpenter.

“Tonight, data died,” Mike Murphy, a Republican strategist opposed to Donald Trump, told MSNBC.


It was a rough night for number crunchers. And for the faith that people in every field … have increasingly placed in the power of data. [Emphasis added]

If all those number crunchers can’t figure out one presidential election, why should anyone trust predictive analytics to tell us which parents are going to harm their children? Yet that, of course, is what some in child welfare seriously want us to do.

The hubris behind that effort is astounding, and dangerous. The various data gurus who got the election forecasts wrong suffer nothing worse than public humiliation. Wrongly predict that a parent is going to abuse a child and there’s an excellent chance the child will suffer the enormous trauma of unnecessary foster care – and workers will be overloaded with all those needless removals, leaving them less time to find children in real danger.

The predictive analytics problem goes beyond elections

Of course, one could argue that even a slew of bad election predictions (it wasn’t just one election; Big Data was wrong about several states) is not, by itself, enough to say we should not press forward with letting algorithms tell us when to tear apart a family. The botched election predictions could be thought of as just a horror story – and we all know people in child welfare never, ever base policy decisions on horror stories, right?

But the Times says the lessons go much deeper:

… undercutting the belief that analyzing reams of data can accurately predict events. Voters demonstrated how much predictive analytics, and election forecasting in particular, remains a young science …

  data science is a technology advance with trade-offs. It can see things as never before, but also can be a blunt instrument, missing context and nuance … But only occasionally — as with Tuesday’s election results — do consumers get a glimpse of how these formulas work and the extent to which they can go wrong …

The danger, data experts say, lies in trusting the data analysis too much without grasping its limitations and the potentially flawed assumptions of the people who build predictive models.

Flawed assumptions built into the models were the root of the rampant racial bias and epidemic of false positives that an in-depth ProPublica study found in the use of analytics in criminal justice. Prof. Emily Keddell found much the same when she examined bias and false positives specific to predictive analytics in child welfare.

The Times story also includes a lesson for those who insist they can control how analytics is used – those who say they’ll use it only to target prevention, not to decide when to tear apart families:

Two years ago, the Samaritans, a suicide-prevention group in Britain, developed a free app to notify people whenever someone they followed on Twitter posted potentially suicidal phrases like “hate myself” or “tired of being alone.” The group quickly removed the app after complaints from people who warned that it could be misused to harass users at their most vulnerable moments.

The failure of predictive analytics in this election should be one more warning not to data-nuke poor families. It should, but, as usual when the arrogance of software companies meets the arrogance of the child welfare field, it probably won’t.