Food, Bias, and Justice: a Case for Statistical Prediction Rules
April 14, 2011
We’re remarkably bad at making good decisions. Even when we know what goal we’re pursuing, we make mistakes predicting which actions will achieve it. Are there strategies we can use to make better policy decisions? Yes – we can gain insight by looking at cognitive science.
On the surface, all we need to do is experience the world and figure out what does and doesn’t work at achieving our goals (the focus of instrumental rationality). That’s why we tend to respect expert opinion: experts have far more experience with an issue and have considered and evaluated different approaches.
Let’s take the example of deciding whether or not to grant prisoners parole. If the goal is to reduce repeat offenses, we tend to trust a panel of expert judges who evaluate the case and use their subjective opinion. They’ll do a good job, or at least as good a job as anyone else, right? Well… that’s the problem: everyone does a pretty bad job. Quite frankly, even experts’ decision-making is influenced by factors that are unrelated to the matter at hand. Ed Yong calls attention to a fascinating study which finds that a prisoner’s chance of being granted parole is strongly influenced by when their case is heard in relation to the judges’ snack breaks:
The graph is dramatic. It shows that the odds that prisoners will be successfully paroled start off fairly high at around 65% and quickly plummet to nothing over a few hours (although, see footnote). After the judges have returned from their breaks, the odds abruptly climb back up to 65%, before resuming their downward slide. A prisoner’s fate could hinge upon the point in the day when their case is heard.
Curse our fleshy bodies and their need for “Food” and “breaks”! It’s obviously a problem that human judgment is influenced by irrelevant, quasi-random factors. How can we counteract those effects?
Statistical Prediction Rules do better
Fortunately, we have science and statistics to help. We can objectively record evidential cues, look at the resulting target property, and find correlations. Over time, we can build an objective model, meat-brain limitations out of the way.
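To make that concrete, here’s a minimal sketch of what fitting such a rule could look like, given a small table of historical cases. Every cue name and number below is hypothetical, chosen only to show the shape of the approach:

```python
# A minimal sketch of fitting a Statistical Prediction Rule (SPR).
# All cue names and numbers are hypothetical, for illustration only.
import numpy as np

# Each row of cues: [prior offenses, age at release, months served]
cues = np.array([
    [4.0, 22.0, 18.0],
    [0.0, 45.0,  6.0],
    [2.0, 31.0, 24.0],
    [7.0, 19.0, 36.0],
    [1.0, 52.0, 12.0],
])
outcomes = np.array([1.0, 0.0, 0.0, 1.0, 0.0])  # 1 = violated parole

# Fit a simple linear model by least squares: outcome ~ w . cues + b
X = np.column_stack([cues, np.ones(len(cues))])  # append an intercept
weights, *_ = np.linalg.lstsq(X, outcomes, rcond=None)

def spr_score(prior_offenses, age, months_served):
    """Plug a new case's cues into the fitted equation."""
    return float(np.dot([prior_offenses, age, months_served, 1.0], weights))

print(spr_score(3, 27, 20))  # higher score = higher predicted risk
```

Once the weights are fit, a prediction is one dot product – no room for a judge’s blood sugar to sneak in.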
Building such objective models was the advice of Bishop and Trout in “Epistemology and the Psychology of Human Judgment”, an excellent book recommended by Luke Muehlhauser of Common Sense Atheism (and a frequent contributor to Less Wrong).
Bishop and Trout argued that we should use such Statistical Prediction Rules (SPRs) far more often than we do. Not only are they faster; they turn out to be more trustworthy: using the same amount of information (or often less), a simple mathematical model consistently outperforms expert opinion.
They point out that when Grove and Meehl surveyed 136 different studies comparing an SPR to expert opinion, they found that “64 clearly favored the SPR, 64 showed approximately equivalent accuracy, and 8 clearly favored the clinician.” The target properties the studies were predicting ranged from medical diagnoses to academic performance to – yup – parole violation and violence.
So based on some cues, a Statistical Prediction Rule would probably give a better prediction than the judges on whether a prisoner will break parole or commit a crime. And they’d do it very quickly – just by putting the numbers into an equation! So all we need to do is show the judges the SPRs and they’ll save time and do a better job, right? Well, not so much.
Knowing Statistics Isn’t Enough
In what Ben Goldacre might call a “properly insane” result, SPRs do better even when the experts are shown the SPR’s prediction! The hunger that seemed to affect our judges is one problem, but so are the systematic biases in our reasoning. One great example is the classic “broken leg” situation.
Suppose we had an actuarial table giving the odds that a person is at the movies, based on cues like age, economic status, and gender. Asked to predict for a particular person, we could do pretty well by following the table. But if we knew that person was in a cast with a broken leg, we might ignore the table’s stats and go with a good guess of “they’re lying in the hospital, not at the movies”. That’s “defecting” from the strategy: choosing to disregard our statistical approach because we think a different strategy would be better in this case.
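A toy sketch of that difference (all numbers invented) might look like this:

```python
# A toy illustration of "defecting" from a statistical rule.
# The table stands in for the actuarial model; every number is made up.
movie_base_rate = {   # hypothetical P(at the movies) by age bracket
    "18-29": 0.30,
    "30-49": 0.20,
    "50+":   0.10,
}

def spr_prediction(age_bracket):
    """Follow the statistical model, no exceptions."""
    return movie_base_rate[age_bracket]

def expert_prediction(age_bracket, has_broken_leg):
    """Defect from the model when a special cue seems decisive."""
    if has_broken_leg:
        return 0.01  # "they're lying in the hospital, not at the movies"
    return spr_prediction(age_bracket)

print(spr_prediction("30-49"))           # 0.2  -- the rule's answer
print(expert_prediction("30-49", True))  # 0.01 -- the override
```

The broken-leg case is one where the override is obviously right; the catch is how rarely our overrides are actually that clear-cut.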
It makes sense – the actuarial table doesn’t apply to this case! Unfortunately, we’re really bad at telling when to abandon the statistical model. Not only are we bad at it, we don’t realize how bad we are: we’re prone to spot false connections and miss real ones, yet remain overconfident that we’re right. The temptation to defect from the statistical recommendation is tough to resist – after all, we feel a sense of confidence that it’s wrong! But, rationally, we need to remember that the feeling of confidence is far more weakly correlated with being correct than we think.
We simply can’t get over the sense that, studies be damned, OUR judgment is special! Bishop and Trout lament this “epistemic exceptionalism”:
“When gatekeepers avail themselves of unstructured interviews, they actually degrade the reliability of their predictions. Although the interview effect is one of the most robust findings in psychology, highly educated people ignore its obvious practical implication. We suspect that this occurs because of our confidence in our subjective ability to ‘read’ people. We suppose that our insight into human nature is so powerful that we can plumb the depths of a human being in a 45-minute interview – unlike the lesser lights who were hoodwinked in the SPR studies.”
We need to urge people to trust the statistics more and their intuition less. Yes, while gathering data on a new issue, or on a range of cases we suspect differs from previous models, it can be best to trust expert opinion. Similarly, experts sometimes spot systematic flaws in an SPR, letting us update and improve it. But when we’re choosing between the two reasoning strategies, the comparison is between the model and unaided human judgment – and on that comparison, the model has the track record.
In our society we trust people to make judgment calls on everything from diagnoses and treatments to finances and law enforcement. Factors like our overconfidence, confirmation bias, and the munchies lead us astray. But we have science and statistics for a reason – they’ve been demonstrably more reliable than our unaided judgment. Don’t you want important decisions made with the most reliable strategy?
(Luke also wrote about SPRs on Less Wrong – it’s worth checking out.)
Comments

I associate understanding with making good predictions. If an “expert” can’t make good predictions, what makes him/her/it an expert?
The idea of using simple mathematical models to predict the behavior of complex systems is addressed here:
http://xkcd.com/793/
First of all, 10 points for being able to bring in a relevant webcomic – I love it.
I think many people consider someone an ‘expert’ if they’ve studied more of the relevant literature, spent more time doing something, and thought about it more. Those things are correlated with understanding and the ability to make good predictions, but perhaps not as strongly as is widely believed. Our biases still get in the way, in spite of all that studying. Sometimes, if the ‘expert’ studied a bogus theory, they’ll actually be worse!
I considered using the word ‘veteran’ but didn’t want to confuse it with a military veteran.
The audience is certainly less receptive to presentations given right before lunchtime.
However, what about positive biases? For example, working as a waiter, it is statistically true that foreigners, African-Americans, and Jewish people are poorer tippers. Logically, then, wouldn’t I be better served by spending less effort on them and maximizing the gain from better tippers? I choose to have a positive bias in favor of believing in people despite evidence to the contrary, because I don’t like the way I act under the alternative. Similarly, isn’t bias also present in the formation of statistics? It’s possible that all poor tippers are connected by one measurable statistic, ignored in favor of arbitrary distinctions based on ethnicity.
How do we account for systematic bias? Since humans cannot be removed from the equation, isn’t it also impossible to remove human bias?
Sam,
Remember that the actions of foreigners, African-Americans, and Jewish people may be influenced by other factors. It may be the case that those subsets of people statistically tip poorly because they statistically receive poorer service from their waiters. Or perhaps not.
My point, though, is that there is always an interplay of various factors.