Pulling levers, killing monsters: the lure of unpredictable rewards

In an interesting recent TED Talk, video game designer Tom Chatfield explained how his field has honed the art of dispensing rewards at precisely the right rate to keep people hooked on the game. Chatfield used a simple quest as a case study: you kill a monster, you earn a pie, and your goal is to collect 15 pies. Simple, silly, but this is more or less the template used by a huge number of video games, once you boil them down to their essentials.

As a game designer, you don’t want to give your player a pie every time he kills a monster. That would quickly get boring and the player would lose interest. But you also don’t want to give him pies so rarely that he gives up out of frustration. So what’s the ideal reward-rate? From looking at their data on the behavior of hundreds of millions of players, game designers have figured out that giving people a reward about 25 percent of the time will keep them playing the longest.

(As an interesting side note, Chatfield explains that the ideal reward-rate fluctuates depending on how close the player is to meeting his quota of pies: “When people get to about 13 out of 15 pies, their perception shifts, they start to get a bit bored, a bit testy. They’re not rational about probability. They think this game is unfair. It’s not giving me my last two pies. I’m going to give up.” So to compensate for players’ skewed perception of probabilities which kicks in around 13-out-of-15 pies, game developers often change the reward rate at that point, increasing it from a 25 percent chance of getting a pie after killing a monster to a 75 percent chance.)
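To get a feel for what those numbers mean in practice, here is a rough sketch of the pie quest as a simulation. The 25 percent and 75 percent rates, the 15-pie goal, and the switch at 13 pies come from Chatfield's example; the function names and the idea of counting kills per quest are my own illustrative assumptions, not anything a real game is known to use.

```python
import random

def kills_to_finish(goal=15, base_rate=0.25, boosted_rate=0.75,
                    boost_at=13, rng=None):
    """Count the monsters killed before the player holds `goal` pies.

    Hypothetical model: each kill drops a pie with probability `base_rate`,
    rising to `boosted_rate` once the player already has `boost_at` pies.
    """
    rng = rng or random.Random()
    pies = 0
    kills = 0
    while pies < goal:
        kills += 1
        rate = boosted_rate if pies >= boost_at else base_rate
        if rng.random() < rate:
            pies += 1
    return kills

def average_kills(trials=10_000, seed=0, **quest_options):
    """Average kills per quest over many simulated players."""
    rng = random.Random(seed)
    total = sum(kills_to_finish(rng=rng, **quest_options) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    print("flat 25% rate:   ", average_kills(boosted_rate=0.25))
    print("boost at 13 pies:", average_kills())
```

On paper, a flat 25 percent rate means 15 / 0.25 = 60 kills on average, while the boosted version needs about 13 / 0.25 + 2 / 0.75, or roughly 55. The boost trims only a handful of kills from the quest, but it trims them from exactly the stretch where players are most likely to quit.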

You’ve probably heard of this phenomenon in the context of casinos, where the unpredictability of rewards is what keeps people sitting in front of slot machines, feeding in their chips and pulling a lever for hours on end. But it was originally rats, not humans, whose lever-pulling addictions revealed the power of what psychologists and animal trainers now call “variable-ratio schedule reinforcement.” Psychologist B. F. Skinner, in his experiments on rats in the 1940s and 50s, found that rats that were given a food pellet after pulling a lever a fixed number of times would quickly abandon the behavior if the pellets stopped coming on schedule. But rats whose food pellets appeared after a random number of pulls were hooked: they would continue pulling the lever even if it had been ages since they last received a pellet, always hoping that this time, they’d get their reward.
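The difference between the two schedules is easy to see if you write them down. The sketch below models only the reward schedules, not the rats: the ratio of 4 is an arbitrary number I picked, and the "variable" version simply pays out on each pull with probability 1 in 4, which is one simple way to get a random number of pulls between pellets.

```python
import random

def fixed_ratio(n_pulls, ratio=4):
    """Reward every `ratio`-th pull, like clockwork."""
    return [(pull + 1) % ratio == 0 for pull in range(n_pulls)]

def variable_ratio(n_pulls, mean_ratio=4, seed=0):
    """Reward each pull with probability 1/mean_ratio, so the number of
    pulls between rewards is random but averages `mean_ratio`."""
    rng = random.Random(seed)
    return [rng.random() < 1 / mean_ratio for _ in range(n_pulls)]

def longest_dry_spell(rewards):
    """Longest run of consecutive unrewarded pulls."""
    longest = 0
    current = 0
    for rewarded in rewards:
        current = 0 if rewarded else current + 1
        longest = max(longest, current)
    return longest

if __name__ == "__main__":
    for name, schedule in [("fixed", fixed_ratio(1000)),
                           ("variable", variable_ratio(1000))]:
        print(name, "rewards:", sum(schedule),
              "longest dry spell:", longest_dry_spell(schedule))
```

Both schedules hand out roughly the same number of pellets, but on the fixed schedule the longest dry spell is always exactly 3 pulls, so a fourth empty pull is an unmistakable sign that something has changed. On the variable schedule long dry spells are routine, so a rat has no clear signal that the pellets have stopped for good.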

“A variable schedule is far more effective in maintaining behavior than a constant, predictable schedule of reinforcement,” wrote animal trainer Karen Pryor in Don’t Shoot the Dog: The New Art of Teaching and Training. In fact, a variable schedule can actually motivate the animal to perform its task not just longer, but better:

“If I were to give a dolphin a fish every time it jumped, very quickly the jump would become as minimal and perfunctory as the animal could get away with. If I then stopped giving fish, the dolphin would quickly stop jumping. However, once the animal had learned to jump for fish, if I were to reinforce now the first jump, then the third, and so on at random, the behavior would be much more strongly maintained; the unrewarded animal would actually jump more and more often, hoping to hit the lucky number, as it were, and the jumps might even increase in vigor.”

I’ve personally noticed the motivating appeal of unpredictable rewards when dieting. If you step on the scale every day, assuming you’re sticking to your diet, you’ll see an overall downward trend over time. But on any particular day, your weight might not be lower than the previous day’s reading — it might even be slightly higher. That’s because your weight fluctuates naturally from day to day, within a pound or two, depending on whether you’ve eaten recently, whether you’re retaining water, and so on. Even though the expected value of the decline per day is fixed as long as you stick to your diet, on any given day there’s nevertheless some noise, some uncertainty about whether you actually are going to see a lower number than you did the day before — and if you do, you get a little hit of endorphins.
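Here is a small sketch of that scale-watching experience, under assumptions I made up for illustration: a steady true loss of 0.2 pounds a day, and a daily reading that wobbles by up to a pound in either direction (drawn uniformly, which is a simplification of real water and meal fluctuations). It counts how often today's reading comes in below yesterday's.

```python
import random

def fraction_of_lower_days(diets=200, days=60, daily_loss=0.2,
                           noise=1.0, start=180.0, seed=0):
    """Fraction of mornings on which the scale reads lower than the day before.

    Hypothetical model: true weight falls by `daily_loss` pounds per day,
    and each reading is off by a uniform random amount within +/- `noise`.
    """
    rng = random.Random(seed)
    lower = 0
    comparisons = 0
    for _ in range(diets):
        readings = [start - daily_loss * day + rng.uniform(-noise, noise)
                    for day in range(days)]
        for yesterday, today in zip(readings, readings[1:]):
            lower += today < yesterday
            comparisons += 1
    return lower / comparisons

if __name__ == "__main__":
    print("noisy scale:  ", fraction_of_lower_days())
    print("perfect scale:", fraction_of_lower_days(noise=0.0))
```

With these particular numbers, the noisy scale shows a lower reading on only about 60 percent of mornings even though the diet is working every single day, while the perfect scale shows one every time. The noisy version is the one that behaves like a slot machine.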

Why is this better than your weight going down by a predictable amount every day? Because if you know for sure what number you’re going to see on the scale tomorrow morning, you start to take that fact for granted, and by the time tomorrow rolls around there’s no jolt of endorphins when you actually do step on the scale. Of course, there would always be the incentive that if you don’t stick to your diet you won’t see that pleasurable decrease on the scale. But in general, the promise of reward is a bigger motivator than the threat of punishment – especially because, if you know for sure you aren’t going to see a lower number on the scale, you may just avoid the scale altogether.

The motivational power of variable rewards may also be the key to explaining a more modern mystery: why I can’t seem to stop checking my damn email every few minutes. Dan Ariely, author of Predictably Irrational, explains:

“If you think about it, e-mail is very much like trying to get the pellet rewards. Most of it is junk and the equivalent to pulling the lever and getting nothing in return, but every so often we receive a message that we really want. Maybe it contains good news about a job, a bit of gossip, a note from someone we haven’t heard from in a long time, or some important piece of information. We are so happy to receive the unexpected e-mail (pellet) that we become addicted to checking, hoping for more such surprises. We just keep pressing that lever, over and over again, until we get our reward.”

8 Responses to Pulling levers, killing monsters: the lure of unpredictable rewards

  1. Brad B says:

    Interesting post. I like how you applied it to dieting and email since I also step on the scale every morning, and I also am constantly checking my email. It’s funny because I looked up and saw my Gmail inbox number had increased by one right as I was reading the last paragraph and got to experience the jolt, a phenomenon I had just read about 2 seconds earlier. I wonder if it’s common in mammals to hold out checking for the reward when a situation presents the possibility of one, for the purpose of prolonging the good feeling in case there isn’t one? I do that sometimes.

  2. Cory Albrecht says:

    The easiest way, I found, to stop obsessing over my email is to get a job where I have to process dozens of emails from people and a few hundred email alerts from automated systems on a daily basis. I no longer check my personal email on my phone all the time and even ignore the email bleep. 🙂

    Though it merely seems that the time I spend obsessing about Twitter and Facebook alerts has expanded to take up the slack. I’m not sure that having the Borg hail sound effect from ST:TNG for those two is ironic. 🙂

    What I wonder, though, is it possible to use some kind of negative/anti-/reverse/inverse Skinner-esque reinforcement to break us from the habit? Or after you broke the habit and discontinued the service, would you eventually become reconditioned to obsessing as your email resumes arriving irregularly?

  3. Max says:

    Do Americans prefer basketball over soccer because they like more frequent rewards?

    Checking emails is different from pulling a lever because your emails arrive whether or not you check them, so the frequency with which you check emails determines the frequency of rewards (over the number of checks, not over time). If you check your email once a week, you’ll be sure to see a bunch of emails, and probably have folks ticked at you for not responding.

  4. Barry says:

    Interesting — I can already feel my “endolphins” jumping! To me, though, the most interesting question is how to use this information if we’re the game players, not the game designers. Once we know we’re being manipulated, why can’t we stop?

    • Naomi Most says:

      > Once we know we’re being manipulated, why can’t we stop?

      There are two answers here.

      1) You can’t because these games hack at the very framework of human code.

      2) You can if you decide to become a self-hacker.

      For my part, I’ve found that simply deciding not to take any pleasure from the proffered rewards, and then choosing to associate a much greater sense of pleasure from going and doing something else, does the trick. Visualization helps: just enact the scene of achieving the game goal in your head and then imagine you are really, bloody disappointed in what you got for it.

      Listening to Alan Watts also helps.
