On the Randomness, or Lack Thereof, of a Baseball Linescore

Last night, the Texas Rangers beat the Baltimore Orioles by a score of 30 to 3. In a baseball game. The last major league baseball team to score 30 or more runs in a game was the Chicago Colts, in 1897.

If you had to guess when the Rangers scored their runs over 9 innings (the game was in Baltimore, so Texas batted in the top of the 9th), how would you distribute the runs? If I had to do it, my linescore would probably look about like this:

Inning   1   2   3   4   5   6   7   8   9
Runs     4   3   1   0   5   6   3   5   3

But here is the actual linescore:

Inning   1   2   3   4   5   6   7   8   9
Runs     0   0   0   5   0   9   0  10   6

The Rangers scored all 30 runs in just 4 innings! It’s a good reminder, once again, that the way data plays out in real life is often nowhere near as orderly, predictable, or consistent as you might imagine it to be. Even though runs scored per inning aren’t quite randomly distributed, this linescore did call to mind the common exercise of predicting coin flips. Levitt wrote about this topic a while ago: if you ask most people to write down how a sequence of 100 coin flips would come out, their sequences would rarely include long streaks of heads or tails, even though real sequences usually do. Their answers, in other words, would end up just as fake as my imagined linescore above.
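If you want to check the coin-flip claim for yourself, here is a rough Python sketch. It is only an illustration, not anything from Levitt's original post: the function names, the 10,000 trials, the fixed seed, and the "streak of 6 or more" cutoff are all arbitrary choices made for this example. It flips a simulated fair coin 100 times, records the longest streak of identical outcomes, and repeats.

```python
import random


def longest_streak(flips):
    """Return the length of the longest run of identical consecutive values."""
    if not flips:
        return 0
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        # Extend the current run if the outcome repeats, otherwise start over.
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best


def simulate(trials=10_000, n_flips=100, seed=0):
    """Simulate `trials` sequences of fair coin flips; return each longest streak."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    streaks = []
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        streaks.append(longest_streak(flips))
    return streaks


if __name__ == "__main__":
    streaks = simulate()
    average = sum(streaks) / len(streaks)
    share = sum(s >= 6 for s in streaks) / len(streaks)
    print(f"average longest streak: {average:.2f}")
    print(f"share of sequences with a streak of 6 or more: {share:.1%}")
```

Compare the output with a sequence you write down by hand. Most hand-written sequences top out at a streak of three or four, which is the same instinct that made my imagined linescore look so tidy.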

COMMENTS: 104

  1. Bill Effros says:

    Computers have random number generators used to approximate randomness. They don’t work very well, and never have. That’s why the iPod plays the same songs over and over, and never plays others.

    But it does demonstrate the difficulty of simulating or predicting randomness.

    If you program an iPod to generate baseball linescores, it will probably never generate one like the one last night.

  3. James says:

    Aside from the normal factors in clustering of runs, one has to figure that the Baltimore manager essentially gave up in the 6th inning. Winning the game was extremely unlikely, and he was not going to use a limited resource (quality pitching) to support an almost-certain losing cause.

    I would expect there was also reduced effort on defense by the fielders — would you risk injury in a game you’re going to lose?

  5. josh says:

    The other interesting baseball-related fact about the game is that about half of those 30 runs came from the 7, 8, and 9 hitters, typically the least productive batters in the lineup.

  7. jonathan says:

    I think the box score is more interesting . . . because Steve looks at the 30 and imagines that such a large number must be broken up, since the odds of scoring 9 runs and 10 runs are very low, while others look at a regular box score, notice that it clumps, and then enlarge those clumps to get to 30. The two perspectives that normally go together don’t. If you look at the box score, the Baltimore manager kept his pitchers in to take a beating. Why? Perhaps the answer is twofold: this was the first game of a doubleheader, so he couldn’t burn out all his pitchers in this game, and he was named the manager for next year that same day, so he knew he had job security despite a truly embarrassing loss.
