Usain Bolt: It’s Just Not Normal

Usain Bolt’s wonderful run in the Olympic 200-meter sprint reminds us that the normal distribution, the familiar bell curve beloved by economists and statisticians, can be wildly inappropriate when analyzing extremely selected samples.

This morning’s New York Times shows Usain Bolt’s new world record, relative to the 250 greatest 200-meter sprints ever. Not only does this not look like a normal distribution, it doesn’t even look like the tail of any standard distribution I’ve ever seen:

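[Chart: Usain Bolt’s new world record relative to the 250 greatest 200-meter sprints ever. Source: The New York Times.]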

The full graphic, as a story board, is available here. (It is a beautiful example of using statistics to tell a story.) It should be clear from this chart why few thought that the previous world record would be broken anytime soon. (An interesting aside: This graphic shows that it is only a fairly recent phenomenon that the 200-meter typically yields a faster average speed than the 100-meter sprint.)

Extreme outliers aren’t that unusual in sports. The greatest outlier may well be Australian cricketer Donald Bradman, whose career batting average of 99.94 puts him so far ahead of any other cricketer that it defies comprehension. (Trivia note: Bradman played the piano at my grandmother’s wedding.) Here is a histogram of career batting averages conditional on being among the top 100 (among those with at least 20 innings):

[Histogram: career batting averages of the top 100 cricketers with at least 20 innings.]

Some argue that Joe DiMaggio’s 56-game hitting streak is pretty extraordinary. So I put together a histogram of the great hitting streaks (those longer than 30 games). DiMaggio is okay, but he’s no Don Bradman.

[Histogram: the great hitting streaks, those longer than 30 games.]

The key to all of these strange distributions is that we are focusing on the extreme tails of highly selected samples, where the usual statistical patterns rarely hold. These situations are highly atypical but, equally, incredibly interesting when thinking about the very greatest. (I’ve never understood the urge to call these “black swans,” given that black swans are actually fairly common birds if you know where to look.)
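To make the point about extreme tails concrete, here is a minimal simulation sketch. It assumes a purely hypothetical population of 200,000 "performances" drawn from a standard normal distribution (none of this is real sprint data) and keeps only the best 250, echoing the 250 greatest sprints in the chart. The population size, the seed, and the helper name `skewness` are my own illustrative choices. Even though the full population is a textbook bell curve, the selected sample is strongly lopsided.

```python
import heapq
import random


def skewness(values):
    """Sample skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 200,000 performances from a standard normal.
population = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

# Extreme selection: keep only the 250 best values.
top_250 = heapq.nlargest(250, population)

skew_all = skewness(population)
skew_top = skewness(top_250)

print(f"skewness, full population: {skew_all:+.3f}")  # close to zero (symmetric)
print(f"skewness, top 250 only:    {skew_top:+.3f}")  # clearly positive (lopsided)
```

This only shows that the tail of a bell curve is not itself a bell curve. It says nothing about which distribution, if any, would fit the real sprint times.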

Those interested in how things change in extremely selected samples may enjoy Tim Groseclose’s paper, “Extreme Sample Selection Bias: Conditions That Cause the Correlation Between Two Variables to Switch Signs.” Groseclose claims that this extreme sample selection can explain why nonmillionaire members of Congress win re-election more often than millionaires; why it shouldn’t be surprising that the greatest golfer is multiracial, even though most top golfers are white; and why high S.A.T.’s may actually predict lower subsequent incomes among those attending elite universities.
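For a feel of how a correlation can switch signs under extreme selection, here is a rough sketch. It is not Groseclose's model: it assumes one simple mechanism of my own choosing, in which two traits are positively correlated in the population (the value `RHO = 0.3` is arbitrary) and only the 250 cases with the highest combined score `x + y` are kept. The helper `pearson` and all the numbers are illustrative assumptions.

```python
import heapq
import math
import random


def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
RHO = 0.3  # assumed population correlation between the two traits

# Hypothetical population of 200,000 (x, y) pairs with correlation RHO.
population = []
for _ in range(200_000):
    x = rng.gauss(0.0, 1.0)
    y = RHO * x + math.sqrt(1.0 - RHO ** 2) * rng.gauss(0.0, 1.0)
    population.append((x, y))

# Extreme selection on the combined score x + y (an assumed selection rule).
selected = heapq.nlargest(250, population, key=lambda p: p[0] + p[1])

corr_all = pearson(population)
corr_selected = pearson(selected)

print(f"correlation, full population: {corr_all:+.2f}")       # positive
print(f"correlation, selected 250:    {corr_selected:+.2f}")  # negative
```

The intuition is that everyone who clears a very high combined bar sits close to it, so within the selected group more of one trait tends to mean less of the other.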

COMMENTS: 79

  1. Sherman Sims says:

    I’m not buying the comparison of the sprint graph outliers to the graphs of the batting outliers. The batting outliers are more of a series of unlikely events and are influenced by many factors. A comparison to home run distance would be more appropriate and shows a direct correlation to improvements in training, human size and possibly steroids.

  2. Mike says:

    On a tangent, if the 200m now has the fastest average speed, why isn’t the winner of this event now called “World’s Fastest Man/Woman”?

  3. TomTom says:

    The question this raises is whether Bolt and Michael Johnson are using the same supplier for the medications, and Carl Lewis, Ben Johnson and the rest who “dominated” this noble sport in the last 30 or so years were at a physiological disadvantage, or merely had worse access to meds.

  4. Tom Tom says:

    Oh, and I don’t believe it is appropriate to compare Sir Don’s remarkable achievements with those of Usain Bolt or Michael Johnson.

  5. Frank Booth says:

    What I meant to say was that the article says “it is only a fairly recent phenomenon that the 200-meter typically yields a faster average speed than the 100-meter sprint.”

    It is not surprising that the sportscasters haven’t kept up with the statistical analysis.

    In any case, Bolt is the Olympic Champion and WR holder in the 100 and 200 now. He’s pretty much the undisputed fastest man alive.

  6. Black Political Analysis says:

    So, now that Bolt has broken Johnson’s record, how long before someone breaks Bolt’s? Johnson’s previous record is now such an outlier. In fact, the number of times at that end of the spectrum has now doubled. Only a few more and the histogram begins to look normal. Plus, what would happen if you just looked at the top times of the last year?
    http://blackpoliticalanalysis.com

  7. mmm says:

    I’d imagine that the fastest average speed has a lot to do with the fact that it takes a good percentage of the race in a 100m to actually get “up to speed,” whereas you maintain a higher speed for longer in the longer race.

    I’d think that the race that advantages those with good acceleration and starting reaction time as well as good overall speed is the appropriate race for the “world’s fastest.”

  8. L Nettles says:

    Now if you took a sample from Bolt’s real hometown on Krypton, the distribution might be more normal.
