Outliers: the statistical mysteries that hold the key to understanding

Sometimes, when dealing with data sets, you may have a particular observation that doesn't fit. Maybe the data point is much higher (or lower) than all the other data. Or maybe it just doesn't fall into the pattern of everything else that you're seeing.

These anomalies are called outliers. An NFL player is an outlier. A kid who graduates from college at 14 years old is an outlier. The worst salesperson in your company, who only sold 1/3 of what the next worst salesperson sold? Also an outlier.

When you're looking at averages, you need to watch out for outliers, because – as you'll see – their effect on averages can be dramatic. It's like adding cream to your black coffee. It's still 95 percent coffee – but a few drops can change the appearance dramatically.
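To see how far a single value can pull an average, here is a minimal sketch in Python. The sales figures are made up for illustration; they are not from any real data set.

```python
from statistics import mean

# Hypothetical weekly sales (units sold) for five salespeople.
typical_sales = [48, 50, 52, 49, 51]

# The same team plus one hypothetical extreme performer.
with_outlier = typical_sales + [400]

mean_before = mean(typical_sales)
mean_after = mean(with_outlier)

print(f"mean without the outlier: {mean_before:.1f}")
print(f"mean with the outlier:    {mean_after:.1f}")
```

One extra value out of six more than doubles the average, even though five of the six salespeople sold about 50 units each.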

The tricky part is that there aren't really any hard and fast rules when it comes to identifying outliers. Some economists say an outlier is anything that's a certain distance away from the mean, but in practice it's fairly subjective and open to interpretation.^I That's why statisticians spend so much time looking at data on a case-by-case basis to determine what is – and isn't – an outlier.
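The "certain distance from the mean" idea can be sketched in a few lines of Python. This is only one possible rule of thumb, not a definitive test: the function name `flag_outliers`, the default threshold of three standard deviations, and the readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def flag_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        # Every value is identical, so nothing can stand out.
        return []
    return [v for v in values if abs(v - center) > threshold * spread]

# Hypothetical readings: nineteen values near 50 and one far away.
readings = [50, 51, 49, 52, 48, 50, 51, 49, 50, 52,
            48, 50, 51, 49, 50, 50, 51, 49, 50, 120]

flagged = flag_outliers(readings)
print(flagged)
```

Note that the extreme value itself inflates both the mean and the standard deviation it is being measured against. In a sample of ten or fewer points, no single value can ever land more than three sample standard deviations from the mean, which is one reason a fixed cutoff cannot replace looking at the data case by case.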

So, what causes an outlier? Sometimes, it's simply a mistake. Maybe someone entering data on a spreadsheet transposed a few numbers, and typed in 4.9 instead of 9.4. Perhaps a test tube was contaminated, which is why it shows a much higher level of bacteria than normal. Mistakes happen.

Sometimes an outlier is a red flag for something abnormal. When Mark McGwire hit 70 home runs for the St. Louis Cardinals in 1998, it seemed out of the ordinary. And for anyone not using steroids, it was. More than a decade later, McGwire admitted using drugs during his record-setting season, confirming the suspicions of fans and statisticians alike.

Finally, as you read (or watch or listen to) the news, keep in mind that many stories are newsworthy precisely because they're about outliers. The same old, same old isn't always as exciting as something that's (far) out of the ordinary.

If you pay attention to the Olympics, you may already be familiar with one way in which people try to handle outliers – by simply eliminating them. In diving, gymnastics and other sports, for example, an athlete's score may be calculated by taking all of the judges' scores for an event, dismissing the highest and lowest scores, and then calculating the mean.

This tactic – known as mean trimming – can help avoid having a judge's bias or personal preference affect the outcome. And it's possible that mean trimming could have affected the medal standings in at least one event, according to a paper that looked at the diving scores from the 2000 Olympics.^II

But does mean trimming – this specific method of dealing with potential outliers – work? Ask yourself, what would happen if more than one judge were biased in favor of an athlete? The Olympic system – as it's commonly used – only eliminates the one highest and one lowest value. Or consider the fact that mean trimming treats the highest and lowest values as if they're outliers, whether or not they truly are. Is this a fair system?
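The drop-the-highest-and-lowest calculation, and its blind spot when two judges lean the same way, can be sketched as follows. The function name `trimmed_mean` and the scores are hypothetical, and real scoring systems may add steps (such as difficulty multipliers) that this sketch ignores.

```python
def trimmed_mean(scores):
    """Drop the single highest and single lowest score, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to drop the highest and lowest")
    kept = sorted(scores)[1:-1]
    return sum(kept) / len(kept)

# Hypothetical panel of seven judges, one of whom gives an inflated 10.0.
one_high_score = [8.0, 8.5, 8.5, 8.5, 8.5, 8.0, 10.0]

# Same panel, but a second judge also gives an inflated 10.0.
two_high_scores = [8.0, 8.5, 8.5, 8.5, 8.0, 10.0, 10.0]

trimmed_one = trimmed_mean(one_high_score)
trimmed_two = trimmed_mean(two_high_scores)

print(f"one inflated score:  {trimmed_one:.2f}")
print(f"two inflated scores: {trimmed_two:.2f}")
```

With one inflated score, trimming removes it entirely. With two, only one of them is dropped and the other still lifts the result.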

And then there's the question of whether a high or low score – whether it's an outlier or not – is actually a sign of bias. Yes, nationalistic bias may exist – researchers have found that "most judges gave some type of nationalistic bump to their countrymen without giving a similar bump to non-countrymen."^III But consider the Chinese diving judge. His average score for Chinese divers at the 2000 Olympics was 1.48 points higher than his average score for non-Chinese divers. Seems like a bias, right? But the Chinese judge, it turns out, was "apparently the least biased judge" because he scored both Chinese and non-Chinese divers higher than the other judges, on average. Does it make sense to discard his scores in this scenario?

Of course, sometimes outliers aren't mistakes or red flags – they're a perfectly valid part of the data. Consider American history. If you look at how many days each U.S. President served in office, you'll see that most of them served either for 1,460 days or 2,921 days (plus or minus a day), which corresponds to 4-year and 8-year terms, respectively. But 44 percent of our presidents served for shorter or longer periods of time, making them outliers according to statistician Robert W. Hayden, Ph.D., who performed the analysis.^IV Every time a president died in office (therefore not completing the rest of his term) he became an outlier – as did the person who replaced him.

So what do you do with outliers? Do you treat them equally, include them with the rest of the data, and have them skew your average? Do you completely ignore them? Is there a middle ground?

It depends. There are no blanket rules, because each case is different, and it's not always easy to identify an outlier. For example, some parents might think their toddler is an outlier because she's in the 35th percentile for height. Other parents might not care unless their kid was in the 5th percentile. After all, when you're looking at averages, you're going to have some values above average – and some below.^V

The bottom line is you need to look at the data, and see how much of an impact the outlier has on the question you're trying to answer.

Which leads us to Conwood.

It was the largest verdict in the history of antitrust law. $1.05 billion. And it all hinged on outlier data.

Conwood Company – a tobacco manufacturer – was suing another tobacco manufacturer (U.S. Tobacco Company) for hindering Conwood's growth.^VI The data expert for Conwood performed a state-by-state analysis to show the alleged impact of U.S. Tobacco's activities on Conwood's market share.^VII

The problem was that the analysis included Washington, D.C. – an extremely small market, relatively speaking – which meant that even small changes in the amount of product sold by Conwood (perhaps getting stocked in just a few stores) translated into large differences in market share.
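A quick bit of arithmetic shows why a small market is so jumpy. The numbers below are invented for illustration and are not from the case; the sketch also assumes, for simplicity, that the total market size stays fixed when one brand sells more.

```python
def share_change_points(brand_units, market_units, extra_units):
    """Change in market share, in percentage points, if the brand sells
    `extra_units` more while the total market size stays the same."""
    before = brand_units / market_units
    after = (brand_units + extra_units) / market_units
    return (after - before) * 100

# Hypothetical small market: 2,000 units total, brand starts at 10 percent.
small_market = share_change_points(200, 2_000, 100)

# Hypothetical large market: 2,000,000 units total, brand also starts at 10 percent.
large_market = share_change_points(200_000, 2_000_000, 100)

print(f"small market: {small_market:+.3f} percentage points")
print(f"large market: {large_market:+.3f} percentage points")
```

The same 100 extra units move the share by 5 percentage points in the small market and by a barely visible 0.005 points in the large one.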

When the data was analyzed, it was clear that D.C. didn't act like the 48 states that were measured (Alaska and Hawaii were excluded). It was, as antitrust professor Herbert Hovenkamp called it, a "significant outlier."^VIII But rather than omit the outlier data, the expert included it, which skewed the results and produced a conclusion that the rest of the data didn't support. As Hovenkamp said, "the plaintiff's expert had ignored a clear 'outlier' in the data."^IX

If that outlier data had been excluded – as it arguably should have been – then the results would have shown a clear increase in market share for Conwood. Instead, the conclusion – driven by an extreme observation – showed a decrease.

If your conclusions change dramatically by excluding a data point, then that data point is a strong candidate to be an outlier. In a good statistical model, you would expect that you can drop a data point without seeing a substantive difference in the results. Something to think about when looking for outliers.
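One way to run that check is to drop each data point in turn and see whether the answer flips. The sketch below does this for a simple average. It is not the analysis used in the Conwood case; the function name `leave_one_out_means` and the numbers are hypothetical.

```python
from statistics import mean

def leave_one_out_means(values):
    """For each position, return the mean of all the other values."""
    return [mean(values[:i] + values[i + 1:]) for i in range(len(values))]

# Hypothetical changes in market share (percentage points) across six regions.
changes = [0.4, 0.6, 0.5, 0.3, 0.7, -6.0]

overall = mean(changes)
without_each = leave_one_out_means(changes)

# A point is worth a closer look if removing it flips the sign of the average.
sign_flippers = [
    value
    for value, remaining_mean in zip(changes, without_each)
    if (remaining_mean > 0) != (overall > 0)
]

print(f"overall mean: {overall:+.2f}")
print(f"points whose removal flips the sign: {sign_flippers}")
```

Here the overall average is negative, yet removing the single value of -6.0 makes it positive, while removing any other value leaves the conclusion unchanged. That does not prove the point should be discarded, only that it deserves scrutiny.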

I: For example, some statisticians and economists look for three or four standard deviations (which is a statistical measure of how spread out the data is) as an indicator of an outlier.

II: John W. Emerson and Silas Meredith, "Nationalistic Judging Bias in the 2000 Olympic Diving Competition," August 22, 2010. The specific event in which the outcome may have changed was women's 10-meter platform, which the authors explored in: John W. Emerson, Miki Seltzer, and David Lin, "Assessing Judging Bias: An Example From the 2000 Olympic Games," The American Statistician 63, no. 2 (2009): 124-131.

III: Emerson and Meredith, Nationalistic Judging.

IV: Robert W. Hayden, "A Dataset that is 44 percent Outliers," Journal of Statistics Education 13, no. 1 (2005).

V: An exception being if every value in the data set is identical.

VI: Conwood Company was purchased by Reynolds American Inc. and changed its name to American Snuff Company, LLC, effective January 1, 2010.

VII: You can read more about the case here: Benjamin Klein and Joshua D. Wright, "Antitrust Analysis of Category Management: Conwood v. United States Tobacco," November 10, 2006.

VIII: Herbert Hovenkamp, The Antitrust Enterprise: Principle and Execution (Harvard University Press, 2008), page 81.

IX: Hovenkamp, page 81.

Reprinted by permission of bibliomotion books + media. Excerpted from Everydata: The Misinformation Hidden in the Little Data You Consume Every Day. Copyright 2016 by John H. Johnson, PhD and Mike Gluck. All rights reserved.