The Tails

A number of facts about data that now seem obvious to me were anything but obvious when I started. Most of the statistics I learned at school dealt with conditional means and, sometimes, variances. The focus on the first two moments makes sense for bell-shaped distributions such as the normal, which those two moments describe completely. It is also a lot easier to prove limit theorems about means, so their properties are well documented. Unfortunately, the vast majority of real data that I encounter are anything but bell-shaped.

Instead, most variables follow a very different pattern. There is almost always a mass point at zero, a relatively well-behaved lognormal-shaped piece over the positive values, and a very long, uniform-shaped tail. Sometimes the tail is so long that the mean lies above the 90th percentile. The outliers also inflate the variance, so most standard inference procedures tell a very misleading story about what is going on in the data.
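
To make the shape concrete, here is a minimal Python sketch that simulates such a variable and compares its mean with its median and 90th percentile. The mixture weights (30% zeros, 2% tail), the lognormal parameters, the tail range, and the function names are all made-up assumptions for illustration, not numbers from any real dataset.

```python
import random
import statistics


def simulate_values(n=100_000, p_zero=0.30, p_tail=0.02, seed=0):
    """Draw n values from a hypothetical three-part mixture.

    All parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        u = rng.random()
        if u < p_zero:
            # Mass point at zero.
            values.append(0.0)
        elif u < p_zero + p_tail:
            # Very long, roughly flat tail.
            values.append(rng.uniform(100.0, 10_000.0))
        else:
            # Well-behaved lognormal-shaped body.
            values.append(rng.lognormvariate(0.0, 1.0))
    return values


def percentile(sorted_values, q):
    """Simple rank-based percentile of an already sorted list, q in [0, 100]."""
    idx = min(len(sorted_values) - 1, int(q / 100 * len(sorted_values)))
    return sorted_values[idx]


values = sorted(simulate_values())
mean = statistics.fmean(values)
median = statistics.median(values)
p90 = percentile(values, 90)
stdev = statistics.pstdev(values)

print(f"median={median:.2f}  p90={p90:.2f}  mean={mean:.2f}  stdev={stdev:.2f}")
```

Under these assumed parameters only 2% of the observations sit in the tail, yet they are enough to pull the mean well above the 90th percentile and to make the standard deviation several times larger than the mean. The median and the 90th percentile barely notice the tail at all.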

In addition, the tails of the distribution are where the most interesting things happen. If we are looking at model errors, the tails are the cases where the model fits worst. If we are looking at sales, the tails are the bestsellers. The most important users also sit in the tails of the distribution. So not only do the tails mess up the usual inference procedures, they are also of primary interest in themselves.