Disclaimer
The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by any organizations with which I might be affiliated.
The Tails
A number of facts about data that now seem obvious were anything but obvious to me when I started. Most of the statistics I learned at school dealt with conditional means and, sometimes, variances. The focus on the first two moments makes sense for bell-shaped distributions such as the normal. It is also a lot easier to prove limit theorems about means, so their properties are well documented. Unfortunately, the vast majority of real data that I encounter are anything but bell-shaped. Instead, most variables follow a very different distribution: there is almost always a mass point at zero, a relatively well-behaved lognormal-shaped piece with positive values, and a very long uniform-shaped tail. Sometimes the tail is so long that the mean lies above the 90th percentile. The outliers also inflate the variance, so most standard inference procedures will tell a very misleading story about what is going on in the data.

In addition, the tails of the distribution are where the most interesting things happen. If we are looking at model errors, the tails are the cases where the model fit is worst. If we are looking at sales, the tails are the bestsellers. The most important users are also in the tails of the distribution. So not only do the tails mess up the usual inference procedures, they are also of primary interest in themselves.
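
To make the shape described above concrete, here is a minimal simulation sketch. The mixture weights (30% zeros, 65% lognormal, 5% uniform tail) and the tail range of 0 to 10,000 are illustrative assumptions of mine, not estimates from any real data set, and the helper names `simulate_heavy_tailed` and `percentile` are hypothetical. The point is only to show how a small uniform tail can pull the mean above the 90th percentile and inflate the standard deviation.

```python
import math
import random
import statistics


def simulate_heavy_tailed(n, seed=0):
    """Draw n values from an assumed three-part mixture:
    a mass point at zero, a lognormal body, and a long uniform tail."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        u = rng.random()
        if u < 0.30:
            # Mass point at zero (assumed weight: 30%).
            values.append(0.0)
        elif u < 0.95:
            # Well-behaved lognormal piece (assumed weight: 65%).
            values.append(rng.lognormvariate(0.0, 1.0))
        else:
            # Very long uniform tail (assumed weight: 5%).
            values.append(rng.uniform(0.0, 10_000.0))
    return values


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, p in (0, 100]."""
    k = max(0, math.ceil(p / 100 * len(sorted_values)) - 1)
    return sorted_values[k]


data = simulate_heavy_tailed(50_000)
data_sorted = sorted(data)

summary = {
    "mean": statistics.fmean(data),
    "stdev": statistics.pstdev(data),
    "median": percentile(data_sorted, 50),
    "p90": percentile(data_sorted, 90),
    "p99": percentile(data_sorted, 99),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:12.2f}")
```

Under these assumed parameters the mean is driven almost entirely by the 5% of observations in the tail, while the median and the 90th percentile are determined by the zeros and the lognormal piece. Reporting a few quantiles next to the mean is one simple way to see this, though it is only one of several reasonable choices.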

Contact Information
Konstantin Golyaev, Ph.D.
Office Phone:
Email: