Nowadays, research in many fields involves statistics. Whether it is a government deciding on measures to fight a pandemic or the Red Bull Formula 1 team deciding whether Verstappen should swap tyres in the next lap, it sounds like a good idea to back the decision with statistics. For a long time, the application of statistics has seemed one of the most irrefutable ways of proving a claim or, in other situations, of coming to the right conclusion. In essence, statistics seem, when employed properly, to tell the story as it is, but is this always the case?
I assume that most readers know the basics of statistics, but I will give a very blunt illustration of hypothesis testing nonetheless.
Imagine a machine that fills bags of Skittles automatically. The Skittles company claims that the machine is programmed to put 20 Skittles in each bag, but it also states that the amounts may deviate a bit from time to time. Basic hypothesis testing goes as follows: acknowledging that the machine cannot fill every bag with the exact desired amount, we still know that the amounts should lie somewhere near the average. If we collect a sufficiently large number of bags and calculate the average number of Skittles per bag, we can then state with some degree of certainty whether or not the machine indeed fills bags with an average of 20 Skittles.
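A minimal sketch of such a test in Python, assuming the counts are roughly normally distributed and using a one-sample t-test; the sample size, spread and the simulated data are made up for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Hypothetical sample: counts of Skittles in 50 bags.
# Here we simulate a machine that really does average 20 with some noise.
counts = rng.normal(loc=20, scale=1.5, size=50).round()

# One-sample t-test of H0: the true mean count equals 20.
t_stat, p_value = stats.ttest_1samp(counts, popmean=20)

print(f"sample mean = {counts.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A large p-value gives no evidence against the claim of 20 per bag;
# a small one (say below 0.05) would make us doubt it.
```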
Obviously, the choice of whether or not to reject a hypothesis depends on the kind of test, the distribution that the test statistic is supposed to follow, and many other factors. There are hundreds of tests and test statistics that statisticians have come up with, and they all suit different situations. When a test is used under the wrong conditions, or when erroneous assumptions are made, the corresponding conclusion can be false. In the world of statistics it is therefore essential to check your own model and assumptions, and those of others. After all, a company or individual may benefit greatly from a specific research outcome.
Enough reason not to believe a claim straight away, even if it is said to be based on statistics. One phenomenon in particular, however, is often overlooked.
Reversion to mediocrity
When you are a middling golf player, it is still possible to score a lucky hole-in-one across a decently sized fairway. Trying to reproduce that perfect shot, however, will likely turn out to be extremely challenging, or even impossible. This has everything to do with reversion to mediocrity, also called regression towards the mean. Your beautiful shot involved a number of chance events that all fell in your favour; had you hit the ball from a slightly different angle, on a different spot, or with a slightly different force, it would presumably not have been a hole-in-one. Your next shot will likely not be as good, and will be more comparable to your usual performance.
Another example is the following: imagine 30 students taking a multiple choice test, where each student answers every question at random. The average score turns out to be 4.8 points; some students scored better, some worse. What happens, however, when we take as a new sample the students who scored the fewest points? If they answer all questions at random again, the average of this group will likely be much better than the first time! Have they become better? Obviously not, as they selected their answers at random! This phenomenon is called regression towards the mean.
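This is easy to check with a small simulation. The sketch below assumes, purely for illustration, a test of 24 questions with five options each, so a pure guesser expects 24 × 0.2 = 4.8 points, matching the average mentioned above:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

n_students, n_questions, p_correct = 30, 24, 1 / 5  # assumed: 24 questions, 5 options

# First test: every answer is a pure guess.
first = rng.binomial(n_questions, p_correct, size=n_students)

# Select the bottom third of scorers and let them guess again.
worst = np.argsort(first)[: n_students // 3]
second = rng.binomial(n_questions, p_correct, size=worst.size)

print(f"overall mean, first test: {first.mean():.2f}")
print(f"worst group, first test:  {first[worst].mean():.2f}")
print(f"same group, second test:  {second.mean():.2f}")
# The 'worst' group bounces back towards the expected score of 4.8,
# even though nobody learned anything: regression towards the mean.
```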
For a completely random test this result seems quite obvious, but in many cases it is not as clear. More often, observations do not end up in the so-called tails of the distribution by mere chance alone, but also because of actual underlying causes. The result can therefore be (partly) explained by a rationale, yet the conclusions can be exaggerated. This is also how regression to the mean can be exploited.
Exploiting the phenomenon
There are many ways in which the occurrence of regression to the mean can be exploited; it affects the results, after all. Let's look at an example:
Consider a study involving a drug for heart patients. Assume that, out of the whole population, people with bad heart conditions are selected: people with high cholesterol, high blood pressure and so on. After three months of taking the drug, the average person in this sample has fewer heart problems than at the first inspection! One can imagine that this change cannot be attributed to the drug alone. Just like other conditions or diseases, heart condition symptoms have a natural ebb and flow. A person could also simply have had a bad day, or his or her recent diet could have affected the first measurement. On top of that, the measurement itself carries at least some amount of uncertainty. Regression towards the mean tells us that the group of people with the worst conditions will, on average, likely be in better shape at a later point in time.
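The same effect can be reproduced without any drug at all. The sketch below is a hypothetical model, with made-up numbers: a stable underlying severity plus day-to-day noise, where we simply re-measure the 10% of people who looked worst the first time:

```python
import numpy as np

rng = np.random.default_rng(seed=3)

n = 10_000
true_severity = rng.normal(loc=0, scale=1, size=n)  # stable underlying heart condition
noise_sd = 0.8                                      # day-to-day fluctuation + measurement error

first = true_severity + rng.normal(0, noise_sd, size=n)
second = true_severity + rng.normal(0, noise_sd, size=n)  # no drug effect at all

# "Enroll" the 10% who look worst at the first inspection.
enrolled = first > np.quantile(first, 0.9)

print(f"enrolled group, first measurement:  {first[enrolled].mean():.2f}")
print(f"enrolled group, second measurement: {second[enrolled].mean():.2f}")
# The second measurement is noticeably lower (less severe) on average,
# even though nothing about the patients has changed.
```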
Control groups
Some studies make use of a control group, to make sure that the changes in condition cannot be attributed to external factors or a placebo effect. People in the control group are treated and observed in the same way, but the drug they take is a placebo.
Even then, however, the result cannot be trusted if the control group is just a sample from the total population, as regression towards the mean will generally not affect such a sample.
It is therefore necessary that the control group consists of a sample drawn from people with exactly the same conditions as the treatment group!
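Extending the earlier hypothetical simulation makes the point concrete: if the same extreme group is split at random into treatment and control, both halves regress towards the mean by roughly the same amount, and only the difference between them reflects the drug. The drug effect and noise levels below are again made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(seed=4)

n = 10_000
true_severity = rng.normal(size=n)                    # stable underlying condition
first = true_severity + rng.normal(scale=0.8, size=n)

# Enroll the 10% who look worst at the first inspection...
enrolled = np.where(first > np.quantile(first, 0.9))[0]
# ...and split that *same* extreme group randomly into treatment and control.
rng.shuffle(enrolled)
half = len(enrolled) // 2
treat, control = enrolled[:half], enrolled[half:]

drug_effect = -0.3                                    # assumed true benefit of the drug
second = true_severity + rng.normal(scale=0.8, size=n)
second[treat] += drug_effect

print(f"treatment group change: {second[treat].mean() - first[treat].mean():+.2f}")
print(f"control group change:   {second[control].mean() - first[control].mean():+.2f}")
# Both groups regress towards the mean by roughly the same amount; only the
# difference between the two changes can be attributed to the drug itself.
```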
This article was written by: Pieter Dilg