The Regression Fallacy

Alias: The Regressive Fallacy

Type: Non Causa Pro Causa

Example:

KUALA LUMPUR: Prime Minister Datuk Seri Dr Mahathir Mohamad congratulated Malaysian shuttler Mohd Hafiz Hashim for his achievement but warned that he should not be "spoilt" with gifts like previous champions.

"Very good and congratulations, but now I would like to request everybody not to spoil him," he said when asked to comment on Hafiz's victory in the men's singles final of the All-England Badminton Championships on Sunday.

Dr Mahathir said people should remember what had happened to previous champions when they were spoilt with gifts of land, money and other items.

"I hope the states will not start giving acres of land and money in the millions, because they all seem not to be able to play badminton after that," he said after taking part in the last dry run and dress rehearsal for the 13th NAM Summit at the PWTC yesterday.

Source: "Mahathir asks states not to 'spoil' Hafiz", The Star Online, 2/18/2003

Exposition:

The Regression Fallacy is the result of a statistical phenomenon known as "regression to the mean". The "mean" refers to the arithmetical average of some variable in a population, that is, the "mean" is what we usually mean by "average". "Regression" refers to the value of the variable tending to move closer to the mean, away from extreme values. So, "regression to the mean" refers to the tendency of a variable characteristic in a population to move away from the extreme values towards the average value.

Consider a sample taken from a population. The value of a variable in that sample will lie some distance from the population mean. For instance, we could take a sample of people (it could be just one person), measure their heights, and then determine the average height of the sample. This value will be some distance away from the average height of the entire population, though the distance might be zero.

Suppose, further, that we take a second sample of the population. If the value for the first sample is an extreme one—that is, far away from the mean—then it is likely that the value of the variable for the second sample will be closer to average than the first one. The farther away from the mean the first sample was, the more likely that the second will be closer to it. This is regression to the mean.
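The claim in the preceding paragraph can be checked with a short simulation. This is only a sketch under stated assumptions: each "sample" is a single person, heights are normally distributed with a mean of 170 cm and a standard deviation of 10 cm (illustrative numbers, not from the text), and the function name `fraction_closer` is hypothetical.

```python
import random

def fraction_closer(threshold_sd, n_pairs=100_000, seed=0):
    """Among pairs whose first draw lies more than threshold_sd standard
    deviations from the mean, return the share whose second draw is closer."""
    rng = random.Random(seed)
    mean, sd = 170.0, 10.0  # assumed height distribution in cm
    extreme = closer = 0
    for _ in range(n_pairs):
        first = rng.gauss(mean, sd)
        second = rng.gauss(mean, sd)  # drawn independently of the first
        if abs(first - mean) > threshold_sd * sd:
            extreme += 1
            if abs(second - mean) < abs(first - mean):
                closer += 1
    return closer / extreme

# The more extreme the first draw, the more often the second is closer to the mean.
for t in (0.5, 1.0, 2.0):
    print(f"first draw more than {t} SD out: second is closer in {fraction_closer(t):.0%} of pairs")
```

Nothing in the simulation links the two draws, so the second draw is not "pulled" toward the mean by the first; it is simply more likely to land nearer the mean than an extreme value did.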

For example, the children of tall parents tend to be tall themselves, but not as tall as their parents. The fact that the children tend to be taller than average is probably the result of genetics, but the fact that they tend not to be as tall as their parents is the result of regression to the mean.

The Regression Fallacy occurs when one mistakes regression to the mean, which is a statistical phenomenon, for a causal relationship. For example, if a tall father were to conclude that his tall wife committed adultery because their children were shorter, he would be committing the regression fallacy.
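The height example can be made quantitative under a simplifying assumption that the text itself does not make: suppose parent and child heights are jointly normal with the same mean $\mu$ and the same standard deviation, and have a correlation $\rho$ between 0 and 1. Then the expected height of a child whose parent has height $x$ is:

```latex
\[
  E[\text{child} \mid \text{parent} = x] \;=\; \mu + \rho\,(x - \mu),
  \qquad 0 < \rho < 1 .
\]
```

With purely illustrative numbers ($\mu = 175$ cm, $\rho = 0.5$, $x = 190$ cm), the expected child height is $175 + 0.5 \times 15 = 182.5$ cm: taller than average, but shorter than the parent, with no cause needed beyond the imperfect correlation.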

Exposure:

One of the most common occasions for the Regression Fallacy is illness. People are most likely to seek treatment for an illness—especially experimental treatment—when they are at their sickest, that is, their condition is an extreme one. They take a remedy, and then get better due to regression to the mean, but they attribute their regained health to the effect of the remedy. This is one reason why some people will swear by such bizarre treatments as drinking urine, or psychic surgery.

"It worked for me", they say, when all they really know is that they took the remedy and they got better. Due to regression to the mean, many people will get better no matter what treatment they take, even none at all. Some will die, luckily for the snake oil salesmen, since the dead won't be around to badmouth the snake oil that they took before dying.

Regression to the mean is one reason why it is difficult to determine whether a potential remedy is really effective; one cannot tell simply by taking it when ill.
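A small simulation illustrates why "it worked for me" is weak evidence, and why a comparison group is needed. It is a sketch under assumptions that are not in the text: symptom severity fluctuates randomly around a fixed level (normal, mean 5, standard deviation 2, on an arbitrary scale), people seek treatment only when severity exceeds 8, and the remedy is completely inert. The function name `improvement_rates` is hypothetical.

```python
import random

def improvement_rates(n_people=100_000, threshold=8.0, seed=1):
    """Return the share of sick people who feel better later, split by
    whether they took an inert remedy or nothing at all."""
    rng = random.Random(seed)
    improved = {"remedy": 0, "nothing": 0}
    treated = {"remedy": 0, "nothing": 0}
    for _ in range(n_people):
        before = rng.gauss(5.0, 2.0)   # symptom severity today (arbitrary scale)
        if before <= threshold:
            continue                   # not sick enough to seek treatment
        group = rng.choice(["remedy", "nothing"])  # random assignment
        after = rng.gauss(5.0, 2.0)    # later severity; the remedy has no effect
        treated[group] += 1
        improved[group] += after < before
    return {g: improved[g] / treated[g] for g in treated}

rates = improvement_rates()
print(rates)
```

In this model most people improve in both groups, and at about the same rate. Someone who took the remedy sees only their own recovery; it is the comparison with the untreated group that reveals the remedy did nothing.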

Reader Response:

Dana Brown makes the following objection to the Example:

I noticed something I disagree with: in the regression fallacy page, you give, as an example, a tennis player who wins tournaments and then doesn't win. However, the "regression to the mean" occurs because one assumes that luck was involved in the first observation being so far away from the mean. This is, strictly speaking, true in all circumstances―but the proportion of luck to skill involved in winning a tennis championship, I suspect, is greatly tipped towards skill. Thus, I think it improper to suggest that what "caused" the tennis players to not win again was solely regression to the mean, when the Malaysian prime minister may, in fact, have been right in believing that the winners got spoiled―logically, nothing favors one explanation over the other.

Regression to the mean will affect any variable that is at least partly random. So, even though skill is the primary factor in winning at badminton, the sport in the Example, as long as there is some degree of luck involved, regression to the mean will occur. The prime minister might be right about what is causing the champions' playing to deteriorate, but he gives no evidence that the regression is due to the players being spoiled, other than the fact that they regressed. That's what makes this an example of the fallacy: the prime minister concludes that the players are spoiled on no other evidence than that their play got worse, which is to be expected due to regression to the mean. This is not a claim about what "caused" the players' subsequent losses, since regression to the mean is a statistical, not a causal, phenomenon.
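The point about luck and skill can be illustrated with a sketch in which nobody is spoiled. The assumptions are mine, not the author's: each player's result in a season is a fixed skill plus independent luck, both normally distributed, and "champions" are the top 1% of season-one results. The function name `champion_scores` is hypothetical.

```python
import random

def champion_scores(luck_sd, skill_sd=10.0, n_players=50_000, seed=2):
    """Return the average season-1 and season-2 results of the players
    who finished in the top 1% in season 1."""
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        skill = rng.gauss(0.0, skill_sd)          # unchanged between seasons
        first = skill + rng.gauss(0.0, luck_sd)   # season 1 result
        second = skill + rng.gauss(0.0, luck_sd)  # season 2 result, fresh luck
        players.append((first, second))
    players.sort(reverse=True)                    # best season-1 results first
    top = players[: n_players // 100]
    first_avg = sum(f for f, _ in top) / len(top)
    second_avg = sum(s for _, s in top) / len(top)
    return first_avg, second_avg

for luck_sd in (2.0, 10.0):  # mostly skill vs. equal parts skill and luck
    first_avg, second_avg = champion_scores(luck_sd)
    print(f"luck_sd={luck_sd}: season 1 avg {first_avg:.1f}, season 2 avg {second_avg:.1f}")
```

Even when luck is a small part of the result, the champions' average falls in the second season, though it stays far above the overall average of zero. The smaller the role of luck, the smaller the fall, but it does not disappear.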

