Unrepresentative Sample

Alias: Biased Sample

Type: Weak Analogy


N% of sample S has characteristic C.
(Where S is a sample unrepresentative of the population P.)
Therefore, N% of population P has characteristic C.


The Literary Digest, which began its famous straw poll with the 1916 presidential campaign, mailed out millions of mock ballots for each of its surveys. …

The results that poured in during the months leading up to the [1936 presidential] election showed a landslide victory for Republican Alf Landon. In its final tabulation, the Digest reported that out of the more than two million ballots it had received, the incumbent, Roosevelt, had polled only about 40 percent of the straw votes. …

Within a week it was apparent that both their results and their methods were erroneous. Roosevelt was re-elected by an even greater margin than in 1932. … The Digest's experience conclusively proved that no matter how massive the sample, it will produce unreliable results if the methodology is flawed.

… The mailing lists the editors used were from directories of automobile owners and telephone subscribers…[which] were clearly weighted in favor of the Republicans in 1936. People prosperous enough to own cars have always tended to be somewhat more Republican than those who do not, and this was particularly true in [the] heart of the Depression.

…The sample was massive, but it was biased toward the affluent, and in 1936 many Americans voted along economic lines.

Source: Michael Wheeler, Lies, Damn Lies, and Statistics: The Manipulation of Public Opinion in America (Liveright, 1976), pp. 67-9.


This is a fallacy affecting statistical inferences, which are arguments of the following form:

N% of sample S has characteristic C.
(Where sample S is a subset of set P, the population.)
Therefore, N% of population P has characteristic C.

For example, suppose that an opaque bag is full of marbles, and you can win a prize by guessing the proportions of colors of the marbles in the bag. Assume, further, that you are allowed to stick your hand into the bag and withdraw one fistful of marbles before making your guess. Suppose that you pull out ten marbles, six of which are black and four of which are white. The set of all marbles in the bag is the population that you are going to guess about, and the ten marbles that you removed are the sample. You want to use the information in your sample to guess as closely as possible the proportions of colors in the bag. You might draw the following conclusions:

  • 60% of the marbles in the bag are black.
  • 40% of the marbles in the bag are white.

Notice that if 100% of the sampled marbles were black, say, then you could infer that all the marbles in the bag are black, and that none of them are white. Thus, the type of inference usually referred to as "induction by enumeration" is a type of statistical inference, even though it doesn't use percentages. Similarly, from the example we could just draw the vague conclusion that most of the marbles are black and few of them are white.
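
The marble inference can be written out as a short calculation. The following is only a minimal sketch: the function name `sample_proportions` and the list `fistful` are made up for illustration, and the code simply projects the sample's proportions onto the bag, which is exactly the step whose strength depends on how representative the fistful is.

```python
from collections import Counter

def sample_proportions(sample):
    """Return the fraction of the sample that has each color."""
    counts = Counter(sample)
    total = len(sample)
    return {color: count / total for color, count in counts.items()}

# The fistful from the example: six black marbles and four white ones.
fistful = ["black"] * 6 + ["white"] * 4

# The statistical inference treats these sample fractions as a guess
# about the whole bag (the population).
estimate = sample_proportions(fistful)
print(estimate)  # {'black': 0.6, 'white': 0.4}
```

If every marble in the fistful were black, the same function would return 1.0 for black and list no other color, which corresponds to induction by enumeration.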

The strength of a statistical inference is determined by the degree to which the sample is representative of the population, that is, how similar in the relevant respects the sample and population are. For example, if we know in advance that all of the marbles in the bag are the same color, then we can conclude that the sample is perfectly representative of the color of the population—though it might not represent other aspects, such as size. When a sample perfectly represents a population, statistical inferences are actually deductive enthymemes. Otherwise, they are inductive inferences.

Moreover, since the strength of a statistical inference depends upon the similarity of the sample and population, such inferences are really a species of argument from analogy, and the strength of the inference varies directly with the strength of the analogy. Thus, a statistical inference will commit the Fallacy of Unrepresentative Sample when the similarity between the sample and population is too weak to support the conclusion. There are two main ways that a sample can fail to sufficiently represent the population:

  1. The sample is simply too small to represent the population, in which case the argument will commit the subfallacy of Hasty Generalization.
  2. The sample is biased in some way as a result of not having been chosen randomly from the population. The Literary Digest poll in the Example is a famous case of such bias in a sample. It also illustrates that even a very large sample can be biased; the important thing is representativeness, not size. Small samples can be representative, and even a sample of one is sufficient in some cases.
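
The contrast between size and representativeness can be illustrated with a small simulation. This is a sketch under invented assumptions, not a reconstruction of the 1936 poll: the population of 10,000 voters, the split between affluent and non-affluent voters, and the support rates in each group are all hypothetical numbers chosen so that the bias is easy to see.

```python
import random

# Hypothetical population of 10,000 voters, stored as
# (is_affluent, supports_incumbent) pairs. All counts are invented.
population = (
    [(True, True)] * 1200 + [(True, False)] * 1800      # affluent: 40% support
    + [(False, True)] * 4900 + [(False, False)] * 2100  # others: 70% support
)

def support(people):
    """Fraction of the given voters who support the incumbent."""
    return sum(1 for _, supports in people if supports) / len(people)

true_support = support(population)

# Large but biased sample: only affluent voters are polled (3,000 people).
biased_sample = [person for person in population if person[0]]
biased_estimate = support(biased_sample)

# Much smaller sample, but drawn at random from the whole population.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 500)
random_estimate = support(random_sample)

print(f"true support:          {true_support:.2f}")
print(f"biased sample (3000):  {biased_estimate:.2f}")
print(f"random sample (500):   {random_estimate:.2f}")
```

Under these assumptions the biased sample of 3,000 misses the true figure by about twenty percentage points, and enlarging it would not help, because every additional respondent comes from the same unrepresentative group. The random sample of 500 is still subject to chance variation, but its error shrinks as the sample grows.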