How to Read a Poll

Dewey Defeats Truman

Every other year, during election campaigns, the American public is polled, surveyed, and canvassed for their opinions, and the news media continuously inform us of the results. The media report polls in the same breathless way that race track announcers describe horse races: "As they round the corner of the convention, the Republican is pulling ahead on the right! Now, they're entering the home stretch and the Democrat is pulling up on the left!" Et cetera.

There is little drama in simply waiting until after the election to report the results. Instead, reporters use polls to add suspense to their coverage, with a leader and an underdog to root for. Moreover, every news outlet is trying to scoop the others by being the first to correctly predict the winner. Unfortunately, much of this coverage sacrifices accuracy for artificial excitement.

This article explains how a layman can read a news report of a poll without being duped by the hype. You don't need to be a statistician to understand enough about polls to not be taken in, because the problems are often not with the polls themselves but with the way that they're reported.

First, please take the following unscientific poll:

Fallacy Files Online Poll

  • A poll, such as this one, which people voluntarily take online, is just as good at measuring public opinion as a poll in which the participants are chosen at random.
    Agree Disagree
  • If a scientific poll shows a candidate with 37% support, say, that means that if the election were held today 37% of the voters would vote for that candidate.
    Agree Disagree
  • In a scientific poll conducted between two candidates for office with a margin of error of ±3%, Candidate R has 52% support and Candidate D has 48% support, so we can be sure that R is supported by more voters than D.
    Agree Disagree
  • In a scientific poll conducted a month ago with a margin of error of ±3%, Proposition P received support from 52% of respondents. In a similar poll with the same margin of error conducted today, P received only 51% support. So, popular support for P is falling.
    Agree Disagree
  • This poll is a worthwhile measure of people's understanding of polls.
    Agree Disagree

Looking for the results of the poll you just took? Read on!

Scientific Versus Self-Selected

Opinion polls, like other surveys, are a way of inferring the characteristics of a large group—called "the population"—from a small sample of it. In order for this inference to be cogent, the sample must accurately represent the population. Thus, the main error to avoid is an unrepresentative sample. For example, the most famous polling fiasco was the Literary Digest poll in the 1936 presidential election. The magazine surveyed over two million people, chosen from the magazine's subscriber list, phone books, and car registrations. Even though the sample was enormous, it was unrepresentative of the population of voters because not everyone could afford a phone or car during the Depression, and those who could tended to vote Republican in greater numbers than those who couldn't. As a result of this biased sample, the poll showed Republican Alf Landon beating the actual winner, Democrat Franklin Roosevelt.

So, the first question that you should ask of a poll report you read is: "Was the sample chosen scientifically?" If the poll is a scientific one, then an effort has been made to either choose the sample randomly from the population, or to weight it in order to make it representative of the population. Reputable polling organizations always use scientific sampling.

However, many polls are unscientific, such as most online polls you take using a computer, telephone surveys in which you must call a certain number, or mail-in questionnaires in magazines or sent to you by charities. Such surveys suffer from the fault that the sample is self-selected, that is, you decide whether you wish to participate. Self-selected samples are not likely to be representative of the population, for various reasons.

For example, some media outlets sponsor scientific polls but, when the results are reported in their online edition, they are sometimes accompanied by an online poll using a self-selected sample and asking some of the same questions. It is instructive to compare the two, as the results are usually very different.

So, self-selected samples are almost inevitably biased and are, at best, a form of entertainment. They cannot be trusted as a source of information about the population as a whole.
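To see why a huge self-selected sample can lose to a small random one, here is a toy simulation. Everything in it is an assumption made up for illustration: the 55% true support, the response rates, and the function name `simulate_poll`. It is a sketch of the idea, not a model of any real poll.

```python
import random

def simulate_poll(true_support, n, respond_if_supporter=1.0,
                  respond_if_opponent=1.0, seed=0):
    """Return the share of n respondents who support the candidate.

    Each simulated person is a supporter with probability true_support and
    then responds with a probability that may depend on their opinion.
    Unequal response rates are what make a sample self-selected.
    """
    rng = random.Random(seed)
    supporters = 0
    respondents = 0
    while respondents < n:
        is_supporter = rng.random() < true_support
        respond_prob = respond_if_supporter if is_supporter else respond_if_opponent
        if rng.random() < respond_prob:
            respondents += 1
            supporters += is_supporter
    return supporters / respondents

TRUE_SUPPORT = 0.55  # hypothetical share of the population backing the candidate

# Small poll in which everyone who is asked answers.
random_small = simulate_poll(TRUE_SUPPORT, n=1_000)

# Very large poll in which opponents are twice as likely to respond (hypothetical rates).
self_selected_large = simulate_poll(
    TRUE_SUPPORT, n=100_000, respond_if_supporter=0.2, respond_if_opponent=0.4
)

print(f"true support:        {TRUE_SUPPORT:.1%}")
print(f"random, n=1,000:     {random_small:.1%}")
print(f"self-selected, 100k: {self_selected_large:.1%}")
```

Under these assumed response rates the self-selected poll settles near 38% no matter how many people answer, because the bias is built into who responds. Adding respondents only makes the wrong answer more precise.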

Margin of Error Errors

Because polls question only a sample of the population, there is always a chance of sampling error, that is, of drawing a sample that is unrepresentative. For instance, in a political poll, it is possible that a random sample of voters would consist entirely of Democrats, though this is highly unlikely. However, less extreme errors of the same kind are not so unlikely, and this means that every poll has some degree of imprecision or fuzziness. Because the sample may not be precisely representative of the population as a whole, there is some chance that the poll results will be off by a certain amount. Statisticians express the likely size of this kind of error as the "margin of error", or "MoE" for short.

The MoE takes the form "±N%", where usually N=3 in national polls. This margin determines what is called a "confidence interval": for example, if the percentage of a sample who supports candidate R is 51%, and the MoE is ±3%, then the confidence interval is 48-54%. In turn, the confidence interval and the MoE are determined by the "level of confidence", which is usually set at 95% in national polls. What this means is that one can have confidence that in 19 out of 20 such samples the percentage of the population who support candidate R will fall within the confidence interval. So, the chance of the poll being off by more than the MoE is only 5%.
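For readers who want to see where a figure like ±3% comes from, here is a minimal sketch using the standard textbook approximation for a simple random sample. The formula and the sample size of 1,067 are not from this article; they are assumptions brought in for illustration, and real pollsters often adjust for weighting and survey design.

```python
import math

Z_95 = 1.96  # standard normal critical value for a 95% level of confidence

def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Approximate MoE for a proportion from a simple random sample.

    proportion=0.5 is the worst case, which is what pollsters usually quote.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

def confidence_interval(observed, moe):
    """Return the (low, high) interval around an observed share."""
    return observed - moe, observed + moe

moe = margin_of_error(1067)                    # roughly 0.03, i.e. ±3%
low, high = confidence_interval(0.51, 0.03)    # the 48-54% example above

print(f"MoE for n=1067: ±{moe:.1%}")
print(f"interval: {low:.0%} to {high:.0%}")
```

Note that the MoE shrinks only with the square root of the sample size, so halving it takes about four times as many respondents.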

The MoE is a common source of error in news reports of poll results. Most reputable news sources require their reporters to include the MoE in a report on a poll, at least in a note at the end. However, many reporters ignore the MoE in the body of their articles, perhaps because they don't understand what the number means.

Reporters often use polls for "horse race" reporting by comparing the poll numbers of candidates, or to compare current polls to past ones to see if the results are changing. The MoE needs to be factored into such comparisons. There are two kinds of errors about MoEs frequently committed in news reports of poll results:

  1. Suppose that in one poll with a MoE of ±3%, candidate C polls at 36%, and in a later poll with the same MoE C is at 38%. Many newspapers will report this as a two-point rise in support for C between the two polls, as if 2% of voters who were undecided or who previously supported other candidates had decided to vote for C since the previous poll. However, given that the MoE is ±3% in both polls, the true level of support at the time of the first poll could be as high as 39%, and at the time of the second as low as 35%. In other words, C's support could have dropped by as much as four percentage points! The poll results are simply not precise enough to say with 95% confidence that there is a real increase in C's support, let alone that such an increase is exactly two points.
  2. Another type of mistake occurs in reporting on polls, such as those conducted during presidential elections, in which the results for two major candidates are compared. For instance, suppose that in a national poll with a MoE of ±3% candidate R's support is 52% while D's is 48%. The difference in their support is 52 - 48, that is, four percentage points, which is greater than the MoE. Most news outlets would report that candidate R is ahead by four points. However, since the MoE applies to each candidate's number separately, R's support could be as low as 49% and D's as high as 51%, which would mean that D actually leads by two points.
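The arithmetic in both cases can be sketched in a few lines. The helper names below are hypothetical, and the overlap test is only the rule of thumb used in this article. A statistician would compute a separate margin of error for the difference itself, so treat this as a quick sanity check rather than a formal test.

```python
def interval(percent, moe):
    """Return the (low, high) confidence interval around a poll number."""
    return percent - moe, percent + moe

def intervals_overlap(a, b, moe):
    """True if the two poll numbers cannot be told apart by this rule of thumb."""
    a_low, a_high = interval(a, moe)
    b_low, b_high = interval(b, moe)
    return a_low <= b_high and b_low <= a_high

def possible_change(earlier, later, moe):
    """Smallest and largest change consistent with both intervals."""
    return (later - moe) - (earlier + moe), (later + moe) - (earlier - moe)

# Case 1: candidate C goes from 36% to 38% with a MoE of ±3 in both polls.
print(possible_change(36, 38, 3))      # anywhere from a 4-point drop to an 8-point rise
print(intervals_overlap(36, 38, 3))    # the two results overlap

# Case 2: R at 52%, D at 48%, MoE of ±3.
print(interval(52, 3), interval(48, 3))
print(intervals_overlap(52, 48, 3))    # R's low end (49) is below D's high end (51)
```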

The Confidence Game

In the previous section, I mentioned the level of confidence, usually 95%, used to determine the MoE and, therefore, the confidence interval. The purpose of a survey is to measure some characteristic, such as support for a candidate, of a sample in order to be able to infer its level in the whole population. A 95% confidence level means that in 19 out of 20 samples, the percentage of the sample with the characteristic should be within the MoE of the percentage of the population with the characteristic.
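The "19 out of 20" claim can be checked with a small simulation. This is a sketch under stated assumptions: the true support of 52%, the sample size of 1,067 (roughly what a ±3% MoE corresponds to for a simple random sample), and the function names are all hypothetical choices for illustration.

```python
import random

def sample_share(true_support, sample_size, rng):
    """Poll sample_size random voters and return the share who are supporters."""
    hits = sum(rng.random() < true_support for _ in range(sample_size))
    return hits / sample_size

def coverage(true_support, sample_size, moe, num_polls, seed=0):
    """Fraction of simulated polls whose result lands within moe of the truth."""
    rng = random.Random(seed)
    within = sum(
        abs(sample_share(true_support, sample_size, rng) - true_support) <= moe
        for _ in range(num_polls)
    )
    return within / num_polls

share_within_moe = coverage(
    true_support=0.52, sample_size=1067, moe=0.03, num_polls=2000
)
print(f"{share_within_moe:.1%} of simulated polls landed within the MoE")
```

The printed figure should come out close to 95%, which also means that roughly one simulated poll in twenty misses by more than the MoE even though nothing is wrong with how it was conducted.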

95% confidence sounds pretty confident, and it is! However, there are a lot of polls done these days. In fact, there are many more than 20 national polls conducted in the U.S. during a presidential election year. This means that even with a confidence level of 95%, we can expect a few polls to be off by more than the MoE as a result of sampling error.
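A back-of-the-envelope calculation shows how quickly those misses add up. The sketch below assumes that the polls are independent and that sampling error is the only source of error, which is a simplification. The poll counts of 20 and 100 are illustrative, not figures from any particular election.

```python
def expected_misses(num_polls, confidence=0.95):
    """Expected number of polls that are off by more than the MoE."""
    return num_polls * (1 - confidence)

def prob_at_least_one_miss(num_polls, confidence=0.95):
    """Chance that at least one of num_polls independent polls misses."""
    return 1 - confidence ** num_polls

for n in (20, 100):
    print(
        f"{n} polls: expect about {expected_misses(n):.0f} miss(es), "
        f"chance of at least one = {prob_at_least_one_miss(n):.0%}"
    )
```

With 20 polls the chance of at least one miss is already about 64%, and with 100 polls it is over 99%.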

How can we tell when the results of a poll are off by more than the MoE? If a poll gives very different results from others taken around the same time, or shows a sudden and large change from previous polls, this suggests that the unusual result may be due to sampling error. No one can know for sure whether sampling error is responsible for polls with surprising results, but the fact that 1 in 20 polls can be expected to be significantly in error should encourage us to regard such poll results with skepticism. Moreover, it's important to pay attention to all of the polls taken on a given topic at a particular time; otherwise, you'll have no way of knowing whether a poll you're looking at is giving wildly different results from comparable polls.

Polling the Polls

Here's another reason to pay attention to all the comparable polls, as opposed to concentrating on just one. Suppose that five polls are conducted at about the same time showing the following results with a MoE of ±3%:

Poll   Candidate D   Candidate R   Undecided
1      43%           42%           15%
2      42%           41%           17%
3      44%           42%           14%
4      44%           44%           12%
5      46%           43%           15%

Each of these results is within the MoE so, taken individually, you would have to conclude that neither candidate is really ahead. However, four of the five polls show candidate D with a lead, and the other shows a tie; no poll shows candidate R leading. Of course, it's highly improbable that both candidates have exactly the same level of support, but if they are within a percentage point of each other you would expect the polls showing one candidate ahead to be about evenly divided between the two. Instead, in this example, all of the polls showing one candidate ahead favor candidate D, which is unlikely unless D has a real, albeit small, lead.
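One rough way to put a number on "unlikely" is a simple sign test: if the race were really even, each poll that shows a leader would be equally likely to favor either candidate, like a coin flip. The sketch below applies that idea to the five polls above. It assumes the polls are independent, and the variable and function names are hypothetical.

```python
from math import comb

# (Candidate D %, Candidate R %) for the five polls in the example above.
polls = [(43, 42), (42, 41), (44, 42), (44, 44), (46, 43)]

d_leads = sum(d > r for d, r in polls)
r_leads = sum(r > d for d, r in polls)
decisive = d_leads + r_leads  # polls showing a tie are set aside

def prob_at_least(k, n, p=0.5):
    """Chance of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance that D leads in this many decisive polls if the race were truly even.
p_value = prob_at_least(d_leads, decisive)

# Average lead across all five polls, in percentage points.
avg_lead = sum(d - r for d, r in polls) / len(polls)

print(f"D leads in {d_leads} of {decisive} decisive polls")
print(f"chance of that in an even race: {p_value:.2%}")
print(f"average D lead: {avg_lead:.1f} points")
```

With only four decisive polls, a chance of about 6% is suggestive rather than conclusive, which fits the cautious wording above: the pattern points to a real but small lead for D.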

Thus, even when individual polls do not show a clear leader, the consensus of all polls may do so. Unfortunately, news stories on polls usually concentrate on one poll at the expense of all others. Many polls are sponsored by newspapers or networks, which get their money's worth by reporting only the results of their own polls, ignoring those sponsored by their competitors. Therefore, it's up to you to check to see whether there are other polls on the same topic, and to compare the results of any comparable polls.

A Checklist of Questions

When you are confronted with a new poll, ask the following questions about it:

If the poll you are confronted with fails at any step of this checklist, or if you can't find the answer to these questions in the report, then your confidence in the poll should be much less than 95%.

The Poll Results

If you haven't guessed by now, the online poll was bogus, but not much more bogus than most such polls. If you go back and retake the poll having read the entire article, I hope that you will agree to disagree with all of the questions!