How to Read a Poll
Every other year, during election campaigns, the American public is polled, surveyed, and canvassed for their opinions, and the news media continuously inform us of the results. The media report polls in the same breathless way that race track announcers describe horse races: "As they round the corner of the convention, the Republican is pulling ahead on the right! Now, they're entering the home stretch and the Democrat is pulling up on the left!" Et cetera.
There is little drama in simply waiting until after the election to report the results. Instead, reporters use polls to add suspense to their coverage, with a leader and an underdog to root for. Moreover, every news outlet is trying to scoop the others by being the first to correctly predict the winner. Unfortunately, much of this coverage sacrifices accuracy for artificial excitement.
This article explains how a layman can read a news report of a poll without being duped by the hype. You don't need to be a statistician to understand enough about polls to not be taken in, because the problems are often not with the polls themselves but with the way that they're reported.
First, please take the following unscientific poll:
Opinion polls, like other surveys, are a way of inferring the characteristics of a large group—called "the population"—from a small sample of it. In order for this inference to be cogent, the sample must accurately represent the population. Thus, the main error to avoid is an unrepresentative sample. For example, the most famous polling fiasco was the Literary Digest poll in the 1936 presidential election. The magazine surveyed over two million people, chosen from the magazine's subscriber list, phone books, and car registrations. Even though the sample was enormous, it was unrepresentative of the population of voters because not everyone could afford a phone or car during the Depression, and those who could tended to vote Republican in greater numbers than those who couldn't. As a result of this biased sample, the poll showed Republican Alf Landon beating the actual winner, Democrat Franklin Roosevelt.
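The Literary Digest story can be sketched in a few lines of code. The numbers below are invented for illustration, not historical figures: a hypothetical electorate in which most voters favor candidate D, but phone and car owners lean toward candidate R.

```python
import random

random.seed(0)

# Hypothetical electorate of 10,000 voters, 55% of whom back candidate D.
# Only 40% own a phone or car, and owners lean toward candidate R.
owners     = ["D"] * 1400 + ["R"] * 2600   # 4,000 owners, 35% D
non_owners = ["D"] * 4100 + ["R"] * 1900   # 6,000 non-owners, ~68% D
population = owners + non_owners

def pct_d(sample):
    """Percentage of a sample supporting candidate D."""
    return 100 * sample.count("D") / len(sample)

# A modest random sample of the whole population is representative...
random_sample = random.sample(population, 1000)

# ...but a much larger sample drawn only from owners is not.
biased_sample = random.sample(owners, 3000)

print(f"True support for D:            {pct_d(population):.1f}%")
print(f"Random sample of 1,000:        {pct_d(random_sample):.1f}%")
print(f"Biased sample of 3,000 owners: {pct_d(biased_sample):.1f}%")
```

Note that tripling the size of the biased sample does nothing to fix it: it just measures the wrong population more precisely, which is exactly what happened to the Literary Digest.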
So, the first question that you should ask of a poll report you read is: "Was the sample chosen scientifically?" If the poll is a scientific one, then an effort has been made to either choose the sample randomly from the population, or to weight it in order to make it representative of the population. Reputable polling organizations always use scientific sampling.
However, many polls are unscientific, such as most online polls, telephone surveys in which you must call a certain number, or mail-in questionnaires in magazines or sent to you by charities. Such surveys suffer from the fault that the sample is self-selected, that is, you decide whether you wish to participate. Self-selected samples are not likely to be representative of the population, because the people who choose to respond usually differ in relevant ways from those who don't, often caring much more strongly about the issue in question.
For example, some media outlets sponsor scientific polls but, when the results are reported in their online edition, they are sometimes accompanied by an online poll using a self-selected sample and asking some of the same questions. It is instructive to compare the two, as the results are usually very different.
So, self-selected samples are almost inevitably biased and are, at best, a form of entertainment. They cannot be trusted as a source of information about the population as a whole.
Because polls question only a sample of the population, there is always a chance of sampling error, that is, of drawing a sample that is unrepresentative. For instance, in a political poll, it is possible that a random sample of voters would consist entirely of Democrats, though this is highly unlikely. However, less extreme errors of the same kind are not so unlikely, and this means that every poll has some degree of imprecision or fuzziness. Because the sample may not be precisely representative of the population as a whole, there is some chance that the poll results will be off by a certain amount. Statisticians quantify this imprecision with the "margin of error", or "MoE" for short.
The MoE takes the form "±N%", where usually N=3 in national polls. This margin determines what is called a "confidence interval": for example, if the percentage of a sample who supports candidate R is 46%, and the MoE is ±3%, then the confidence interval is 43-49%. In turn, the confidence interval and the MoE are determined by the "level of confidence", which is usually set at 95% in national polls. What this means is that in 19 out of 20 such samples, the confidence interval will contain the true percentage of the population who support candidate R. So, the chance of the poll being off by more than the MoE is only 5%.
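For a simple random sample, the MoE can be approximated with the standard formula z·√(p(1−p)/n). Real polls use more elaborate sampling designs and weighting, so treat the following as a back-of-the-envelope sketch, not how any particular pollster computes the figure:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points,
    for a simple random sample of size n.

    p is the sample proportion; p = 0.5 is the worst case, which is
    the figure usually reported. z = 1.96 is the 95% z-score.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)

# A typical national poll of about 1,000 respondents:
moe = margin_of_error(1000)
print(f"MoE: +/-{moe:.1f} points")   # roughly +/-3

# Confidence interval for a candidate polling at 46%:
support = 46
print(f"95% CI: {support - moe:.1f}% to {support + moe:.1f}%")
```

This also shows why national polls cluster around N=3: a sample of roughly a thousand people yields a margin of about three points, and shrinking the margin further requires a much larger (and costlier) sample.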
The MoE is a common source of error in news reports of poll results. Most reputable news sources require their reporters to include the MoE in a report on a poll, at least in a note at the end. However, many reporters ignore the MoE in the body of their articles, perhaps because they don't understand what the number means.
Reporters often use polls for "horse race" reporting by comparing the poll numbers of candidates, or to compare current polls to past ones to see if the results are changing. The MoE needs to be factored into such comparisons. For example, suppose that in one poll with a MoE of ±3%, candidate D polls at 36%, and in a later poll D is at 38%. Many newspapers will report this as a 2% rise in support for D between the two polls, as if 2% of undecided voters or previous supporters of other candidates had decided to vote for D since the previous poll. However, given that the MoE is ±3%, the result in the first poll could be as high as 39%, and in the second one as low as 35%. In other words, D's support could have dropped by as much as 4%! The poll results are simply not precise enough to say that there is a real increase in D's support, let alone that such an increase is exactly 2%.
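The overlap argument above can be checked mechanically. This is an illustrative sketch using the example's figures, not a formal significance test:

```python
MOE = 3  # +/-3 points in both polls

def interval(pct, moe=MOE):
    """Confidence interval implied by a poll number and its MoE."""
    return (pct - moe, pct + moe)

first, second = 36, 38           # D's numbers in the two polls
lo1, hi1 = interval(first)
lo2, hi2 = interval(second)

print(f"First poll:  {lo1}% to {hi1}%")   # 33% to 39%
print(f"Second poll: {lo2}% to {hi2}%")   # 35% to 41%

# The intervals overlap, so the 2-point "rise" may be nothing but noise:
overlap = lo2 <= hi1
print("Real change?", "can't tell" if overlap else "yes")
```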
Also, when comparing two poll numbers, remember that the MoE applies to both numbers. For instance, suppose that in a single poll with a MoE of ±3%, Candidate D gets 46% and Candidate R only 42%. This will often be reported as a lead for D. However, D's support could be as low as 43% and R's as high as 45%, giving R a 2 percentage point lead. So, in order for the absolute difference between two poll numbers to be statistically significant, it must be greater than twice the MoE, which equals the width of the confidence interval.
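The article's rule of thumb, that a gap between two numbers from the same poll must exceed twice the MoE, can be written as a one-line check. The figures are the ones from the example above:

```python
MOE = 3  # the poll's margin of error, in points

def lead_is_significant(pct_a, pct_b, moe=MOE):
    """True if the gap between two numbers from the SAME poll exceeds
    the noise: each number carries its own +/-moe, so by this rule of
    thumb the gap must be larger than 2 * moe before it means anything."""
    return abs(pct_a - pct_b) > 2 * moe

print(lead_is_significant(46, 42))   # False: a 4-point "lead" within +/-3
print(lead_is_significant(51, 43))   # True: an 8-point gap exceeds 6
```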
In the previous section, I mentioned the level of confidence—usually 95%—used to determine the MoE and, therefore, the confidence interval. The purpose of a survey is to measure some characteristic—such as support for a candidate—of a sample in order to be able to infer its level in the whole population. A 95% confidence level means that in 19 out of 20 samples, the percentage of the sample with the characteristic should be within the confidence interval of the percentage of the population with the characteristic.
95% confidence sounds pretty confident—and it is!—however, there are a lot of polls done these days. In fact, there are many more than 20 national polls conducted in the U.S. during a presidential election year. This means that even with a confidence level of 95%, we can expect a few polls to be off by more than the MoE as a result of sampling error.
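The arithmetic here is simple: a 95% confidence level means a 1-in-20 miss rate per poll. The count of 200 polls below is purely illustrative, not a tally of any actual election year:

```python
# Each poll has a 1-in-20 chance of missing by more than its MoE
# purely from sampling error. With, say, 200 national polls in an
# election year (an illustrative figure), the expected number of
# such misses is:
polls = 200
expected_misses = polls * 0.05
print(expected_misses)   # 10.0
```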
How can we tell when the results of a poll are off by more than the MoE? If a poll gives very different results from others taken around the same time, or shows a sudden and large change from previous polls, this suggests that the unusual result may be due to sampling error. No one can know for sure whether sampling error is responsible for polls with surprising results, but the fact that 1 in 20 polls can be expected to be significantly in error should encourage us to regard such poll results with skepticism. Moreover, it's important to pay attention to all of the polls taken on a given topic at a particular time; otherwise, you'll have no way of knowing whether a poll you're looking at is giving wildly different results than comparable polls.
Here's another reason to pay attention to all the comparable polls, as opposed to concentrating on just one. Suppose that five polls are conducted at about the same time showing the following results with a MoE of ±3%:
Each of these results is within the MoE so, taken individually, you would have to conclude that neither candidate is really ahead. However, four of the five polls show candidate D with a lead, and the other shows a tie; no poll shows candidate R leading. Of course, it's highly improbable that both candidates have exactly the same level of support, but if they are within a percentage point of each other you would expect the polls showing one candidate ahead to be about evenly divided between the two. Instead, in this example, all of the polls showing one candidate ahead favor candidate D, which is unlikely unless D has a real, albeit small, lead.
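Under the simplifying assumption that, in a truly tied race, each poll showing a leader is independently equally likely to favor either candidate, the chance of the pattern above is easy to compute:

```python
# Simplifying assumption: in a truly tied race, each poll that shows a
# leader is equally likely to show either candidate ahead, independently.
non_tied_polls = 4                     # the four polls that showed a leader
p_all_favor_d = 0.5 ** non_tied_polls
print(p_all_favor_d)                   # 0.0625, i.e. 1 chance in 16
```

A 1-in-16 coincidence is not impossible, but it is unlikely enough to suggest that D's small lead is real, which is the point of reading the polls together rather than one at a time.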
Thus, even when individual polls do not show a clear leader, the consensus of all polls may do so. Unfortunately, news stories on polls usually concentrate on one poll at the expense of all others. Many polls are sponsored by newspapers or networks, which get their money's worth by reporting only the results of their own polls, ignoring those sponsored by their competitors. Therefore, it's up to you to check to see whether there are other polls on the same topic, and to compare the results of any comparable polls.
A Checklist of Questions
When you are confronted with a new poll, ask the following questions about it:
1. Was the sample chosen scientifically, or was it self-selected?
2. What is the margin of error?
3. Is any difference or change in the numbers greater than the margin of error (twice the margin of error, when comparing two numbers from the same poll)?
4. How do the results compare with those of other polls taken on the same topic at about the same time?
If the poll you are confronted with fails at any step of this checklist, or if you can't find the answer to these questions in the report, then your confidence in the poll should be much less than 95%.
The Poll Results
If you haven't guessed by now, the online poll was bogus, but not much more bogus than most such polls. If you go back and retake the poll having read the entire article, I hope that you will agree to disagree with all of the questions!