Exit Poll Update

There are a lot of interesting pieces circulating right now about exit polls. In the extended entry, I discuss three of them.
First, Steve Soto notes a recent correction made to the Latino vote in the Texas exit poll. Here's an excerpt: Now it turns out that upon further study by NEP and Mr. Mitofsky, not only didn't Bush get 59% of the Hispanic vote in Texas, he didn't even beat Kerry amongst Hispanics in Texas:

The revised BC-TX-Exit-Poll Excerpts showed that 20 percent, not 23 percent, of all Texas voters were Hispanic. They voted 50 percent for Kerry and 49 percent for Bush, not 41-59 Kerry-Bush.

That's a pretty nifty 19-point swing, huh?

So at the risk of being called arrogant once again by the likes of Mr. Morin and pollsters who know (supposedly) what they are doing, how is it possible to miss on this by that large of a margin with your properly weighted exit poll, and not now have to adjust downward the claimed Hispanic vote that Mr. Mitofsky said Bush got on Election Night?

Mystery Pollster, of course, has a new, extensive piece on exit polls. This time Blumenthal traces the evolution of exit polls, explaining why early exits are different from later ones, what we can and cannot extrapolate from them, and why it will be difficult for us to acquire the information we seek. The piece is huge, but I will try to excerpt a meaningful portion: First a review of the process: On November 2, 2004, the NEP conducted separate exit polls in all 50 states and the District of Columbia plus a separate, stand-alone "national" sampling. NEP instructed its interviewers to call in to report their results three times on Election Day (all times are local): At about noon, at about 3:00 p.m. and roughly an hour before the polls closed. NEP started releasing tabulations of the vote preference question for the national survey and for most of the battleground states on an hourly basis beginning at 1:00 p.m. Eastern Time.(...)

Those who are concerned about the discrepancy between the exit polls and the actual vote count should focus only on #3, the tabulations prepared just before poll closing for each state. Unfortunately, these were not officially released. The early releases (#1 & #2) were widely leaked and posted on the Internet, but had bigger discrepancies resulting from incomplete samples or the use of completely unweighted data. The later releases now available through official channels (#5) are not helpful for analysis of the discrepancy since they were "corrected" to conform to actual vote results. The only source of "just-before-poll-closing" results appears to be the data in the paper by Steven Freeman (more on this in the next post).

I haven't heard from Freeman in a week, which is disappointing. I'll keep trying.

Third, discussing sampling error in exit polls, Nick Panagakis sent me the following email:

Exit Polls Vs. Election Outcomes

In the general debate comparing exit polls and election outcomes, there are two fundamental weaknesses in analyses of differences between exit polls and election outcomes. The weaknesses are: 1) calculating error between poll and election outcomes and, 2) the effect of sample design in calculating sample error for cluster samples used for exit polls.

Here I use Steven Freeman's paper "The Unexplained Exit Poll Discrepancy" only as an example. This discussion applies to any of the exit poll vs. election outcome analyses which seem to come up after any election. Note that exit poll survey data are used here, not survey data weighted by actual election returns which would be redundant.

OUTCOME VS. EXIT POLL ERROR

Freeman: "The conventional wisdom going into this election was that three critical states would likely determine who would win the Presidential election - - Ohio, Pennsylvania, and Florida. In each of these states, however, exit polls differed significantly from recorded tallies." Freeman in Table 1 uses "Tallied vs. predicted" as his source data. In Ohio, Pennsylvania, and Florida, the differences between Bush's final tallies [outcomes] and his earlier exit poll percentages were, respectively, 6.7%, 6.5%, and 4.9%.

In statistical analysis, differences between poll and election margins should not be used. It is the poll estimate that is subject to sample error, not the margin; e.g., 48% voting for A and 52% for B. Error on the margin effectively overstates estimate error by a factor of two. This also complies with National Council on Public Polls post-election poll analyses.

Elections are zero-sum games. Two points high for one candidate means two points low for the other. Vote estimate errors for each candidate are not additive, which is the effect of using margins in an analysis.

The differences between exit poll estimates and final election outcomes in these key states subject to tests of significance are as follows:

Ohio Bush: Exit poll 47.9%; Outcome 51.0%. Difference +3.1
Ohio Kerry: Exit poll 52.1%; Outcome 48.5%. Difference -3.6

Pennsylvania Bush: Exit poll 45.4%; Outcome 48.6%. Difference +3.2
Pennsylvania Kerry: Exit poll 54.1%; Outcome 50.8%. Difference -3.3

Florida Bush: Exit poll 49.8%; Outcome 52.1%. Difference +2.3
Florida Kerry: Exit poll 49.7%; Outcome 47.1%. Difference -2.6

Differences between poll estimates and election outcomes range from -2.6 to +3.6, not 4.9% to 6.7%.
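Panagakis's point about margins versus estimates can be checked against the Ohio figures above. A quick sketch in Python (the variable names are mine):

```python
# Per-candidate estimate error vs. error on the margin,
# using the Ohio exit poll and outcome figures quoted above.
bush_poll, bush_actual = 47.9, 51.0
kerry_poll, kerry_actual = 52.1, 48.5

estimate_error = bush_actual - bush_poll
margin_error = (bush_actual - kerry_actual) - (bush_poll - kerry_poll)

print(round(estimate_error, 1))  # 3.1
print(round(margin_error, 1))    # 6.7
```

The 6.7-point margin "error" is Freeman's Ohio figure; it is simply the two per-candidate errors (+3.1 and -3.6) stacked on top of each other.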

EXIT POLL STATISTICAL ERROR

The conclusion that "exit polls differed significantly from recorded tallies" in the three states is incorrect.

However, Freeman's page 6 footnote is correct: "This analysis assumes a simple random sample. If on the other hand, states were broken into clusters (e.g., precincts) and then the clusters (precincts) were randomly selected (sampling individuals within those selected precincts), the variances would increase."

By necessity, exit poll samples are cluster samples. Precincts in a state typically number in the thousands. Wisconsin, for example, has 3,700 precincts; Illinois, a larger state, has 10,000.

Standard error assuming a simple random sample is calculated, but only as a first step. A confidence level of at least 99% is assumed - higher than the customary 95% - probably because of the higher standard of precision for exit polls and the number of races involved, about 100 across the states including the race for president and races for senate and governor on November 2.

A measure called the Design Effect must then be calculated to adjust the standard error for the cluster sampling effect. The magnitude of the Design Effect depends on the average number of interviews per precinct in each state sample: the smaller the average number of interviews per precinct, the smaller the Design Effect. The Design Effect also differs by characteristic and can be much larger for characteristics highly clustered by precinct, such as race. The Design Effect is a variance measure, so its square root is used to multiply the standard errors.

Without knowing the number of precincts sampled, you can't calculate the Design Effect. But Design Effect square roots are said to have typically ranged from 1.5 to 1.8 in the November exit poll. I used 1.6 as a "best estimate".

Conclusion. All of the state estimates above are well within their error calculations below.

Ohio, n = 2020: sqrt(.5 × .5) / sqrt(2020) × 2.6 × 1.6 = ±4.6%.

Pennsylvania, n = 2107: sqrt(.5 × .5) / sqrt(2107) × 2.6 × 1.6 = ±4.5%.

Florida, n = 2862: sqrt(.5 × .5) / sqrt(2862) × 2.6 × 1.6 = ±3.8%.
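The three calculations above can be reproduced with a short Python sketch, using z = 2.6 and the 1.6 design-effect factor as given (with these exact inputs Florida rounds to 3.9% rather than 3.8%, presumably a difference in intermediate rounding):

```python
import math

# MOE = sqrt(p(1-p)/n) * z * sqrt(design effect), per the email above.
def exit_poll_moe(n, z=2.6, deff_sqrt=1.6, p=0.5):
    return math.sqrt(p * (1 - p) / n) * z * deff_sqrt

for state, n in [("Ohio", 2020), ("Pennsylvania", 2107), ("Florida", 2862)]:
    print(f"{state}: +/- {exit_poll_moe(n) * 100:.1f}%")
```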

Nick Panagakis

Tags: General 2008

Comments

15 Comments

That still doesn't explain pro-Bush pattern
Chris,

Being within the margin of error is one part of statistics, but the consistent pro-Bush pattern in the difference between the poll results vs. exit polls is not explained by these calculations.

There is something else going on in there, and I think it would be misleading to dismiss it entirely by using the margin of error logic.

by SwingVoter 2004-11-30 01:59PM | 0 recs
Re: That still doesn't explain pro-Bush pattern
Actually, the margin of error logic must be taken into account.  First, let's recognize that the differences are within the margin of error as demonstrated.  Then consider that while exit polls appear to have been biased towards Bush in these three states, the end-of-day polls may have been biased towards Kerry in other states, again within the margin of error.  That does not mean something was amiss in FL, PA and OH, only that polls of a sample have error which might be above or below the "true" population mean.

I, too, will want to see this report of Freeman's because that might well answer some of these questions for us.  For now, however, it appears to me to have been a combination of statistical error, overzealous early reporting (and believing) of the exit polls during election day, and some wishful thinking from many of us about how the results would turn out.

A necessary disclaimer given how others have reacted in similar discussions lately: That's not to say there were no shenanigans anywhere, or that voters weren't turned away from some polling places, or that some votes weren't counted correctly.  It's only to say that pinning our hopes on the exit polls is probably the wrong way to build a case.

by boffo 2004-11-30 05:42PM | 0 recs
Re: That still doesn't explain pro-Bush pattern
I don't recall exit polls being pro-Bush. What I meant was that the exit polls consistently over-estimated in favor of Kerry, compared to the actual results that were published.

This consistent pattern of the exit polls favoring Kerry vs. the results that turned up later in the night is what needs to be analyzed.

Were the Bush voters embarrassed to say they voted for him, so they told exit pollsters they voted for Kerry? Or were there other inherent problems in the exit polls? Or were the exit polls more accurate after all, and the real results reflect issues with the election?

by SwingVoter 2004-12-01 02:48AM | 0 recs
Re: That still doesn't explain pro-Bush pattern
Right, but the point is that when we are talking about differences that lie well within the margin of error, and if other states saw overestimates in the exit polls in favor of Bush rather than Kerry, then this is all a non-issue.  What you are talking about is systematic error, but the kinds of differences here are indistinguishable from random statistical variation.  In other words, you may see patterns in the clouds, but to me they're just clouds.  If it's random, then there's nothing to investigate.  So far the evidence makes that scenario quite probable.

As the Mystery Pollster reports, Freeman's selective reporting of the end-of-day exit poll data at least raises the question that he cherry-picked the most pro-Kerry exit polls to make his case, ignoring others which contradict it.

(Heck, even if all the overestimates in the exit polls were towards Kerry but well within the margin of error, we still could not rule out random variation, but at that point I'd be more willing to entertain the possibility that there was some small systematic effect going on.  That's why we need the complete data, and not just the states Freeman reported.)

Finally, it should be remembered that when we talk about the margin of error, that it is at 95% confidence (commonly for media polls) or 99% confidence (in the case of exit polls, as reported in the original post).  So, for each jurisdiction there is a 1 in 100 chance that the "true" vote lies outside the margin of error for the exit poll.  That does not mean anything systematic was going on there, either.  There are 51 jurisdictions, each with a 1 in 100 chance that a random exit poll will be quite wrong.  Not that unlikely, in other words, that it could happen once or twice, though so far we don't have evidence that it was; in Freeman's list, only New Hampshire was close.  Bottom line: Things have to be quite wrong in quite a few jurisdictions for us to use exit polls in this manner.  Clearly that was not the case here with the evidence available to us (as opposed to, say, Ukraine).

by boffo 2004-12-01 03:17AM | 0 recs
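boffo's 51-jurisdictions point is easy to quantify: assuming each state's exit poll independently has a 1-in-100 chance of missing its 99% confidence interval, the chance that at least one misses somewhere is about 40%. A quick Python check:

```python
# Probability that at least one of 51 independent exit polls
# falls outside its own 99% confidence interval by chance alone.
p_all_inside = 0.99 ** 51
print(f"{1 - p_all_inside:.2f}")  # 0.40
```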
Re: That still doesn't explain pro-Bush pattern
That's a good point. All Panagakis has proven is that the probability of the exit poll discrepancies in any one of these three states being due to sampling error alone is over 1%. He didn't calculate the exact p-values, but it's telling he chose a 99% confidence interval vs. the "customary" 95%, implying that at least one of them is below 5%. (Networks also use 99% confidence intervals before projecting winners, but that's because they have 51 elections to project, and they don't want to make 2 or 3 wrong projections in every election. That's why they didn't call any of these states for Kerry before official results were available.)

Given 51 elections, the odds of getting a couple of 1-in-20 results and even a 1-in-100 result aren't bad. But the odds of finding those results among the smaller set of swing states are much less. And the odds of all three results being the same direction are only 1/4 of that. And it gets even worse when you add more states to the analysis.

None of this answers the real question: was it pro-Kerry bias in the exit polls, or pro-Bush bias (voter suppression, spoilage, fraud, etc.) in the official results? It could even be a combination of the two, but I think we can safely reject the sampling error hypothesis as an explanation for the consistently pro-Bush discrepancies.

by Mathwiz 2004-12-01 11:50AM | 0 recs
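Mathwiz's "only 1/4 of that" follows from a sign argument: under pure sampling error each state's discrepancy is equally likely to favor either candidate, so the chance that k independent discrepancies all point the same way is 2 × (1/2)^k. Illustrated in Python:

```python
# Chance that k independent poll-vs-outcome discrepancies
# all favor the same candidate, if each direction is a coin flip.
def prob_all_same_direction(k):
    return 2 * 0.5 ** k

print(prob_all_same_direction(3))   # 0.25
print(prob_all_same_direction(10))  # 0.001953125
```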
Re: That still doesn't explain pro-Bush pattern
Moreover, Freeman's analysis was not three states, but showed the one-way bias in 10 of 11 states and no bias in the 11th. Waving this away as sampling error simply remains not credible.
by macedc 2004-12-01 01:27PM | 0 recs
Re: That still doesn't explain pro-Bush pattern
I don't get what's not credible about pointing this out, since all were within the margin of error. Of the Freeman states, the only possible exception was NH, which is borderline. Now, if the other 39 states plus DC exhibit a similar pattern, I'd be willing to stake more on this, but given these data have been only selectively reported I can't put much stock in it. As Mystery Pollster points out, these data would look a whole lot less interesting if put beside other states that exhibit no bias or a pro-Bush bias in their exit polls.

(And yes, having the precise p-stats would ground this discussion much better.  I disagree, however, that choosing the 99% threshold is "telling" of anything; confidence thresholds are by their nature arbitrary.  Especially under circumstances when the data might be used to buttress allegations of electoral misfeasance or malfeasance, I think it's better to err on the side of caution in order to build a credible case.)

Mystery Pollster also makes some nice points in his original critique of the Freeman paper about possible pro-Dem biases in the polls arising out of artefacts of response rate. I find that quite plausible and worthy of further investigation.

Maybe I don't get what the fuss is about because different people are using these data at cross purposes.  Is the difference meant to indicate something was wrong with the exit polls, or that something was wrong with the election itself on a mass scale?  On balance, if it's anything, the evidence is stronger for the former than the latter (again, that's not to say there weren't irregularities at the local level in some places).

by boffo 2004-12-01 01:50PM | 0 recs
Exit polls in Europe
I never wrong by more than 0.1% per candidate, even for national elections.
What's up with this country? Granted, France isn't as big as the US, but there are still more electors than in, let's say, Ohio. Why can't pollsters get it right to within 0.1% in exit polls in Ohio?
by FrenchSocialist 2004-11-30 08:59PM | 0 recs
Re: Exit polls in Europe
I meant "The exit polls never go wrong...", not "I neverwrong..." Excuse my typo
by FrenchSocialist 2004-11-30 09:00PM | 0 recs
Re: Exit polls in Europe
Given that we are talking about random error, which is inherent in any poll which samples from a population, three possibilities present themselves given what you say:

  1. European pollsters have been very, very lucky to get so close to the actual result;
  2. Someone is lying to the public about the polls, perhaps by weighting them to conform to the election results;
  3. Your memory is faulty.

Remember: Random error is a statistical property of sampling, which is not at all connected to the act of polling.  We could be picking marbles out of a bowl and still have the same margin of error.
by boffo 2004-12-01 03:53AM | 0 recs
Re: Exit polls in Europe
and/or

4. European exit polls sample more voters, thus reducing the sampling error.

Of course, getting it down to .1% would require a huge sample, so if 4 is an explanation, it probably goes with #3.

by Mathwiz 2004-12-01 11:54AM | 0 recs
Re: Exit polls in Europe
No kidding it would take a large sample size!  Using the parameters in the original post, to get a margin of error of .1% it would take a sample of more than 4 million respondents.  Would kind of defeat the purpose of an exit poll.
by boffo 2004-12-01 03:41PM | 0 recs
Re: Exit polls in Europe
No. To be certain (ie with 99% certainty) that the poll would fall within a .1% MOE, you might need 4 million respondents. But that's to be sure you would be that close in 99% of cases. You would need far fewer respondents to get within .1% 80% of the time. (Too lazy to figure it out right now.)
by taliesin 2004-12-01 07:56PM | 0 recs
Re: Exit polls in Europe
Yeah, it would take fewer respondents at 80% confidence (though no self-respecting pollster would release such figures).  But it's still a ridiculously huge number for an exit poll: more than one million respondents to get a margin of error of .1% at 80% confidence.  Fewer than the four million at 99%, but still a lot.
by boffo 2004-12-02 04:48AM | 0 recs
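Both sample-size figures in this exchange follow from inverting the margin-of-error formula in the original post, n = p(1−p)(z·deff)² / m². A sketch, reusing the 1.6 design-effect factor (z ≈ 2.6 at 99% confidence, z ≈ 1.28 at 80%):

```python
# Respondents needed for a given margin of error m,
# with z-score z and design-effect square root deff_sqrt.
def required_n(m, z, deff_sqrt=1.6, p=0.5):
    return p * (1 - p) * (z * deff_sqrt) ** 2 / m ** 2

print(f"99% confidence: {required_n(0.001, 2.6):,.0f}")   # 4,326,400
print(f"80% confidence: {required_n(0.001, 1.28):,.0f}")  # 1,048,576
```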
I sent Liz Doyle at Edison-Mitofsky a request
to clarify timing of release of their data (following Mystery Pollster's story). She is inundated with mail, but she wrote back:

"The data sets are being databased into ASCII and SPSS formats for all elections in 50 states plus the national survey by one person. It took over 1 year to prepare for the delivery of this information, with only one person working on this task the 3 months that is projected is timely. The data does not belong to Edison/Mitofsky, The National Election Pool of media are the deciding party in when the data will be sent to Roper."

NEP/the networks are where pressure needs to be placed to get moving any faster. In the end, I don't know what we are going to learn from the exercise. But it has to be done.

by DemFromCT 2004-12-01 04:27AM | 0 recs
