Polls! Polls! Polls!
Aug. 7th, 2008 07:08 pm
In 2004, some people started putting up websites on which they tried to aggregate state-by-state polls to produce estimates of how the battle for electoral votes was shaping up in the presidential election. Some of those have started doing it again. One of the most popular ones was electoral-vote.com; there was also RealClearPolitics and Election Projection and some others. The political sympathies of these people were across the spectrum (electoral-vote's Votemaster seems to be a centrist Democrat; RCP and Election Projection are conservative), and that sometimes affected which polls they trusted and consequently their results; sampling bias is a hard thing to deal with here. But they all actually did an OK job of tracking how things were shaping up.
In the end, things were so close that the situation was very confusing, but the last week or so of polls actually tracked Bush's narrow reelection win pretty well--it was only the infamous leaked early exit polls that were strangely off in Kerry's direction. (Some people took this as a sign that there had been some sort of massive nationwide vote-rigging, but I was convinced by the Mystery Pollster's various arguments that, whether or not that happened, the early exits weren't good evidence for it.)
Anyway, a really popular new one is FiveThirtyEight.com on the liberal Democratic side. The guy who runs it, Nate Silver, is apparently better known for his sabermetric baseball analyses. What he does is a more elaborate version of what a few sites attempted in 2004, basically a Monte Carlo simulation. He takes state polls, fuzzes them out with probability distributions calculated according to a fairly complicated model (weighting polls by how old they are and how reliable a track record the pollster has, in addition to the supposed margin of error from pure sample size), puts in some other contributions calculated from national polls, and then does thousands of simulation runs to get a distribution of electoral votes.
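To make the Monte Carlo idea concrete, here is a minimal sketch of that kind of simulation. It is not Silver's actual model: the four states, their margins and standard deviations, and the shared `national_sd` error term are all made-up assumptions for illustration, and the poll-weighting step is skipped entirely (each state just starts from an assumed mean and spread).

```python
import random

# Hypothetical toy inputs, not real polling data:
# state -> (electoral votes, mean margin for candidate A in points, std dev in points)
STATES = {
    "Alpha": (20, 3.0, 4.0),
    "Beta": (15, -2.0, 4.0),
    "Gamma": (10, 0.5, 5.0),
    "Delta": (5, -6.0, 3.0),
}


def simulate(states, runs=10000, national_sd=2.0, seed=0):
    """Return a list with candidate A's electoral-vote total for each run."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        # Assumed shared national error: moves every state together in a run.
        shift = rng.gauss(0.0, national_sd)
        ev = 0
        for votes, mean, sd in states.values():
            # Draw this state's margin; A carries the state if it is positive.
            if rng.gauss(mean, sd) + shift > 0:
                ev += votes
        totals.append(ev)
    return totals


totals = simulate(STATES)
needed = sum(votes for votes, _, _ in STATES.values()) // 2 + 1
win_prob = sum(t >= needed for t in totals) / len(totals)
print(f"A wins in {win_prob:.1%} of runs")
```

The list of totals is the "distribution of electoral votes"; any of the weird-scenario probabilities is just a count over those runs.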
He says he's actually trying to predict the November election, and as a result he puts a lot of extra slop into the probability distributions to reflect what he doesn't know. Which is fine, but reading his site I often get the impression that the evolution of his numbers depends more on the frequent changes he makes to his complex model than on anything in reality. I think I'd prefer a simpler model that remained more constant over time.
An interesting recent development was the return of Sam Wang at Princeton, who was one of the slightly-lesser-known people doing this in 2004. Wang made a recent post on his blog criticizing FiveThirtyEight.com, pointing out, among other things, that if what you want is a probability distribution calculated from a bunch of state Gaussians, you really don't need to do a Monte Carlo simulation; you can just calculate values of the aggregate distribution directly and get more accurate results with fewer cycles. It's a good point, though it's also true that Silver's approach gives you a lot of fun (if questionably accurate) additional details about the probability of dozens of different weird things happening. Wang's got his own model, for which he makes no claims of real prediction--he just calls it an "if the election were held today" aggregate--so it isn't far off FiveThirtyEight's in overall result but has a much tighter spread.
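The direct calculation can be sketched like this. It is one way to do it, not necessarily Wang's own code: each state's Gaussian is reduced to a win probability, and the exact electoral-vote distribution is built up one state at a time. The states and numbers are hypothetical, and the sketch assumes the states are independent of each other.

```python
from math import erf, sqrt

# Hypothetical toy inputs, not real polling data:
# (electoral votes, mean margin for candidate A in points, std dev in points)
STATES = [
    (20, 3.0, 4.0),
    (15, -2.0, 4.0),
    (10, 0.5, 5.0),
    (5, -6.0, 3.0),
]


def win_probability(mean, sd):
    """Probability that a Gaussian margin with this mean and sd is positive."""
    return 0.5 * (1.0 + erf(mean / (sd * sqrt(2.0))))


def ev_distribution(states):
    """Exact distribution of A's electoral votes, assuming independent states.

    dist[k] is the probability that A wins exactly k electoral votes.
    """
    dist = [1.0]  # before any state is counted, A has 0 votes for certain
    for votes, mean, sd in states:
        p = win_probability(mean, sd)
        new = [0.0] * (len(dist) + votes)
        for ev, prob in enumerate(dist):
            new[ev] += prob * (1.0 - p)   # A loses this state
            new[ev + votes] += prob * p   # A wins this state
        dist = new
    return dist


dist = ev_distribution(STATES)
needed = sum(votes for votes, _, _ in STATES) // 2 + 1
print(f"P(A wins) = {sum(dist[needed:]):.4f}")
```

There's no sampling noise here, and the work grows with the number of states times the number of electoral votes rather than with the number of simulation runs.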
Wang has reason to be skeptical of models with a lot of ad-hoc assumptions. Late in the 2004 campaign, he started to post pairs of electoral cartograms--one calculated just from recent polls, and another that incorporated his speculative assumption, based on observation of previous elections, that undecideds would break preferentially for Kerry. In the end, the tweaked map was wrong and the one that just aggregated polls was pretty much dead on. He learned from the mistake: here he reprints a pugnacious e-mail chalking it up to wishful thinking and his remarkably polite response.
no subject
Date: 2008-08-08 09:51 am (UTC)
I'm just used to everyone taking national poll figures and plugging that number into their tweaked-to-taste, brute-force-and-dumb-ignorance swingometer formula to get the predictions for where each individual state's electoral college votes (it would be Parliamentary Seats here) would go. That I've not been able to find anything like this amongst the armada of US election prediction websites just does my head in. No, I do not want to manually change individual states from red to blue or vice versa. No, I'm especially not interested in what you've decided are 'swing states'. And why does no one ever want to tell me the vote shares and counts in each individual state (or congressional district, for the state or states that do it that way) that everyone got last time, in a real simple table form, so I've got something to actually go on if you really insist on forcing me to guess manually?
C'mon, just let me type in some national percentages and have a doohickey calculate an appropriate lie to tell me. 'Cos I wants my, it-can't-possibly-work-but-for-mysterious-reasons-doesn't-do-much-worse-than-anything-more-complicated, Uniform Percentage Swing, and I wants it now! :)
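For what it's worth, the doohickey is only a few lines. A minimal sketch, assuming a two-party race and entirely made-up previous results (the state names, shares and national figures below are hypothetical placeholders, not real data):

```python
# Hypothetical previous results, not real data:
# state -> (electoral votes, party A share %, party B share %)
PREVIOUS = {
    "Alpha": (20, 52.0, 47.0),
    "Beta": (15, 48.5, 50.5),
    "Gamma": (10, 45.0, 54.0),
}
# Hypothetical previous national shares for (A, B), in percent.
PREVIOUS_NATIONAL = (49.0, 50.0)


def uniform_swing(previous, previous_national, new_national):
    """Apply the national change in each party's share to every state alike."""
    swing_a = new_national[0] - previous_national[0]
    swing_b = new_national[1] - previous_national[1]
    tally = {"A": 0, "B": 0}
    for votes, share_a, share_b in previous.values():
        # Exact ties go to B here, which is an arbitrary choice.
        winner = "A" if share_a + swing_a > share_b + swing_b else "B"
        tally[winner] += votes
    return tally


# Type in some national percentages: A on 52, B on 47.
projection = uniform_swing(PREVIOUS, PREVIOUS_NATIONAL, (52.0, 47.0))
print(projection)
```

With these toy numbers a three-point swing each way flips Beta but not Gamma.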
no subject
Date: 2008-08-08 12:04 pm (UTC)
One of the many components of Nate Silver's model is actually one of those--he's done some sort of regression analysis he uses to estimate how each state will probably track the national popular vote, and he uses that in conjunction with national tracking polls to help fill in the gaps when there's little polling somewhere. But he doesn't regard it as reliable enough to use it as more than a minor contributor to the model.
Part of it is that we have the Electoral College, and ever since the debacle of 2000 we've had it hammered into our heads that the popular vote doesn't mean jack, so there's this special fascination with swing states. Of course, 2000 going the way it did, even a state-by-state analysis wouldn't have been that predictive.
no subject
Date: 2008-08-08 12:11 pm (UTC)
Detailed scenarios...
Date: 2008-08-08 03:42 pm (UTC)
Do you really think the probabilities of all those crazy scenarios at FiveThirtyEight.com are interesting to do? They are easy to do using the Meta-Analysis - though not really worth doing until Election Eve. I guess they are interesting for bookmakers.
All the best,
Sam Wang
Princeton University
no subject
Date: 2008-08-08 07:10 pm (UTC)
Another web site, election-projection.net, uses a methodology similar to Prof. Wang's but includes a factor for systematic error in the poll results. It currently shows a 91% chance of Obama winning if the election were held today.
election-projection.net also has a unique feature: an Interactive Presidential Election Probability Calculator. This allows you to input your own probability estimates for each state, and from your estimates, it runs simulations to compute the probability of each candidate winning the election and an expected distribution of electoral votes.
The Interactive Presidential Election Probability Calculator can be found at
http://election-projection.net/interactive.html