My good friend David Goldstein of the Horse Manure blog ("straight poop" from the Horse's Ass, in his own words) dumps a pile of horse product on his readers with his posting about error rates in vote counting machines:
With a hand recount looming in our historically close gubernatorial election, there has been much debate over the relative accuracy of hand counts versus machine counts, and the error rate of vote counting technologies in general… most of it uninformed.
And now, thanks to David, we have even more uninformed debate.
He discusses a couple of research papers, which he apparently read, but didn't understand very well; e.g. "Using Recounts to Measure the Accuracy of Vote Tabulations: Evidence from New Hampshire Elections 1946-2002".
David then announces the following conclusions, none of which are correct in the context of the Washington gubernatorial vote:
1. The "Residual voting" rate (includes both blank and improperly marked ballots), which he calls "the primary statistical measure of the performance and accuracy of voting technologies" is 1 - 2%.
2. The error rate of machine counting ("tabulation error rate") is 0.56% for optical scanning machines.
3. He infers from (2) that
A .5 percent invalidation rate in a gubernatorial election with over 2.8 million votes cast amounts to 14,000 erroneous votes!
4. Finally, he claims that "Republicans scoff at Gregoire calling this election a tie, but statistically speaking, it is."
Again, all of David's statements 1 - 4 are incorrect.
I do agree with David that our current voting system is prone to inaccuracies, and that we're not going to emerge from the hand recount with confidence that we measured the will of the voters with ball-bearing precision. I hope after this whole mess we can actually work together for meaningful election reform. But the numbers he's throwing around for error rates and "erroneous ballots" are wildly off the mark, and we are not in a "statistical tie". Dino Rossi's TWO victories are exactly that. Victories.
Corrections and clarifications of David's erroneous claims:
1. The "Residual Rate" (blank and otherwise disqualified ballots) in Washington was far less than 1% this year. Furthermore, all indications are that the vast majority of blank ballots were really intended to be left blank. If you look at the Presidential race, you'll see that a total of 2,883,499 votes were cast and 2,859,084 votes were counted, so there were 24,415 residuals, or 0.85%. But the SoS page doesn't break out write-in votes, and they're included with the other residuals. I don't have ready access to write-in numbers from all counties, but I do have those numbers for King County. The SoS page imputes 4,704 residuals for King, but the "e-Canvas" reports 1,194 write-ins, so the real residual rate in King is only 0.39%. That's more or less equal to the Libertarian vote and about half the Nader vote. That doesn't seem to be an unreasonable number of people who would simply choose not to vote for any of the presidential candidates. Some of those residuals may be unintentionally spoiled ballots. But in the King County gubernatorial recount, the canvassing board managed to convert exactly 717 initial residual ballots into non-residuals, out of 898,238 ballots tallied in the first count. That is only 0.08% of ballots that were plausibly miscast, such that there is some reasonable claim that the voter filled out the ballot improperly, but well enough to leave marks from which discernible intent can be inferred.
2. The "Tabulation Error Rate" (the difference between the outcomes of the first count and the recount) in the governor's race was nowhere near 0.56%. It was 0.0040% when looking at the entire state, and even taking the weighted average of the absolute values of the counties' errors it is still only 0.0046%. [copy this table into Excel and do the math] This result is so far off the mark of the cited paper (7 standard deviations) that the paper's analysis doesn't seem to have any relevance to the systems and processes we use here in WA state.
3. There is absolutely no basis for screaming that there were "14,000 erroneous votes!" [David's exclamation mark]. First of all, the so-called tabulation error rate does not give the number of erroneous votes; it only gives the discrepancy between two counts. The true number of erroneously counted votes would, on average, be half of the discrepancy. Second, the number is based on a presumed tabulation error rate (0.5%) that is 125 times larger than what we actually experienced (0.0040%). Third, much of the actual discrepancy between the two counts was explained by the discovery of hundreds of new ballots around the state, and not by discrepancies between different methods of reading a controlled sample of ballots.
4. The election is not a tie, statistical or otherwise. Governor-elect Rossi won the first count by 261 votes and he won the second count by 42 votes. It is close, but it is not a tie. David compared Rossi's two victories to "flipping a coin and having it land on heads two times in a row". Wrong. If we make the reasonable approximating assumption that the percentage of votes given to Rossi in a count is a normal random variable, we can use statistics to calculate the odds that Rossi truly won more than 50%. His share in the first count was 50.004722%. His share in the second count was 50.000729%. Let the null hypothesis be that Rossi's true share was < 50%. Use the t-distribution (Excel TDIST() function). Calculate the sample mean and standard error and you get a t-statistic of about 1.36. The one-tailed t-distribution with 1 degree of freedom gives the answer that we can reject the null hypothesis at the 20% level. In other words, the probability is 80% to 20% that Rossi beat Gregoire. That is much better than a tie. Those odds wouldn't be quite good enough for me to trust, say, elective surgery. But if I have to choose between two candidates in a close race, I'll go with the 80% winner over the 20% loser any day.
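For readers who would rather not open Excel, the arithmetic in items 1, 3 and 4 can be checked with a short Python sketch. It uses only the figures quoted above; the function and variable names are illustrative, not from any source. It also leans on one standard fact: a t-distribution with 1 degree of freedom is a Cauchy distribution, so its upper tail has a closed form and no statistics library is needed. This checks the numbers under the post's own normality assumption; it does not test whether that assumption holds.

```python
import math

def pct(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Item 1: residual rates, from the vote totals quoted in the post.
statewide_residual = pct(2_883_499 - 2_859_084, 2_883_499)
king_converted = pct(717, 898_238)

# Item 3: the "14,000 erroneous votes" figure, and how much larger the
# presumed 0.5% rate is than the observed 0.0040% rate.
claimed_errors = 0.005 * 2_800_000
rate_ratio = 0.5 / 0.0040

def t_statistic(samples, null_value):
    """One-sample t-statistic of the sample mean against null_value."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)  # sample variance
    std_err = math.sqrt(variance / n)
    return (mean - null_value) / std_err

def upper_tail_1df(t):
    """P(T > t) for a t-distribution with 1 degree of freedom (Cauchy)."""
    return 0.5 - math.atan(t) / math.pi

# Item 4: Rossi's share (in percent) in the first count and the recount.
t_stat = t_statistic([50.004722, 50.000729], null_value=50.0)
p_value = upper_tail_1df(t_stat)

print(f"statewide residual rate: {statewide_residual:.2f}%")
print(f"King County converted residuals: {king_converted:.2f}%")
print(f"claimed erroneous votes: {claimed_errors:,.0f}")
print(f"presumed / observed error rate: {rate_ratio:.0f}x")
print(f"t-statistic: {t_stat:.2f}, one-tailed p-value: {p_value:.2f}")
```

The last line gives the t-statistic of about 1.36 and the roughly 20% one-tailed level quoted in item 4.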
Legally, the third count decides the race. But what will we learn statistically from the hand recount? Certainly if Rossi wins, then he's the undisputed winner. But what if Gregoire wins? Would she be the statistically legitimate winner, i.e. can we believe with confidence that she really won the majority of the votes? It depends on how large her margin is. All indications are that the hand recount will be far less accurate than the earlier counts, with opportunities for introducing new kinds of human errors. But let's be charitable and assume that these errors cancel out and add the 3rd count into the sample with the first two. The t-distribution tells us that it's still considered a Rossi victory unless Gregoire wins the 3rd count with more than 50.005450% of the vote, or by about 300 votes. Even if she wins by, say, 250 votes, she might be the winner in the legal sense (assuming there wasn't fraud). But the statistics will still favor Rossi (although the larger Gregoire's lead, the lower the confidence in Rossi's victory). Only if Gregoire wins by more than 300 legitimate votes should she be considered the statistical winner of the first three counts. If she does win the 3rd count by fewer than 300 votes, the statistics will still tell us to assume that Rossi won most of the votes. That scenario would be problematic. In our system, where leaders have to win the consent of the governed, political legitimacy goes to those who win the majority of the votes. Too slender a lead for Gregoire in the 3rd count would award political legitimacy to Rossi but a legalistic victory to Gregoire. That's not a recipe for a smooth transfer of power. It's a recipe for a widespread perception of a stolen election and a Ukraine-style crisis. Let's hope we don't see that. But if we do, we can only blame Gregoire and the Democrats for their arrogance in trying to overturn as certain an electoral victory as we'll see in this race.
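The break-even share quoted above can be reproduced with simple arithmetic: it is the third-count share at which the average of Rossi's three shares drops to exactly 50%, which is where the t-statistic changes sign. Here is a minimal sketch in Python. The 2.8 million vote total is an assumption, borrowed from the round figure quoted earlier in the post, so the vote margin it yields is approximate.

```python
def break_even_third_share(first, second, target_mean=50.0):
    """Rossi share (percent) in a third count that makes the mean of
    all three counts equal to target_mean."""
    return 3 * target_mean - first - second

rossi_third = break_even_third_share(50.004722, 50.000729)
gregoire_third = 100.0 - rossi_third

# Assumption: about 2.8 million votes split between the two candidates.
TOTAL_VOTES = 2_800_000

# A share of s percent corresponds to a lead of (2s - 100) percent of the votes.
gregoire_margin = (2 * gregoire_third - 100.0) / 100.0 * TOTAL_VOTES

print(f"break-even Gregoire share: {gregoire_third:.6f}%")
print(f"approximate vote margin: {gregoire_margin:.0f}")
```

The share agrees with the figure above to within rounding, and the margin comes out at roughly 300 votes; the exact value moves with whatever vote total is assumed.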
And in any case, join me in reading David Goldstein's blog if you enjoy his occasionally entertaining posts. Just don't go there expecting to learn anything having to do with math.
UPDATE: David Goldstein responds to my critique here, lamenting that I've treated him "dismissively". "Dismissive" would indeed be a fair characterization of my rebuttal to his thoroughly flawed and easily dismissible analysis. He also frets that I've accused him of stupidity. That is unfair to both of us. I do not think he is stupid, but my good friend only undermines his own credibility by celebrating his ignorance and laziness:
I could really give a shit about all his t-distributions and null hypotheses and probability whojamacallits, because even if I could do the math (and I can’t), and even if I trusted him to present honest calculations (and I most certainly don’t)… I’m an experienced enough computer programmer to understand one basic axiom: garbage in, garbage out.
Statistics is the science of drawing inferences from the kind of messy data about real-world phenomena that David can only call "garbage". Let's have a real discussion of the statistical issues here. Any readers who have some knowledge of statistics, please let me know if I've made any mistakes or if you can improve upon my analysis. I also encourage even the most basic of questions from those who aren't as familiar with the concepts we've mentioned, but want to understand them better. Those who prefer to scream emotional arguments misusing fancy-sounding phrases like "statistical tie", will surely find their rallying point over at the Horse's Math blog.
Posted by Stefan Sharkansky at December 04, 2004 12:01 AM
I did indeed find the same results.
HOWEVER, it is unreliable to look at a t-distribution with only two data points. Had this been a 6th or 7th recount, your numbers would be more plausible. But... for now, we'd have to put a confidence interval on your confidence interval.
Still, great analysis.
Posted by: bmvaughn on December 5, 2004 01:35 AM
"But according to CalTech/MIT, the results of a recount are more accurate than the original, and thus it seems likely that the results of the second recount will be more accurate than the results of the first."
Ahhh what they discussed was a recount being more accurate than the original general election. However, assuming no further introduction of votes, the CT/MIT piece would lead one to believe that another recount would produce quite near the SAME result as the 1st recount.
I've been winning all kinds of cash on this election cycle (thank you George W, Dino Rossi... well, McKenna, you screwed me.. didn't think you'd win).
Anyway.. I'll take Dino to win for $20.. any takers? Email me. No spread.
Posted by: bmvaughn on December 5, 2004 01:41 AM
That said, it has long been the practice in the data processing industry to manually examine (I believe the industry's technical term is "looking at") punch cards, when tabulating discrepancies occur. The purpose of this exercise is to improve accuracy.
As I explained in the post above, I agree that we won’t know with absolute certainty who truly won the election. But we don’t need absolute certainty here, we only need a reasonable level of confidence that one side or the other won. I argue that the two counts that Rossi won give us enough information to declare him a winner with 80% confidence (vs. 20% for Gregoire). To claim, as a quote in the Seattle Times article does, that we might as well flip a coin, implies that the two counts that Rossi won gave us absolutely no information whatsoever. That doesn’t strike me as a particularly thoughtful or well-reasoned claim.
Posted by: Stefan Sharkansky on December 5, 2004 11:25 AM
David, you've proven my point once again by making an emotional statement of belief dressed up with a technical sounding term ("margin of error") that you've told us you have no interest in understanding or defining what it actually means.
Posted by: Stefan Sharkansky on December 5, 2004 01:30 PM
Better to look at this as a binomial distribution
with p=.5. The variance of a distribution with
2800000 trials comes to npq (.5 * .5 * 2800000)
or 700000 (we did not get a second INDEPENDENT
2800000 trials with a recount). That puts the
standard deviation at a bit less than 2650 votes.
In other words, a margin of 5300 votes only comes
to one standard deviation (about 84 percent).
The initial recount gave us a probability of
about 54 percent that Rossi "won" (that the
percentage of people who voted for him exceeded
fifty percent), and the recount an even tighter
50.6 percent.
Fortunately, elections like this don't happen too
often (and elections do not have to break
"statistical ties" - which this election clearly
falls under - only actual ones). I would advise
the democrats to call the election "within the
margin of counting error" and not "a tie" (which
sounds ridiculous).
If I understand the binomial distribution as you're trying to use it (and please enlighten me further if I'm wrong), it would apply to the problem of determining whether a coin was biased towards heads or tails by flipping it a large number of times. i.e. if you flip the coin 2.9 million times and it comes up heads on 261 more trials than it comes up tails, then how much confidence do you have that the coin is biased towards heads?
But I think this requires a different model. It's as if the "coin" has been flipped 2.9 million times, but we're not testing the coin for bias. We're trying to count the number of times that heads came up. The random variable is not the number of heads (that's a fixed quantity) and the probability p of recording an individual toss correctly is much closer to 1 than to .5. The random variable is the answer we get when we count the number of heads. Each count is an independent event. That seems to be a somewhat different problem, no?
Anon, please feel free to comment again and/or email me. Any other statistical experts are encouraged to weigh in.
Posted by: Stefan Sharkansky on December 5, 2004 03:27 PM
Clearly, my analysis accepts the fact that the first two counts gave us information... enough information to suggest that, given the margin of error, the race is too close to call to any meaningful degree of confidence.
David, you've proven my point once again by making an emotional statement of belief dressed up with a technical sounding term ("margin of error") that you've told us you have no interest in understanding or defining what it actually means.
An "emotional statement"...? Yeah, that's right Stefan, those are the ravings of a loose cannon... so don't get me mad, or I might come over to your place and stomp your camera!
Stefan, it doesn't take a statistician to refute your analysis -- it is all based on a faulty assumption:
If we make the reasonable approximating assumption that the percentage of votes given to Rossi in a count is a normal random variable, we can use statistics to calculate the odds that Rossi truly won more than 50%.
If you had actually read the CalTech/MIT studies, like I "apparently" did, you'd understand that recounts are more accurate:
Tabulations may change from the initial count to the recount for a variety of reasons: ballots may be mishandled; machines may have difficulty reading markings; people and machines may make tabulation errors. Because recounts are used to certify the vote, greater effort is taken to arrive at the most accurate accounting of the ballots cast. The initial count of ballots, then is treated as a preliminary count, and the recount as the official.
Thus "the percentage of votes given to Rossi in a count" is NOT a "normal random variable." So all the statistical analysis that follows your faulty assumption should be, well... dismissed.
You call for a "real discussion of the statistical issues" because you refuse to address the only substantive point of disagreement between us: are recounts legitimate? You cling to the notion that they are not, because your candidate "won" the first count. But the studies I cite clearly state that recounts are more accurate, so forgive me if I choose to rely on CalTech/MIT's conclusions over yours.
Posted by: David Goldstein on December 6, 2004 12:53 AM
Those who prefer to scream emotional arguments misusing fancy-sounding phrases like "statistical tie", will surely find their rallying point over at the Horse's Math blog.
Right... and this from "haruspex" spouting Stefan, who name-drops Excel functions like he has a rare form of mathematical Tourettes. (Insert rolling-my-eyes emoticon here.)
Posted by: David Goldstein on December 6, 2004 01:22 AM
Serves me right for trying to post from a
strange computer (and glad that nobody noticed
that such a basic math error - obviously the
square root of a number less than a million can
not exceed a thousand).
OK, now that I have my tools again:
Original count:
variance - 685641.75
standard deviation - 828.03
Rossi lead - .3152 standard deviations
probability - 62.4%
Recount:
variance - 686231.5
standard deviation - 828.39
Rossi lead - .0507 standard deviations
probability - 52.0%
Stefan,
Using any distribution requires a lot of (mostly
unrealistic in the real world) assumptions. I
agree that I have modelled a slightly different
problem (in the population does one candidate
actually have the support of more than half of
the population?) - and that comes with a huge
assumption (that this 2.8 million votes actually
represents an accurate "random sample" of the
population).
This proposition comes closer to what the
Democrats claim ("the will of the people"), and
actually this seems more relevant regarding the
proposition that the vote represents a tie (in
statistical terms) - it takes into account the
nature of getting the intent translated to votes.
I feel a little uncomfortable about making any
assumptions about the randomness of a count -
you should get the same results if you recount
(especially after the first one). A recount does
not represent another independent (even in the
worst case it will come close to the first count)
observation of a random normally distributed
variable, so applying a t-test seems wrong to me.
I would call my analysis incorrect (even with the
right numbers) because of the assumptions, but
yours even a little "more" incorrect (because of
the assumptions), but, hey, it sounds good. :)
I guarantee you that the majority of the people
who refer to the election as a "tie" (even in
statistical terms) have little idea of the
underlying theory to support that statement.
They understood that if the body politic agreed to certain overall rules, a society could govern itself by selecting rulers by ballot. One of the most important of those rules was that the winner was the one who had more votes when the count was complete, and that the process was accepted by the loser and ended there.
Attempting to prolong the process by endless second-guessing, advocacy and attempts to recount under new rules is just the finest way to sour the body politic on the idea of having elections at all. We might just as well commence the lawsuits six months before any election and make a real media circus out of the discredited process, concluded by a duel to the death between the candidates.
Posted by: Hank Bradley on December 6, 2004 07:37 AM
In today's election, many voters are able to vote from the comfort of their homes. If a mistake is made, intentionally or unintentionally, it is difficult, although not impossible, to receive a replacement ballot. What is a voter to do if she voted for Gregoire and then changed her mind at the last minute and decided that Rossi was her man? Should her ballot be invalidated because she changed her mind, even though she made it clear that her mind changed on her ballot? Or should she be forced to take the time to get a replacement ballot?
What about the man who rested his pen on the oval for Rossi for a second and then changed his mind and filled in Gregoire's oval? He (and the machine) saw a tiny ink spot on the ballot. Should this ballot be invalidated? Is the voter's intent known?
Machines are not able to determine intent, and even though human intervention is subject to flaws, only a human can determine intent (if at all) of under and over votes. Surely there should be some common sense to this process.
Posted by: Chris on December 6, 2004 10:56 AM
1) Probability theory is difficult or impossible to apply to the three vote counts because one isn't simply taking the same three piles and counting them three times. The piles changed in the second and third counts.
2) Should the piles have changed? That's the crux of the debate, it seems to me. Chris's last post gets it best; it does come down to intent. Is it OK to determine the intent of voters whose ballots weren't counted (which changed the piles)?
First, the law requires it. But how is the law applied? There is some percentage of the over and under votes and rejected ballots where the intent of the voter is really clear - for example, only one bubble is filled in (in each race), but in the governor's race it's only a quarter filled in. I don't know how many of the rejected ballots are like that, but regardless, for the balance of these ballots, the voters' intent is not really clear, as in the examples given by Chris, the example shown on K5 news, and scads of other cases that anyone can try to think of. (Even if the voter circled the names of his or her choices - maybe that voter hates the system and all the candidates, and was saying "**** you" as he made his circles! Nobody KNOWS for sure.) For these ballots the intent is not provable, and I believe they SHOULD NOT BE COUNTED. Why? Because partisans will want the votes to count if any argument, no matter how strained, can be made that the voter intended to vote for their candidate. That's not counting - it's waving your hands and stomping your feet and hoping the result comes out your way.
And then there are the ballots where each vote on the ballot is clearly marked, but it is not objectively clear that the ballot is legitimate: rejected provisionals and absentees. Many of these can be verified, and if so, the law and common sense say they should be counted. Those that aren't strictly verified SHOULD NOT BE COUNTED. And the verification process has to be auditable, which means recorded with free access to the records, and each political party should be able to perform their audit before the results are finalized. From what I have read thus far, the Democratic party has obstructed any audit of the ballot verification, SO I DON'T TRUST THE RESULT.
So I don't know what good all the statistical analysis does. While I completely agree that this election is not tied, and that the 42 vote difference is surely within the "margin of error" for counting the same three piles of votes three times, the political/philosophical issue isn't about that. It's about using the law to get your way, regardless of the rationale behind the law and the good of the society governed by that law.
Posted by: srogers on December 6, 2004 01:26 PM
http://pullonsupermanscape.typepad.com/pull_on_supermans_cape/2004/12/more_good_math_.html
in which I'm supportive of Shark and hopefully cast some of this discussion in terms that satisfy those with limited faith in probability - or whether that makes any sense in this context.
I also run through a series of math related posts that I made during and after the recount - in which no one has ventured any refutation - including a pretty intensive discussion with a newspaper editor.
I find Shark's post very refreshing and look forward to more 'math in public' in the future.
Posted by: MC on December 6, 2004 07:55 PM
I basically doubled the probability above fifty
percent. The correct numbers (again this attempts
to come up with the probability that the entire
population favored Rossi given the vote totals
that the election yielded).
The (again) corrected numbers
Original count:
Rossi lead - .1576 standard deviations
probability - 56.3%
Recount:
Rossi lead - .0254 standard deviations
probability - 51.0%
I agree that statistical analyses don't accomplish
much in the real world (but I still enjoy them). :)
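The figures in this comment and in the earlier "Original count / Recount" listing can be reproduced with the normal approximation to the binomial. A rough Python sketch follows, with three stated assumptions. The vote totals are back-calculated from the posted variances (n = 4 × variance, since npq = n/4 when p = q = 0.5). The leads of 261 and 42 votes are the margins mentioned elsewhere on this page. And halving the lead is only one reading of the correction: a candidate who leads by L votes sits L/2 votes above the halfway mark.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lead_probability(total_votes, lead, halve_lead):
    """Normal approximation to a binomial with p = 0.5.

    Returns (variance, standard deviation, lead in standard deviations,
    probability). halve_lead=False mirrors the earlier listing;
    halve_lead=True mirrors the corrected figures.
    """
    variance = total_votes * 0.5 * 0.5          # npq
    sd = math.sqrt(variance)
    z = (lead / 2.0 if halve_lead else lead) / sd
    return variance, sd, z, normal_cdf(z)

# Assumed totals: 4 x the variances posted in the thread.
counts = {
    "Original count": (2_742_567, 261),
    "Recount": (2_744_926, 42),
}

results = {}
for name, (total, lead) in counts.items():
    for halve in (False, True):
        results[(name, halve)] = lead_probability(total, lead, halve)
        variance, sd, z, prob = results[(name, halve)]
        label = "corrected" if halve else "as first posted"
        print(f"{name} ({label}): sd {sd:.2f}, "
              f"lead {z:.4f} sd, probability {prob:.1%}")
```

Under those assumptions the sketch reproduces both sets of figures, which suggests the correction amounts to measuring the lead from the 50% mark rather than from the other candidate's total.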
Elaborating, the normal distribution approximates the binomial very well for large samples, so your model and the binomial model of Anon are essentially the same; he/she simply uses better assumptions about the variance.
Finally, both models are fundamentally flawed, because statistical inference is designed to make predictions about some population based on a sample of that population. But your sample already includes the entire population of individuals who voted in the Washington election. Any discussion of a statistical tie would have to be based on the error rate for reading the ballots, and you apparently do not have enough information to make any sound statistical assertions in that regard.
Best wishes and hope your man comes out on top.
Posted by: beimami on December 8, 2004 01:21 AM