I’m reading Avi Rubin’s Brave New Ballot. I’ll have more to say about it when I’ve finished reading it (it’s excellent so far), but I wanted to comment on a passage that caught my eye. On pp. 185-189 Rubin discusses a proposal by Brit Williams to secure voting machines by comparing a hash of each voting machine’s software to a pre-computed hash in a centralized repository:
In the library that Williams envisioned, a cryptographic hash, also called a fingerprint, would be computed on the binary after the software is compiled. The hash would be stored in a secure location, and whenever a machine is rolled out, its software would be rehashed and the hash compared to a stored value, just as fingerprints might be compared. If they match, the software is authentic. If they don’t match, officials are alerted to a problem and can deal with it through predetermined procedures.
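To make the proposal concrete, here’s a minimal sketch of the kind of check Williams seems to have in mind. The file path and reference hash are placeholders of my own, and I’m assuming SHA-256, though any cryptographic hash would do:

```python
import hashlib

# Placeholder: a real system would pull this reference value
# from Williams' proposed central library of hashes.
EXPECTED_HASH = "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

def hash_binary(path):
    """Compute the SHA-256 fingerprint of a compiled binary, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__":
    # Placeholder path for the voting machine's software.
    actual = hash_binary("/opt/voting/vote_counter.bin")
    if actual == EXPECTED_HASH:
        print("Software is authentic.")
    else:
        print("Hash mismatch -- alert officials.")
```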
Rubin identifies several problems with this idea. He notes that software binaries can change frequently for legitimate reasons, so the library would have to be constantly updated with additional hash values. In addition, he notes that this does nothing to counter threats from insiders; if somebody is in a position to introduce malicious code into machines, he might also be able to introduce a malicious hash value into the library.
But Rubin didn’t mention the problem that immediately came to mind for me when I read Williams’s suggestion: what if the machine lies about its hash value? Although the description is ambiguous, it sounds like Williams is imagining that the hashes would be checked by election judges on or near the day of the election. The only practical way to do this, as far as I can see, is to have the machine itself compute the hash and display it on the screen. But that’s of no use at all, because a smart hacker will simply determine what the correct value is and replace the real hash-calculating program with one that always returns the expected value. Even if the hash-checking program is stored on removable media, there are still plenty of ways a compromised machine could detect its presence and trick it into producing the expected value. For example, the system calls that display information to the screen could be patched so that any time a program tries to print the text “Hash Value: ” (or whatever the string is in the real program), the operating system displays the expected hash value instead of the one passed to the function.
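Here’s a toy illustration of that last attack. Real malware would patch the operating system’s display routines; for simplicity I’m monkey-patching Python’s print instead, but the principle is the same: the honest verification code runs completely unmodified and still reports the attacker’s chosen value:

```python
import builtins

# The hash the attacker has looked up in advance (placeholder value).
EXPECTED_HASH = "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

_real_print = builtins.print

def patched_print(*args, **kwargs):
    """Stand-in for a patched display routine: whenever a program tries to
    print a line beginning with 'Hash Value: ', substitute the expected
    hash for whatever value was actually computed."""
    args = tuple(
        "Hash Value: " + EXPECTED_HASH
        if isinstance(a, str) and a.startswith("Hash Value: ")
        else a
        for a in args
    )
    _real_print(*args, **kwargs)

builtins.print = patched_print

# The honest verification code now lies without knowing it: the real
# (mismatching) hash never reaches the screen.
print("Hash Value: " + "deadbeef" * 8)
```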
The fundamental problem here is that human beings can’t see the actual 1s and 0s inside a voting machine, so at some point the only way to find out its contents is to ask the machine itself. And since software can be tampered with, trusting its output isn’t a good idea. Hence, we shouldn’t trust self-reported hash values, for the same reason we shouldn’t trust self-reported vote totals.
Update: Jim Lippard suggests that “trusted” hardware could solve this problem. Although I think he’s probably right in theory, I have the impression that such technology is still at the proof-of-concept stage right now. I wouldn’t want us using our polling places as a testing ground for ironing out the kinks.