Hiding Bad Code Is Easy, Finding It Is Hard

October 26, 2006

One of the important points made in Jon Stokes’s write-up of e-voting is how much easier it is to hide malicious code in a program than it is to find it. Avi Rubin made the same point quite well in Brave New Ballot, where he describes a computer security course he taught in 2004:

I broke the class up into several small groups, and we divided the semester into thirds. In the first third, each group built an electronic voting machine that it demonstrated to the rest of the class. These machines were basically simple programs that allowed a user to make choices among several candidates in different races and that were required to keep an electronic audit log and produce the final tallies when the election was over. The groups then devoted the second third of the term to planting a back door in their voting machines–a mechanism by which a voter could cheat and change the vote totals and the audit logs so that the change would be undetectable. Each team had to turn in two versions of its system, one that worked properly and one that “cheated,” with all the code for both.

The groups spent the last third of the semester analyzing the machines and code from the other groups, looking for malicious code. The goal of the project was to determine whether people could hide code in a voting machine such that others of comparable skill could not find it, even with complete access to the whole development environment. Each group was assigned three machines from other groups–one good one, one bad one, and one chosen at random, but none of them identified as such. That was for the students to figure out by analyzing the code and running the machines. Admittedly, this setting was not much like that of a real manufacturer, in which there would be years to develop and hide malicious code in a code base that would be orders of magnitude larger and more complex than in our little mock-ups. Furthermore, the students had all just spent more than a month developing and hiding their own malicious code, so they had a good idea of what other groups might try. Conversely, in practice, auditors would have considerably more time to analyze and test potential code for problems. Still, I expected the results to be revealing, and I was not disappointed.

Many of the groups succeeded in building machines in which the hidden code was not detected. In addition, some of the groups succeeded in detecting malicious code, and did so in a way that in and of itself was enlightening. In one case, the students discovered the cheating almost by accident, because the compiler used by the programmer was incompatible with the one used by the analyzing team. The experiment demonstrated, as we suspected it would, that hiding code is much easier than finding hidden code.

I think this is a big part of the reason that computer security experts tend to be so skeptical of claims that independent testing has “proven” a company’s voting machine code to be secure. Even if the “independent” firm were genuinely independent (which it usually isn’t), and even if it did a truly exhaustive security audit (which, judging from the Rubin and Felten reports, it usually doesn’t), it would still be unlikely to detect malicious code that was inserted and camouflaged by a relatively talented programmer.
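
To get a feel for what Rubin’s students were up against, here is a minimal sketch, in Python, of the kind of trigger-based back door the assignment called for. It is entirely hypothetical, not code from the course, and far simpler than what a determined attacker would write: the tally routine behaves correctly until a specific write-in name appears, at which point it silently shifts votes and rewrites the audit log so the per-ballot records still match the altered totals.

```python
# Hypothetical sketch of a back-doored vote tally. Not from Rubin's course;
# just an illustration of the trigger-based cheat described above.

CANDIDATES = ["Alice", "Bob"]
TRIGGER = "Zebulon Q. Writein"   # magic write-in name that activates the cheat

def tally(ballots):
    counts = {c: 0 for c in CANDIDATES}
    audit_log = []               # one entry per ballot, as the spec requires
    cheat = False
    for ballot in ballots:
        if ballot == TRIGGER:
            cheat = True         # the trigger ballot is swallowed, never logged
            continue
        if ballot in counts:
            counts[ballot] += 1
            audit_log.append(ballot)
    if cheat:
        # Shift 10% of Alice's votes to Bob, then rewrite the audit log so
        # its per-ballot records agree exactly with the altered totals.
        moved = counts["Alice"] // 10
        counts["Alice"] -= moved
        counts["Bob"] += moved
        audit_log = ["Alice"] * counts["Alice"] + ["Bob"] * counts["Bob"]
    return counts, audit_log

if __name__ == "__main__":
    honest = ["Alice"] * 60 + ["Bob"] * 40
    print(tally(honest))                 # normal run: 60/40, log matches totals
    print(tally(honest + [TRIGGER]))     # triggered run: 54/46, log still "matches"
```

An auditor who checks that the totals and the audit log agree would see nothing amiss; only someone who thought to look for the trigger, or who stumbled onto it the way the students in Rubin’s anecdote did, would catch the cheat.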
