Our next presentation is UnclearBallot, automated ballot image manipulation, with Kart Kandula. Is that correct? Oh, sorry, Kart Kandula, graduate student at the University of Michigan. Kart Kandula received his BSE degree in computer science engineering from the University of Michigan in 2019 and is currently pursuing an MSc in the same area. He conducts research in the UM security lab under the supervision of Professor J. Alex Halderman. Currently, his research interest lies in problems affecting society and public policy, specifically election security. He has held internships at Microsoft and JP Morgan in the past. And also Jeremy Wink, undergraduate student at the University of Michigan. Jeremy Wink is an undergraduate student at the University of Michigan currently pursuing a BSE in computer science. He has taken multiple security courses and spent time researching topics surrounding election cybersecurity, also under J. Alex Halderman. Thank you so much. So Jeremy's going to get us started.

Alright, hey everyone, thanks for coming. I'm Jeremy Wink. And my name's Kart Kandula. We're students from the University of Michigan, and we're presenting UnclearBallot, an attack based on automated ballot image manipulation which we developed jointly with Matthew Bernhard and Professor J. Alex Halderman of the University of Michigan. To first give some background on our attack: post-election audits are one of the most important factors in guaranteeing the integrity of election results. If an attacker is somehow able to breach an election system in order to manipulate the outcome of an election, post-election audits should serve as a strong layer of defense, allowing us to detect these breaches. There are several types of post-election audits, including risk-limiting audits, in which a small random sample of paper ballots is inspected in order to give us statistically significant results. Another type of post-election audit is an image audit.
In this case, an audit is conducted not on physical paper ballots but instead on images of the ballots which are taken when the ballots are originally scanned. This tends to be quicker and a lot more convenient than actually looking through the paper ballots, so this type of audit has been gaining a lot of popularity in recent years and was actually used by the state of Maryland following the 2016 presidential election. This is pretty scary because it turns out this type of audit actually does not hold up under adversarial conditions. If an attacker is able to breach an election system in order to manipulate the results of an election, they could just as easily get access to these digital images of the ballots. If they were then able to manipulate these ballot images in order to change the apparent votes within them, then image audits would offer zero security whatsoever and would effectively be useless. To demonstrate this vulnerability, we've developed an attack which focuses on automatically altering ballot images to change the votes within them. This attack targets tabulation machines which have been shown in the past to not be the most secure devices, and in fact numerous vulnerabilities have been documented in the past demonstrating how an attacker could potentially infiltrate these voting machines. The strategy of our attack is simple. Alter the votes within a ballot image in a way that is both undetectable and also consistent with the voters' original markings. In order to do this, we leveraged a variety of computer vision techniques operating under the assumption that the attacker knows what the ballot looks like in advance. This is a pretty reasonable assumption to make considering many jurisdictions publish sample ballots well ahead of the actual elections. The first step in our attack is to extract bounding boxes around the individual races within the ballot. This can be done by using a template match with a blank copy of the original ballot. 
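The template-matching step just described can be sketched with a deliberately simple sum-of-absolute-differences search. This is an illustrative stand-in, not the authors' C++ implementation, and the toy "ballot" array is invented for the example:

```python
import numpy as np

def template_match(image, template):
    """Return the (row, col) offset where `template` best matches `image`,
    scoring each position by sum of absolute pixel differences."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = None, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = np.abs(image[r:r+th, c:c+tw] - template).sum()
            if best is None or score < best:
                best, best_pos = score, (r, c)
    return best_pos

# Toy "ballot": a white page (1.0) with a dark race box at a known position.
page = np.ones((20, 20))
page[5:9, 7:12] = 0.0          # the race region on the scanned ballot
blank_race = np.zeros((4, 5))  # template cut from the published sample ballot
print(template_match(page, blank_race))  # -> (5, 7)
```

A real attack would use an optimized matcher (e.g. normalized cross-correlation), but the idea is the same: the blank sample ballot tells the attacker where each race sits on the scanned image.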
The next step in the attack is to extract bounding boxes around the individual names and vote bubbles of each candidate. This is typically done with a Hough line transform, which you can see pictured here in the slides. For some ballot styles, the title of the race needs to be separated from the top candidate's name in order to prevent it from interfering with the rest of the attack. This can be done by running a vertical sweep on the image and using pixel intensities to find an area of white space above the candidate's name. It's also worth noting that the exact algorithm used in this attack varies slightly from ballot style to ballot style, but these same techniques are used pretty consistently across the board. The final step in the attack is to extract bounding boxes around the actual vote bubbles corresponding to each candidate. This is done using a series of linear sweeps across the image, once again using pixel intensities to determine the boundaries of the vote bubbles. Once we've extracted bounding boxes around the vote bubbles of every candidate, all we need to do to alter the votes within the image is simply swap the pixels between bounding boxes. So here are a couple of examples of votes that we've swapped using this attack. We've also recorded a video demo to better illustrate our attack in action. Before I show this, I just want to clarify that this demo was recorded solely for visualization purposes: the actual attack happens much faster and has no visual component whatsoever. So here we have our target ballot. We first use a template match to extract the individual races within the ballot. Once we've done this, we use a Hough line transform to separate the individual candidates. We then crop out the title of the race from the top candidate's window. Once we've done this, we can extract the bounding boxes around each candidate's vote bubble using a series of linear sweeps.
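The vertical sweep described above, which uses row pixel intensities to find the white gap below a race title, might look roughly like this; the region array is invented for illustration:

```python
import numpy as np

def first_gap_row(region, white=1.0):
    """Scan rows top to bottom. After the first run of dark (printed) rows,
    return the index of the first fully white row -- the gap below the race
    title where the region can be cropped."""
    seen_dark = False
    for r in range(region.shape[0]):
        row_is_white = np.all(region[r] == white)
        if not row_is_white:
            seen_dark = True
        elif seen_dark:
            return r
    return None  # no gap found; a real attacker might skip this ballot

# Rows 0-1: race title (dark), row 2: white gap, rows 3-4: candidate name.
region = np.ones((6, 8))
region[0:2, 1:7] = 0.0
region[3:5, 1:7] = 0.0
print(first_gap_row(region))  # -> 2
```

The same intensity-sweep idea, run horizontally and vertically, is what locates the vote-bubble boundaries in the final step.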
Once we've done this, we simply swap the pixels inside of these boxes, and we've altered the votes. So I'll now be passing it off to Kart to talk further about our attack. Thank you, Jeremy. So UnclearBallot is our proof-of-concept implementation of this attack, which we wrote in C++ and packaged as a Windows scanner driver. We tested this on the Fujitsu fi-7180 scanner, which is certified by the EAC for use in elections as part of Clear Ballot's ClearVote system. Our malicious minidriver wraps around the original driver for the Fujitsu scanner and serves as an interface between the actual scanner and the election administration software, as shown in the image at the bottom of the slide. To ensure the versatility of our attack, we tested it across six different ballot styles. Four of these are in use by the largest election vendors in the U.S.: ES&S, Hart, Dominion, and also Clear Ballot. We also used two older styles of ballot, from Hart and Diebold. You can see all six of those styles on the slide. To test our attack, we prepared 720 marked contests using the Bajcsy mark categorization, 120 for each of the ballot styles shown on the previous slide. For each set of 120, 60 would be what the categorization considers filled marks, and we would have 10 of each of the marginal marks shown at the bottom and also 10 empty ones, which ends up giving us 120. One key insight behind our attack is that an attacker would not have to alter a significant fraction of the votes in order to change the results of most elections. This gives us a lot of leeway: if we're not confident that we can move a marking without leaving trace artifacts that could be visible upon inspection, we can just skip that ballot and move on to the next one. At the bottom of the slide, you can see an image of a ballot that we would not attempt to alter; as you can see, it has an overfilled marking.
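The pixel swap itself is the entire payload of the attack. A minimal sketch with NumPy (the toy array and box format are invented for illustration, not taken from the authors' C++ code):

```python
import numpy as np

def swap_marks(img, box_a, box_b):
    """Swap the pixels inside two equal-sized bounding boxes,
    where each box is (top, left, height, width)."""
    ra, ca, h, w = box_a
    rb, cb, hb, wb = box_b
    assert (h, w) == (hb, wb), "boxes must be normalized to the same size"
    tmp = img[ra:ra+h, ca:ca+w].copy()
    img[ra:ra+h, ca:ca+w] = img[rb:rb+h, cb:cb+w]
    img[rb:rb+h, cb:cb+w] = tmp
    return img

# Toy ballot: white page (1.0) with a filled bubble (0.0) for candidate A.
ballot = np.ones((10, 6))
ballot[1:3, 1:3] = 0.0                           # candidate A's filled bubble
swap_marks(ballot, (1, 1, 2, 2), (6, 1, 2, 2))   # move the vote to candidate B
print(ballot[6:8, 1:3])  # the filled mark now sits in B's bubble
```

Because the voter's own mark is moved rather than redrawn, the altered image stays consistent with that voter's marking style.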
If we were to attempt to create a bounding box around this marking, it would overlap with the candidate's name. Because of this, if we tried to swap the vote to another candidate, we would be taking part of the candidate's name along with the mark, and this would be very apparent upon visual inspection. Shown here is the performance of UnclearBallot on each of the six ballot styles that we used. As can be seen, for the 60 filled marks for each of the ballot styles, we were able to move at least 50% of the marks, except in the case of Hart eScan. And across all mark types for each of the ballot styles, we were able to move at least 18% of marks. If we were to launch this attack at this rate, we would have been able to swap the results in 48 out of 51 of the jurisdictions in the 2016 US presidential election. The only red states that we could not have turned blue are Wyoming and West Virginia: Wyoming was won by Donald Trump 68 to 22, and West Virginia was won by Trump 68 to 26. On the other hand, D.C. is the only blue jurisdiction that we could not have turned red, as it was won by Hillary Clinton 91 to 4. However, even if we were able to alter Wyoming's results, this would not be realistic, as it would not be believed by the public. But in a close election, it is very possible that we would be able to change the results in favor of either one of the candidates. Along with the 720 marked contests that we tested on, we were able to obtain a corpus of around 181,000 ballots from the November 6, 2018 general election in Clackamas County, Oregon. These ballots were of the Hart Verity style, and the votes were centrally counted with an optical scanner. Because they were centrally counted, this would have been an ideal target for our described attack: as an attacker, we would only have to inject malware onto devices in one location.
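The "skip risky ballots" rule just described can be sketched as a simple rectangle-overlap test. The box format and coordinates here are invented for illustration:

```python
def boxes_overlap(a, b):
    """Each box is (top, left, bottom, right). True if the rectangles intersect."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def should_skip(mark_box, name_box):
    """Skip any ballot whose mark bleeds into the candidate-name region:
    swapping it would drag part of the name along and be visible on inspection."""
    return boxes_overlap(mark_box, name_box)

bubble_mark = (10, 10, 20, 20)   # a mark neatly inside the bubble
name_region = (10, 25, 20, 90)   # the printed candidate name next to it
overfilled  = (10, 10, 20, 40)   # a mark that extends into the name region

print(should_skip(bubble_mark, name_region))  # False: safe to swap
print(should_skip(overfilled,  name_region))  # True: leave this ballot alone
```

Skipping costs the attacker little, since altering even a modest fraction of ballots is enough to flip most contests.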
Shown on this slide are the results of our testing on the ballots from Clackamas County. Our program rejected around 20,000 of these ballots, which amounts to about 11%. When we inspected a subset of these 20,000, we noticed that many of them had scanning glitches; for example, on many of them we saw pixelated lines that ran through the length of the ballot image, which interfered with our Hough line transform. We wanted to emulate the behavior of a real attacker, and a real attacker would want to ensure that they would not leave any trace artifacts when swapping, so we used pretty conservative parameters with this attack. Even with these conservative parameters, we were able to alter around 62,000 votes, which amounts to 34% of the ballots. We randomly sampled 1,000 of the ballots that had been altered, and just by looking at them, we were not able to find any visible artifacts. The alteration time was 279 milliseconds per ballot, while the fastest Hart scan time was 352 milliseconds per ballot, which puts us well below the threshold we need in order to conduct this attack. It should also be noted that we did not do any optimization on our algorithm, and if we had done so, we'd be able to bring our alteration time down further. So do image audits serve any use if they can't be relied on to detect attacks? It turns out they're very good at certain things, such as catching non-adversarial error. Two examples of this come from Maryland in 2016: 2,000 ballots were discovered with the help of image audits, and a flaw was discovered in the ES&S DS850 high-speed scanner where ballots were sticking together and being scanned in at the same time, so if two ballots stuck together, only one of them would be counted. However, these image audits cannot be relied upon in adversarial environments. What about image manipulation detection?
Is it possible that vendors could implement detection into their systems in order to discover whether alterations have occurred? It turns out that image manipulation detection becomes an arms race at best, because as detection algorithms improve, so do the manipulation algorithms that counteract them. It's very likely that an attacker would be able to get their hands on the detection code if it were implemented in a system, and with this detection code, they would be able to improve their manipulation algorithms to beat it. Another thing an attacker could do to defeat a detection algorithm is use it as part of their mark-moving algorithm. Say our attacker has a ballot and they alter it with their algorithm. They would then run the altered ballot through the detection algorithm, and if the detection algorithm flags it as altered, they would skip that ballot and just move on. But regardless of any of this, to our knowledge, no election vendor in the US has even minimal image manipulation detection in their systems today. So how can we protect against this? The best solution is to use risk-limiting audits, where people are looking at physical ballots. With these risk-limiting audits, you ensure that the process is fully software-independent and cannot be manipulated at any point. These risk-limiting audits ensure that there's a high probability of detecting both fraud and error. And even if every single machine in the voting system is infected, you will still be able to determine whether an alteration has occurred. So what do we take away from this? Image audits involve checking digital images of ballots instead of the actual physical ballots. But as we have shown, an attacker can use computer vision methods to automatically alter these images and change the results in favor of their desired candidates.
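The detector-as-oracle idea described above can be sketched as follows; `alter` and `detector` are placeholders for the attacker's mark-moving code and a hypothetical vendor detection routine (none exists in fielded systems, per the talk):

```python
def alter_with_oracle(ballots, alter, detector):
    """Use the vendor's own detector as an oracle: alter each ballot, and if
    the detector flags the result, discard the change and keep the original.
    Only alterations the detector cannot see survive."""
    out = []
    for b in ballots:
        candidate = alter(b)
        out.append(b if detector(candidate) else candidate)
    return out

# Toy model: ballots are ints, "altering" adds 100, and the detector
# happens to flag any value over 150.
altered = alter_with_oracle([10, 60, 30], lambda b: b + 100, lambda b: b > 150)
print(altered)  # -> [110, 60, 130]
```

This is why shipping detection code inside the same compromised system buys little: the attacker simply filters their output through it.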
We implemented this with an EAC-certified scanner, the Fujitsu fi-7180, and this scanner is part of Clear Ballot's ClearVote system. And we showed that our attack works across ballot styles from all major vendors in the US. The best defense against this is for people to audit the ballots physically. The work that we presented today we compiled into a research paper, which will be presented at E-Vote-ID, also known as the International Joint Conference on Electronic Voting, and we're releasing it to the public today at these two links. If you go to those links right now, you can access the full paper. So we'll now be opening it up for any questions.

[Audience: If you compare the images to the actual paper ballots, would you find the attack?] Yeah, you would, if you compared to the physical ballot. It's the same thing that we found with the Hursti attack: basically, it only worked if you didn't examine the actual paper at all. [Audience: If you examined a portion of the ballots against the images using Clear Ballot, this attack would fail.] It would, yes. [Audience: Which is why, at some point, you need to go back to the physical ballots.] That is correct. Yeah.

[Audience: Great work. The bounding boxes around marked bubbles are probably different sizes, so how do you swap pixels? Is there logic in place to normalize the size of the bounding boxes for an individual race?] I think that's a really difficult thing to get right, and we're not going to be making it available. What was that? Yeah, that's correct, it's handled on the back end. Yeah. It's a flexible attack: every scanner is going to have a driver to interface with the actual system, so you can just adjust it to work with a different type of scanner. Yeah.
[Audience: Could this also be done during the transfer from the scanner to the computer?] So our attack is on the drivers: we create a malicious driver. However, you could also implement this attack at different points in the stack. You could implement it in the actual election administration system, and you could also implement it in the actual scanner firmware. So if you were to implement it in the firmware, that would cover the attack you described. Yeah. We were manipulating the bitmaps that are returned from the actual Fujitsu driver, so before they're rendered as any type of actual image, there's a stream of bitmap data. That's correct. Yeah. I'm going to get to him; I can get to you after. Yeah.

[Audience: What about hashing the ballot images?] So you mentioned a hash. For the hash to be secure, you'd have to ensure that the key being used with the hash is also secured, and that what it's being hashed with is not publicly available. It's possible that an attacker would be able to get that and recreate a hash for the manipulated ballot. There are systems, such as Scantegrity and Prêt à Voter, that this attack would fail against, and it would be very hard to make it work with those types of systems. For example, in Prêt à Voter, the candidate order is randomized, and when you cast a ballot, you rip the candidate list off from the actual markings. The ballot markings are tied to a key, so you cannot tell just from looking at the ballot who someone voted for, because the order is randomized and the candidate list is removed. That's true, but with such a system, you could create a public bulletin board for voters to verify their votes. However, then you run into issues of vote selling. Not with Prêt à Voter? Okay, my apologies; Professor Vanallo definitely knows better. Yeah. So that would not be an issue, actually. Yes? So that's correct.
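The hashing point above can be illustrated with a small sketch; the key, byte strings, and tag scheme are all invented for the example. A keyed hash only protects the images if it is computed and stored outside the attacker's reach:

```python
import hashlib
import hmac

# Hypothetical setup: the audit system stores an HMAC-SHA256 tag per image.
SECRET_KEY = b"hypothetical-audit-key"

def audit_tag(image_bytes, key):
    """Keyed hash an audit system might store alongside each ballot image."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

original = b"original ballot image bytes"
altered  = b"altered ballot image bytes"

# If malware alters the image *before* the tag is computed -- or has stolen
# the key -- the stored tag simply matches the altered image, and a later
# integrity check passes even though the vote was changed.
stored_tag = audit_tag(altered, SECRET_KEY)   # attacker-controlled pipeline
print(hmac.compare_digest(stored_tag, audit_tag(altered, SECRET_KEY)))
```

Since the malicious driver sits upstream of any tagging step in this attack, the hashes it produces are internally consistent; the mismatch only appears against the physical paper.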
That case is when the tabulation occurs with different scans from the scans used for the image audit. With something like that, you would have to attack both systems and somehow coordinate them to ensure that you're getting the same results from the attack. That would be a more complicated scenario. Our attack focuses on the case where the scan for the tabulation and for the image audit occurs at the same time. Right. Okay. Any other questions? Yeah, we have time. That's correct. I think that's how the Scantegrity system works, if I'm correct. So that would be more difficult. However, we could adapt our attack to that: instead of actually swapping the bounding boxes, we could, for example, keep a data file of marks that we could insert and remove. Yes? No, we have not done that. That would be good future work. I think we have time for one more question. [Audience: How do you attack the voting machine in the first place?] It's a post-exploitation tool, so we're assuming that someone has already hacked the machine, which is very possible; there are a lot of publications regarding that. I think that's all we have time for. Thank you, everyone, for coming.