Hey everybody, and thanks for coming to this talk. This is "The Ballot is Busted Before the Blockchain: A Security Analysis of Voatz, the First Internet Voting Application Used in U.S. Federal Elections." My name is Mike Specter, and this is joint work with Jimmy Koppel and Danny Weitzner. All three of us are from MIT's Computer Science and Artificial Intelligence Laboratory. All right, so let's talk a bit about voting. In February, the state of West Virginia abruptly abandoned plans to adopt an internet voting phone app called Voatz, and this research is actually why. I also think this serves as a great case study for why we should be careful before advocating for new technologies in civic processes, and it really highlights the need for transparency and accountability in election systems in particular. So our story starts in late 2019, when it became clear that West Virginia was going to pass a bill directing their Secretary of State to allow disabled voters to cast their ballots over the internet. This is really important in the context of West Virginia, because the state has a higher-than-average number of disabled voters. For example, according to the CDC, 22% of adults in West Virginia have serious difficulty walking or climbing stairs, and another 7.7% have some sort of vision impairment. Interestingly, West Virginia had already been using Voatz for overseas military voters, and we expected that it was going to be used for this expansion as well. So given the potential impact, we really wanted to know how Voatz provided what are often considered to be the essential security requirements of voting. In the crypto literature, these are defined as correctness, privacy, receipt-freeness, and coercion-resistance. The bottom three properties are there to ensure that the voter isn't unduly pressured to vote in any particular way and can't sell their vote. Really interestingly, Voatz advertised heavily on its use of cryptography and cryptographic tools.
In particular, they said they used hardware-backed key storage, mix nets, and of course, because every app has to do this now, the blockchain. And the question that we had was: is this app meant to be end-to-end verifiable? End-to-end verifiability is a guarantee that allows a voter to tell whether their ballot was counted correctly in the final tally, and given the tools that were purported to be used here, it would be an expected guarantee of the system. What was really interesting about Voatz to me is that if you looked at them from a large distance, they looked really good. For instance, they had this bug bounty on HackerOne. They had undergone a couple of security audits from people outside the company. There was even some documentation of how their system worked. But the more you looked, the more red flags you saw. For instance, let's look a little at their documentation. Their documentation consisted mostly of this FAQ, and here's a link to the version that we saw when we began our analysis. Importantly, there was no formal description of how their system actually worked, which was a bit odd given that the use of this crypto would imply something novel that might require further explanation. There were a number of security reviews that were claimed to have occurred, but none were actually made public. In fact, there was no public list of fixed vulnerabilities at all, which led us to believe that either the security reviews were being kept secret, or the reviews themselves hadn't found anything, in which case it's unclear if they were any good. These reviews were done by the National Cybersecurity Center and ShiftState Security, two entities I had never heard of before. The NCC was sort of interesting to look into as well. Their name originally gave me the impression that they were somehow associated with the government, and I was really surprised to learn that they weren't.
They also appeared to be more of a trade association, and I couldn't find a staff cryptographer or security researcher there whom I could reach out to with further questions. So, you know, at first glance, many of the cryptographic guarantees claimed in the FAQ were sort of interesting, and they look like real, hard things to achieve, but in reality there's no formal definition for any of these things. For example, "end-to-end vote encryption" really doesn't mean anything, but if you squint at it, it sort of looks like end-to-end verifiability. So we really weren't clear on what to make of this whole thing. All right, let's look a little at their bug bounty. The FAQ bragged quite a bit about having a bug bounty, and don't get me wrong, having a bug bounty at all is fantastic, except when we took a deeper look, we found it to be less than encouraging. You were required to use a special version of the Voatz app, which connected to a set of servers different from the live version, and there was no documentation of the differences between the bug bounty environment and production. As a researcher, this is dangerous, because I really would like to know what the actual version of the app is doing versus what's happening in the special environment. It also had really limited scope. For instance, we weren't allowed to look into man-in-the-middle attacks, and nothing was allowed that required physical access, but they never defined what physical access meant. So if I used a jailbreak to simulate a root exploit, and then found a bunch of things I could do with a piece of malware that happened to have root on the system, it was unclear whether that was in or out of bounds. The test infrastructure provided was also sort of wonky: there were no binaries or source for the servers, so there was no server infrastructure you could set up yourself.
So you had to rely on their systems, and when we first tried the app, it actually wouldn't connect at all. To make things even worse, both the live and bug bounty versions of the app were obfuscated. For context, with Android reverse engineering, everything is really an APK, and APKs themselves are made up of a bunch of Java classes and metadata in the form of XML. What's really cool about APKs is that, for the most part, they will decompile to source, with the exception of compiled C libraries or the like attached inside; the Java portions will almost always decompile to source. So we popped Voatz into a decompiler, and the first thing we noticed is that all function names, class names, and variable names had been changed into random Unicode strings. This really isn't normal, and is the kind of thing done as an additional step by an automated obfuscation tool to make static analysis more difficult. Maybe this was done by accident, but there were other things that led us to believe it wasn't. For instance, later in our reverse engineering process, we noticed a bunch of string obfuscation going on. In normal Android cryptographic code, you'll see strings like "AES/GCM" used; instead, we would see this weird function call with a series of numbers passed to it. And if you actually looked into the function, what you'd see is this: a really common runtime string de-obfuscation technique. I have rarely seen this in consumer software outside of things like game kits and DRM kits; I've mostly seen it in cases of, say, hiding malware command and control. So this is out of the ordinary and interesting. And as a reminder, this is the bug bounty version of the app, the version they're asking us to do analysis on, and there are all these dodges making it more difficult.
Things also got a little interesting if you look at their safe harbor provisions. These are the conditions they put on researchers before the company agrees not to pursue legal action after you report a bug. The restrictions on reporting seemed to be the following: first, you must give Voatz reasonable time to patch bugs; and second, Voatz retains the authority to decide what "reasonable time" means. Taken together, this means Voatz can just hold on to bugs indefinitely. And to me, this misses the point of using a bug bounty as a transparency tool altogether. If the point is to let us security researchers hold the company accountable for actually fixing the bugs we discover, then there needs to be some sort of time limit here. To make matters worse, a previous attempt at security analysis appears to have resulted in a University of Michigan researcher being investigated by the FBI. In fact, Voatz's CEO was quoted by CNN taking credit for reporting the researcher to the authorities, and we never found any indication that the company was bothered that it turned out to be a researcher rather than a malicious entity. To be fair to Voatz, though, there was this clause in the bug bounty that explicitly banned examination of the non-bug-bounty version of the app. However, the bug bounty's edit history tells us that this restriction was added after the University of Michigan researcher was reported to law enforcement. So I can't believe I have to say this, but rule zero of bug bounties is: do not send the law after well-meaning security researchers. Flashing forward a bit in the talk, this, combined with the rhetoric afterward, really complicated our ability to disclose the results of our work. All right, so overall we found a bunch of issues: there was no formal documentation of the system, there were these weird security claims, there were no public security audits, and the code itself was heavily obfuscated.
And the bug bounty as a whole seemed kind of dubious. But we really needed to do this analysis; the impact was too great, so we did it anyway. All right, so a key challenge here was that we really didn't want to touch anything that could possibly cause harm to an election, and therefore we had to make a number of assumptions about the backend. Our solution was to manually reverse engineer the app and iteratively re-implement the server to better understand the protocol and app functionality. For analysis, we always assumed the best possible situation for the backend, and whenever we deviated from this assumption, we explicitly discussed why in the paper. All right, so let's take a very quick tour of the app from the voter's point of view. Initialization is very similar to any other app: you're asked to use a standard one-time password, you're asked to provide an eight-digit PIN, then you're asked to log in via fingerprint, and finally you are officially part of the system. At this point, however, you're still not verified as a real voter, so the way Voatz allows you to become identified in the system is that you provide some sort of physical ID. How they do this is that they ask you to take a picture of your ID, and a picture of your face. These are then uploaded somewhere for matching, and eventually the user is told that they have been successfully verified. Now, at this point, the voter can actually participate in elections, which looks something like this. Here's an actual ballot. Finally, they can review and submit their ballot; they're asked to verify and log in again using their biometrics; and finally, they have voted. From their FAQ and reports from the pilots, we found out that the user is emailed an encrypted receipt of their vote, which appears to actually contain the real selections.
It's really not clear how this receipt ends up working in practice, but we did find a screen in the app which provided a password for decryption. From what we could find in the literature, there's really no end-to-end verifiable voting system that works this way. All right, so let's talk about what happens behind the scenes. In reality, Voatz is, for the most part, just a REST app, which means it's a bunch of HTTPS GET and PUT requests to a centralized API server. In addition, there are two other entities you should be aware of. The first is Zimperium, a third-party anti-malware SDK which Voatz uses to detect all sorts of stuff going on in the system. The second is a third-party ID verification service called Jumio. So if you look at the network itself, it looks something like this. Solid lines here denote a connection that we were able to actually observe or find in the documentation; dotted lines are things that we suspect exist. All right, so let's talk a little bit about attacks. Let's consider the scenario where an attacker has gotten control over a user's device and loaded some malware. The first thing this malware would have to do is actually defeat Zimperium's malware detection. Zimperium is initialized on app creation, so when the app first loads, it begins scanning, and then again after every app resume; if you were to leave the app, go to your web browser, and come back, it'll do another scan. It looks for things like known exploits and malware, and any indicators of jailbreak, debugging, or modding. And it's a bit of a snitch: if you're caught, it will alert both the Voatz API server and Zimperium's own servers. If we had to guess, this is probably what caused the University of Michigan researcher to be caught. The thing about Zimperium, though, is that it's really not meant to stand up against targeted attacks. Defeating it, it turns out, is really easy if you have root on the device.
In particular, we can hook code in the Java runtime and modify it to stop Zimperium from ever being initialized. This might sound super complicated, but this is almost all of the code that's actually required. This sort of thing is really common and well supported by the tools in the modding community. And without Zimperium, the attacker has complete control over the user's device, and any receipt you would receive over email is likely compromised as well. There's really not much that can be done to prevent this; again, Voatz does not appear to be end-to-end verifiable. All right, so maybe you don't buy this attack; maybe you think it's a little difficult to get malware onto the system a priori. What happens if we only get access to the device after a vote has been cast? What's really interesting is that there's an encrypted database that contains all of the user's vote history and everything needed to authenticate as the user to the server. This database is encrypted using a secret key which is derived from the user's PIN and a salt. The salt is a random value stored on disk unencrypted. The PIN is also stored on disk, but it is encrypted using the Android Keystore, which means the keys are stored in a hardware enclave and never accessible to the app itself, and it requires the user's fingerprint or biometric to decrypt. And this is actually kind of okay; it looks great. Except that the PINs themselves are only eight digits, numerals only. And of course, you know where this is going: 10 to the 8 combinations means there are roughly 100 million PINs. If we copy the database from the device onto a laptop and write a very short Python script to brute-force it, we found that it takes roughly 0.05 milliseconds per attempt, so you can try all of these PINs in roughly 1.6 hours.
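The shape of that offline brute-force can be sketched as follows. This is a toy reconstruction, not our actual script: the PBKDF2 derivation and the hash-based "did it decrypt?" check are stand-ins for whatever KDF and database format the app really uses.

```python
import hashlib

def derive_key(pin, salt):
    # Assumed PBKDF2-style derivation for illustration; the app's real
    # KDF and parameters may differ.
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 1)

def unlocks(candidate_key, verifier):
    # Stand-in for "attempt to decrypt the database and check for a valid
    # header"; in this toy, the verifier is a hash of the true key.
    return hashlib.sha256(candidate_key).digest() == verifier

def brute_force(salt, verifier):
    # Eight numeric digits => only 10**8 candidates, few enough to try
    # every single one offline once the database file has been copied.
    for n in range(10**8):
        pin = f"{n:08d}"
        if unlocks(derive_key(pin, salt), verifier):
            return pin
    return None
```

The key point is that once the file is off the device, nothing rate-limits the attacker, so the hardware-backed biometric protection on the stored PIN never enters the picture.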
Indeed, we were able to get complete access to the voter's entire vote history and their authentication data, meaning that anyone with forensic access can brute-force the PIN, control the voter's account, and see how they voted. All right, let's talk about the server. On-device, you can really only get access to one particular user, of course; the server has access to far, far more. And what's really interesting is that Voatz uses a custom crypto protocol between the device and the server. It looks something like this. First, the device establishes a standard HTTPS connection to Voatz's API server. Then, on top of HTTPS, it performs the following non-standard, home-rolled crypto protocol. First, the device generates 100 ECDSA key pairs, then immediately discards all but the 57th secret key. The device sends all 100 public keys to the Voatz server. The server generates its own 100 keys, performs a key agreement with the client's 57th public key, and generates a random value that will later be used as an AES-GCM shared key. It then encrypts this AES-GCM key under the shared secret from the agreement with that 57th key, and sends the client the server's 100 public keys. Finally, the phone performs its own key agreement and decrypts the AES-GCM key. From this point forward, all communication is additionally encrypted using AES-GCM. The thing is, that was actually 100% of the crypto in the protocol between the device and the server. The app never sees anything from any blockchain, and this includes what we would expect to exist in an ecosystem like this, namely a proof of inclusion. The app never even verifies the server's public key outside of the standard HTTPS connection, so active man-in-the-middle attacks are still possible if HTTPS is broken. There are also no non-ephemeral public keys sent from the device, and nothing is ever even signed.
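To make the shape of that handshake concrete, here's a minimal, self-contained sketch. Everything except the 100-keys/57th-key structure is an illustrative assumption: classic finite-field Diffie-Hellman stands in for the elliptic-curve agreement, and a SHA-256 XOR wrap stands in for the key encryption.

```python
import hashlib
import secrets

# Pedagogical stand-in for the observed handshake. Only the protocol's
# *shape* (100 keypairs generated, just index 57 ever used) follows the
# talk; the group and wrapping below are demo substitutes.
P = 2**127 - 1   # a Mersenne prime: fine for a demo, far too small for real use
G = 3

def gen_keypairs(n=100):
    # Generate n keypairs even though only one will ever be used.
    pairs = []
    for _ in range(n):
        secret = secrets.randbelow(P - 2) + 1
        pairs.append((secret, pow(G, secret, P)))
    return pairs

def handshake():
    # Device: 100 keypairs; discard every secret except the 57th.
    client = gen_keypairs()
    client_secret_57 = client[57][0]
    client_pubs = [pub for _, pub in client]

    # Server: its own 100 keypairs, key agreement against client pub #57,
    # then a fresh random session key wrapped under the shared secret.
    server = gen_keypairs()
    shared_s = pow(client_pubs[57], server[57][0], P)
    session_key = secrets.token_bytes(32)
    wrap = hashlib.sha256(shared_s.to_bytes(16, "big")).digest()
    wrapped = bytes(a ^ b for a, b in zip(session_key, wrap))
    server_pubs = [pub for _, pub in server]

    # Device: mirror the key agreement with server pub #57 and unwrap.
    shared_c = pow(server_pubs[57], client_secret_57, P)
    unwrap = hashlib.sha256(shared_c.to_bytes(16, "big")).digest()
    recovered = bytes(a ^ b for a, b in zip(wrapped, unwrap))
    return session_key, recovered
```

Note that, just as in the real protocol, nothing here is signed and no long-term keys are verified, so an active man-in-the-middle who controls the channel can simply substitute its own 100 public keys.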
And the summary here is that the API server can really do anything it wants, which rather begs the question: what's the point of having this blockchain? All right, maybe you think it's impossible for someone to even get access to Voatz's servers, and we're not worried about nation-states here. Let's talk about the least powerful possible adversary: a passive network adversary. Now, that protocol I explained before was non-standard. It has unclear security benefits, and while it isn't inherently insecure on its own, it does make this next attack way worse. But first I have to explain how votes are cast. On the left is what a ballot looks like in app, with the user selecting their preferred candidate in an example election. This is generated from JSON strings sent by the server, which are variable length depending on the description of the candidate, various URLs, and other metadata. For example, you can see the corresponding JSON sent from the server on the right. When a vote is submitted, you might expect the app to just send some ID numbers, but instead it sends all the metadata for the voter's choice, and only that candidate's metadata. The result is a textbook side-channel attack, which again is made far worse by Voatz's custom crypto. In normal HTTPS, the plaintext length is somewhat obscured because the JSON can be compressed first. In Voatz's protocol, however, the plaintext is encrypted before compression even gets a chance, and you cannot effectively compress encrypted data. The result is that if you know the lengths of the ballot options, you can very obviously tell which option the voter selected. This image graphs the size of packets sent from the device to the server immediately after a submission, for a candidate with a short description and a candidate with a long description. Note that you can clearly tell which packet is the vote submission, and which run is for the long candidate versus the short one.
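The attack above can be demonstrated in a few lines. The candidate metadata and the 28-byte AES-GCM overhead (12-byte nonce plus 16-byte tag) below are illustrative assumptions; the only real premise is that encryption preserves plaintext length.

```python
import json

# Toy demonstration of the vote-submission length side channel: the app
# sends the full JSON metadata of only the chosen candidate, encrypted
# before anything could compress it, so length leaks through.
CANDIDATES = {
    "short": {"name": "Ann Li", "desc": "Mayor."},
    "long":  {"name": "Bartholomew Fitzgerald-Oduya",
              "desc": "Incumbent mayor, school-board chair, and county "
                      "commissioner, with a lengthy biography attached."},
}

def ciphertext_len(choice):
    # Length-preserving encryption plus assumed per-message overhead.
    plaintext = json.dumps(CANDIDATES[choice]).encode()
    return len(plaintext) + 28

def eavesdropper_guess(observed_len):
    # A passive observer who knows the public ballot needs no keys at all:
    # just match the observed size against each candidate's expected size.
    return min(CANDIDATES, key=lambda c: abs(ciphertext_len(c) - observed_len))
```

Because the ballot itself is public, any ISP or Wi-Fi operator can precompute the expected sizes and read off the selection from traffic alone.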
The end result is that a passive network adversary, say the user's ISP, or the insecure coffee-shop Wi-Fi they're voting from, can easily determine ballot selections. All right, scenario four: let's talk a little bit about privacy and informed consent. Jumio is a third party that does a series of things for Voatz in this ecosystem, including liveness detection, some machine learning to match the selfie against the voter's ID, and OCR on the ID itself. What's really interesting here is that none of this is actually done on-device. In fact, Jumio's servers get both of these images, which include all of the voter's personal information, their driver's license number, et cetera, as well as their location via GPS. And the only place we could find this disclosed, either in Voatz's documentation at the time or in the app itself, were these very small translucent logos at the bottom of these two particular screens. I'm not normally a privacy nut, but in this case you've got to remember that this app was being used by soldiers in war zones, so at this point it's actually somewhat of a national security issue. All right, so in sum, we found five high-severity vulnerabilities and a really serious privacy issue. Interestingly, many of the issues we found were really basic implementation failures: the mandated use of weak passwords; an anti-tamper/AV solution that was easily circumventable; the side-channel attack we described earlier; and, again, an API server with complete control over all users in the election. It's also really unlikely that any of this is end-to-end verifiable, and it's unclear from what we saw how this could possibly be receipt-free or coercion-resistant. Below is a summary of the powers of a given adversary; I won't dwell too much on it here, but if you read the paper, we go into much further detail.
What's really frustrating about all of this is that none of the attacks we found were really novel. I started this research hoping to find some crazy new crypto, and instead I basically found a standard CRUD app. So at this point we really needed to talk to somebody. I'm a huge believer in responsible disclosure, and we really didn't want to hurt an ongoing election, but there was a real sense of urgency. Remember, for context, the DNC had just announced that it was going to use a voting app to count ballots for Iowa's primary, and as far as we knew, Voatz was what they were using. We really didn't think the DNC would be crazy enough to try to launch an app right before an election. So the first thing we did was contact the MIT/BU Technology Law Clinic, who were fantastic, and we ran immediately into a very pernicious problem. Again, this is why rule zero of bug bounties exists. Both law enforcement and Voatz seem to have used the rationale, for reporting the researcher, that it was somehow required due to the system's designation as critical infrastructure. And we actually kind of agree that voting infrastructure is critical infrastructure and really should be treated this way. So rather than reporting to Voatz, we reported to CISA, the part of DHS that is responsible for digital critical infrastructure. CISA was fantastic, professional, and wonderful; they allowed us to report the vulnerabilities to both the vendor and those affected without putting ourselves at risk. Once we had confirmation that there were no active users of the app, we released a preview of this paper. The media attention was immediate and effusive, and we got a number of responses from both civil society and other researchers that were all very, very positive. Senator Ron Wyden spoke in support of our findings and sent a letter to ShiftState Security asking for further information about, for instance, why they hadn't found the same bugs that we had.
And I think the reason everyone was immediately so sure of our results actually has a lot more to do with Voatz's response to us, which, to put it frankly, wasn't very encouraging. They appeared to have two main concerns with our work. The first was that we somehow used an older version of the app, which, by the way, wasn't true; I literally have no idea where this "27 versions old" claim came from, as we used the most recent version of the app at the time of analysis. The second was that our methodology was somehow flawed. Interestingly, Voatz never really denied the vulnerabilities themselves, and attacking the methodology without denying the results is a really classic dodge. For example, Diebold did the exact same thing when Princeton's research on the AccuVote TSX came out, and it's worth noting that third parties were able to independently verify Princeton's findings after the fact. Speaking of which, about a month after we released our report, Trail of Bits, a really well-known security firm, revealed that it had been contracted to do a full source-code review of Voatz. And Trail of Bits' report couldn't have been worse for Voatz: it not only validated our methodology, but also quoted Voatz's objections to our work and explicitly rebutted all of them. It also confirmed all of our bugs, and revealed that the company had been alerted to their veracity before it spoke to the press. They also revealed that Zimperium wasn't even running during the pilots, and that it was blacklist-based. They found no evidence of a mix net, and confirmed the system was not end-to-end verifiable. They also found roughly 40 other bugs and vulnerabilities. So this is all sort of a mess, and it's important to do a bit of a retrospective and figure out how we got here. I think there's a tension here that can be summarized pretty well in this quote by Bradley Tusk, the philanthropist who funded Voatz's use and who seems to be really focused on expanding internet voting.
He says that it's not that cybersecurity people are bad people, per se: "I think that it's that they are solving for one situation and I am solving for another. They want zero technology risk in any way, shape, or form. I am solving for the problem of turnout." And while I really do agree with the overall goal of solving this particular problem, the issue is that introducing new technology tends to introduce new and unexamined risks. In this case, this is all compounded by asymmetric information, and here's what I mean by that. Election systems in the US are fractured, and purchasing decisions are made locally. Each jurisdiction shouldn't be stuck trying to vet the security of the systems it's purchasing, and unfortunately, the vendor is always going to know more about the systems they create than those purchasing them. So what can we do? The first thing is to fight efforts to increase information asymmetry; put another way, we should advocate for transparency in this ecosystem. This can be done through better public analyses of election systems, including reverse engineering like this work. We must also start demanding software independence in our voting systems. We don't really know how to do this well with electronic-only voting, and anything where the voter gets to physically verify their paper ballot before it's sent in is a much better situation; for instance, mail-in ballots, in-person drop-off, and in-person voting are all great. Electronic-only return, and other electronic-only systems like DREs, are really, really hard to make end-to-end verifiable and really, really hard to make software independent, and we're still trying to figure out how to do this at the research level. And everything that isn't transparent or software independent, we should examine and really try to develop replacements for. Finally, it's an election year: go vote. Thanks for watching, and feel free to reach out.