All right, welcome back. Thank you everybody for coming. Our next speaker is Ben Brecht with man-in-the-middle privacy attacks on the mixed-mode butterfly key expansion protocol, which he's going to be saying three times fast. Go.
Thanks a lot. Thanks for coming to my talk. I have some wobbly legs right now. I don't know if that's because I'm standing up here giving this talk or if that's due to the DEF CON bike ride we did this morning. But let's go and start my talk. I'm going to talk about butterfly keys. I will explain what they are and a man-in-the-middle attack on the privacy there. But first of all, I have to give a disclaimer. I prepared this talk in my private time. I do not represent anything other than myself: no company, no organization, nothing. I unfortunately have to do that at the beginning of the talk.
So, butterfly keys. What are they good for? Why would you be interested in butterfly keys? Butterfly keys are first of all a key derivation function on public-private keys, not on symmetric keys but on asymmetric keys, which is rather unusual, I think. And it is also a protocol that can be and is used in devices with low computational power that don't have that much bandwidth, so low-bandwidth environments, and that maybe even have little secure storage available. But on the other hand, they might need certificates, or maybe even many of them, not just one but a couple hundred or thousands. That is what this protocol was invented for, specifically for V2X, vehicle-to-everything communication. But I think it can easily be transferred to other use cases as well, for example IoT devices. So V2X is just one example that I want to bring up here and give a little bit of background on. V2X is the direct communication between devices, vehicle and infrastructure or vehicle and vehicle, sometimes in unmanaged settings, but sometimes also in managed settings when we look at cellular V2X. Unmanaged means there is no infrastructure involved.
They are immediately sending messages. There is no hotspot, no cellular tower. They are really communicating directly with each other. Managed is the other way around: there is some infrastructure in between.
To give you a better picture, applications that are being implemented or researched right now are, for example, forward collision warning. If there is a car braking a couple of cars in front of you, your car gets a message and can then show you a warning that there is a car braking, or even start incorporating that into active safety features like engaging the brakes. This is something you can't do with traditional sensors because you most probably only see the car right in front of you but not a couple of cars ahead. Or intelligent traffic signals. They can count the number of cars coming from any direction and then react to this appropriately by extending green phases, for example. Another one is road condition reporting. This is more for the area where I'm from, probably not over here: Michigan, the northern part of the USA. Vehicles that are out there to maintain the road could report back what the current conditions are, and traffic management centers could react to this in order to divert traffic to other directions, for example. And just to give you an idea where this could go, one last application: the four-way stop. When you have a four-way stop intersection, in some cases you wouldn't need to stop any of the cars approaching the intersection. For example, if a car is coming up and intends to drive straight through, and from the left-hand side comes a car that wants to turn right, they wouldn't collide, so both of them could go immediately. They can't do this today because the cars can't communicate. Drivers have to communicate with each other and therefore it's safer to stop.
But in the future there could be a notification to the driver: just go straight ahead, just turn right. Or when we look further into the future, to autonomous vehicles, they could do this on their own.
Okay, what do you need for that? When you exchange messages between cars, you need first of all integrity of the message: nobody should be able to change the message content without the receiving end noticing that. And the other part, sorry for that, is authenticity. You need to make sure that the receiving end is able to distinguish messages coming from authentic devices that are allowed to send those messages from devices that are not, like somebody sitting on the sidewalk or somebody with a car from a junkyard. How do we do this? This is traditional technology, there's nothing new to that. You have a message, in this case a basic safety message that I brought along. There is some payload data in there, speed, position, heading, and then you have a signature and you have a certificate. And with certificates there is a public key infrastructure for that, as we know. Nothing new about that so far.
The issue is that when you have certificates, you could use them for tracking the device. You could use a certificate as a fingerprint or an ID of the device. So you only have to put up listening posts in two places if you want to figure out whether a certain car is going to both. Once you see that certificate in the first place and can attach it to the car, and you see this certificate showing up in the second place, you know that this vehicle was in both places. And we want to avoid that for sure. So how do we avoid this? First of all, the overall communication stack in V2X is always required to change all IDs in a certain manner. This goes down to the MAC address, to anything that you could use for fingerprinting, and also the certificate.
So what you need to do is have many certificates, and you need to have non-traditional certificates that don't include any identifying information like a domain name or a vehicle identification number or anything like that. There's a standard for this. I'll give a reference over there. It's an IEEE standard which defines the exact content. And then you get a whole bunch of them. It's like having masks that you can hide behind: you exchange them every time you exchange all the other IDs, and you throw the certificate that you used before away and never use it again. Therefore listening posts in different places wouldn't be able to figure out, just by seeing the certificates or listening to any of the other IDs, that this is actually the same car.
So this is new. We need a whole bunch of them. Currently in the European Union, for example, there are standards that require up to 100 certificates per vehicle or device per week. In the US the debate is currently at 20 certificates per week and vehicle. I think the truth is somewhere in the middle. But it's more than traditional PKIs have so far provided to any other devices that I know of. So if you look at the number of vehicles in the United States, for example, which is roughly, I think, something between 350 million and 450 million currently, with some 70 million new devices each year, this is easily going to be the largest PKI that has existed so far. So we needed a new approach to issue certificates to those devices in an efficient manner. But at the same time we need to ensure that certain privacy requirements are met, which is that none of the components within the PKI should know which certificate belongs to a certain device, because with this knowledge they could start tracking. So if I'm the PKI operator and I know these 100 certificates went to Benedict's car, then I could easily start tracking.
Again, I only need those two listening posts and I just need to match the 100 certificates for the current week to the certificates that I saw over there. And the same goes for when I know which certificates belong together in a certain batch. I don't need to know which device it is. I only need to verify one of them against the target device and then I'm able to track. So we need to avoid that in the overall architecture.
This is the PKI as it is currently proposed in the United States. The European Union's looks slightly different, but in principle it's the same. I don't want to go into too much detail here. I want to focus on the components that are directly involved in issuing the certificates to the devices, which are the registration authority and the authorization CA. And then there is something called the location obscurer proxy in between, to remove IP addresses for example, so that nobody knows where the vehicle was when it was requesting its certificates. But that's just a minor detail.
So, butterfly key expansion. This is the solution that especially William Whyte came up with. I have references at the end if you want to read up on this. How does this work? First of all, we have a key derivation function for public-private keys, asymmetric keys. Just one promise: this is the only technical slide, I hope, or at least the only black and white slide that I have today. So we have the typical public-private key pair and the relationship between them. In this case it's an elliptic curve that is being used, and we have points P and A, where A equals lowercase a times P. If P and the public key A are given but a is not, it is hard to compute the value of a. That's the assumption that goes for all asymmetric key systems. We have an agreed base point G of some order l which is on the elliptic curve. And what we do now, the first new thing compared to traditional PKI, is we define a caterpillar key.
We call it caterpillar key in this case because it's like the start of the butterfly. It is an integer a and a point A = aG on that same curve. So that's just a standard public-private key pair, there's nothing special to it yet. The certificate requester provides to the registration authority the value of A, the public key, and an expansion function, which is a pseudo-random permutation of the integers modulo l. For example, a SHA-256 function could be used for that. That's just one example, but it's actually the function that is used in the PKI that I've shown before. And what the RA now can do, and this is the magic in this, is create as many cocoon public keys as needed based on this info, without the device ever having to come back for that. So the device really only needs to send the initial public key and this expansion function, and the RA can now execute this expansion X times and add the result to the initial caterpillar public key to get a derived cocoon key, as we call it. The private key for that is generated in the same way, but on the device side. So the device takes the initial private key, executes the same expansion function on it, and gets the private key matching that public key at the RA. This is a really nice feature because with that we have decoupled the generation of public and private keys and don't need any synchronization on that between the devices and the RA.
The protocol which is then built on top of that looks like this. We have in the beginning, as I said, the caterpillar key pair. The public key and the expansion function go over to the registration authority. The registration authority does the expansion X times, or n minus one times, to generate as many keys as required, for example 100 for the current week.
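The expansion described above can be illustrated with a small sketch. It is only a toy under stated assumptions: instead of the elliptic curve used in practice, it uses the multiplicative group of integers modulo a prime, so point addition becomes multiplication and scalar multiplication becomes exponentiation. The modulus, the base element, the seed format and the names `expand`, `ra_cocoon_public` and `device_cocoon_private` are illustrative choices, not taken from the talk or from any standard. The SHA-256-based expansion follows the example mentioned in the talk.

```python
import hashlib
import secrets

# Toy stand-in group (NOT the elliptic curve used in real deployments):
# integers modulo the Mersenne prime 2**127 - 1 under multiplication.
P = 2**127 - 1
Q = P - 1          # exponents may be reduced modulo P - 1 (Fermat's little theorem)
G = 3              # agreed base element, playing the role of the base point G


def expand(seed: bytes, i: int) -> int:
    """Hypothetical expansion function: SHA-256 over seed and index, reduced mod Q."""
    digest = hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % Q


def ra_cocoon_public(A: int, seed: bytes, i: int) -> int:
    """RA side: derive the i-th cocoon public key from the caterpillar public key only."""
    return (A * pow(G, expand(seed, i), P)) % P


def device_cocoon_private(a: int, seed: bytes, i: int) -> int:
    """Device side: derive the matching i-th cocoon private key from the caterpillar private key."""
    return (a + expand(seed, i)) % Q


# Caterpillar key pair, generated once on the device.
a = secrets.randbelow(Q - 1) + 1
A = pow(G, a, P)
seed = secrets.token_bytes(16)   # shared with the RA together with A

for i in range(3):
    B_i = ra_cocoon_public(A, seed, i)        # computed at the RA
    b_i = device_cocoon_private(a, seed, i)   # computed on the device, later and independently
    print(i, pow(G, b_i, P) == B_i)           # the two derivations agree
```

The point of the sketch is only the decoupling: the RA never sees `a`, and the device never has to be online while the RA derives the public keys.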
Then it shuffles them together, so it mixes those keys with keys from other devices, in order to prevent the authorization CA, which issues the certificates in the end, from making a sophisticated guess, based on the order of incoming requests, about which of those certificates belong together. So that's another feature which is required in order to meet the two privacy requirements that I introduced in the beginning. So it shuffles them, adds the required metadata for the certificate, like the validity period for example, and sends them over one by one to the authorization certificate authority, or PCA as it's called in the SCMS.
The issue now is that the RA knows all of those public keys. So the RA at this time has the knowledge of all the certificates that belong together. It even has the knowledge about which certificate belongs to which device. So what the PCA needs to do, in order to make sure that the RA actually doesn't know the certificate public keys, is add a true random number to the public key. This is not another expansion, just another random value which gets added. And then the issue is how this information gets back to the device without the RA knowing. There is a relationship to the initial key which was sent over by the RA, but as the RA doesn't know the random number which was added, it can only guess what the final public key looks like. So in order to prevent the RA from learning, on the way back, which public key goes into the certificate, the device initially creates a second key pair, a second caterpillar key pair, which is used for response encryption.
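The randomization step at the PCA can be sketched in the same spirit. Again this is a toy that assumes a multiplicative group modulo a prime in place of the elliptic curve, and the names `pca_randomize` and `device_final_private` are made up for illustration. It shows why the device can still compute the private key for the final certificate once it learns the random value, while anyone who only knows the cocoon key cannot predict the final public key.

```python
import secrets

# Toy stand-in group (assumption, not the real curve): integers modulo a Mersenne prime.
P = 2**127 - 1
Q = P - 1
G = 3


def pca_randomize(cocoon_public: int):
    """PCA side: blind the cocoon public key with a fresh random value c."""
    c = secrets.randbelow(Q - 1) + 1
    butterfly_public = (cocoon_public * pow(G, c, P)) % P
    return butterfly_public, c   # c is returned to the device inside the encrypted response


def device_final_private(cocoon_private: int, c: int) -> int:
    """Device side: add the same random value to the cocoon private key."""
    return (cocoon_private + c) % Q


# One cocoon key pair: the RA knows S, only the device knows s.
s = secrets.randbelow(Q - 1) + 1
S = pow(G, s, P)

U, c = pca_randomize(S)            # U goes into the certificate
u = device_final_private(s, c)     # signing key the device uses with that certificate
print(pow(G, u, P) == U)
```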
Same thing here: the RA does the expansion and sends the encryption key together with the cocoon key for the certificate. The PCA takes all of this together, computes the final public key, generates the certificate, signs it, and then uses the response encryption key to encrypt the certificate itself and the random number that it just added to the public key. Both of these go into one package, which is then encrypted to the end device.
Meddler in the middle. If I were a meddler in the middle, I would try to get into this communication in order to figure out which of those keys belong together. So what I could do as a meddler in the middle, exactly at the RA level, is introduce my own encryption keys. If the device sends me a response encryption key and I just inject my own instead, I can decrypt the response from the PCA, and then, after I have learned which certificate is in there, re-encrypt it with the encryption key from the device, because I just generated those keys by expansion. The device would never know, because the package itself is encrypted with the right key, but the RA has learned what's in the certificate, or which public keys are used. So what the PCA does as a last and final step is sign the overall encrypted package again, so that if the RA decrypted it, the signature would break and the device could easily see that there was a man-in-the-middle attack on the privacy of this certificate.
The PCA sends it back. The RA de-shuffles this, zips all of the 100 certificates, or however many it generated for the device, together in a zip file and then provides it for download to the device. This is a completely asynchronous process. The device can come back at any time, as soon as the certificates are generated, and download them, which is another nice feature of this protocol, because when we talk about V2X communication, cars driving around, it's always a matter of connectivity.
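The key-injection attack described above can be played through in a few lines. This is a sketch under loud assumptions: it uses a toy hashed-ElGamal-style encryption over integers modulo a prime rather than the elliptic-curve encryption of a real deployment, it is not secure, and all names (`keygen`, `encrypt`, `decrypt`, the sample payload) are hypothetical. It only shows that a device which decrypts successfully learns nothing about whether the RA looked inside first.

```python
import hashlib
import secrets

# Toy group (assumption): integers modulo a Mersenne prime, NOT a real curve.
P = 2**127 - 1
Q = P - 1
G = 3


def keygen():
    priv = secrets.randbelow(Q - 1) + 1
    return priv, pow(G, priv, P)


def _pad(shared: int, n: int) -> bytes:
    # Derive a one-time pad of n <= 32 bytes from the shared group element.
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:n]


def encrypt(pub: int, msg: bytes):
    assert len(msg) <= 32, "toy scheme handles short messages only"
    r = secrets.randbelow(Q - 1) + 1
    pad = _pad(pow(pub, r, P), len(msg))
    return pow(G, r, P), bytes(m ^ k for m, k in zip(msg, pad))


def decrypt(priv: int, package) -> bytes:
    c1, body = package
    pad = _pad(pow(c1, priv, P), len(body))
    return bytes(c ^ k for c, k in zip(body, pad))


device_priv, device_pub = keygen()   # the device's (expanded) response encryption key
evil_priv, evil_pub = keygen()       # key pair injected by the meddling RA

certificate = b"cert + random value c"

# The PCA encrypts to whatever key the RA forwarded, here the injected one.
package_from_pca = encrypt(evil_pub, certificate)

# The RA decrypts, learns the content, and re-encrypts to the device's real key.
learned = decrypt(evil_priv, package_from_pca)
package_to_device = encrypt(device_pub, learned)

# The device decrypts without any error and cannot tell the difference.
received = decrypt(device_priv, package_to_device)
print(learned == certificate, received == certificate)
```

With the countermeasure from the talk, the PCA would additionally sign `package_from_pca`. Since the re-encrypted `package_to_device` is in general a different ciphertext, that signature would no longer verify on the device.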
So if we just have to send the initial request and then come back whenever we have connectivity, that's a way better way of doing this than a synchronous protocol with this type of device. The device generates the respective decryption keys with the same expansion function, decrypts the package, and then generates the respective signature private key in order to be able to use the certificate going forward, for V2X communication or whatever else you want to do with this, if you look at IoT devices for example, or another use case.
So the advantage of this approach is that we have a single request from the device. This is not 100% true; it is true with the next, more sophisticated, more optimized version of butterfly key expansion. So we have just one request, in this case just two key pairs that have to be generated on the device, and the generation process is actually way more optimized, because we just do a SHA calculation instead of generating a new public-private key pair each time we need a new certificate. So from a computational point of view, this reduces the load that the device has with certificate management a lot. On the back end side we also have a big advantage, and that is that we can constantly pre-generate certificates for all the devices. When we look at traditional PKIs, we often have a synchronous protocol where we get the certificate signing request, we have to sign it and turn around to give it back to the device, and when we look especially at personal vehicles, they have peak hours. Whenever you go to work, whenever you come back from work, we have way more cars on the streets and powered up than during off hours. That means that the PKI has to have the respective resources in place in order to manage those peak hours. But with this approach we can constantly pre-generate certificates 24/7 and we do away with all of the peak hours.
There's a little bit left for the download process, but this is just a regular file server, and there are standard IT approaches for that. Less computational power, we already mentioned that. The bandwidth usage is also lower from a device-to-RA perspective, because we only have a single request. With a traditional certificate signing request there would be a public key each time, and this is exactly what we save. We have a single request and therefore just a single public key that needs to be transferred from the device to the RA. The other way around it's still the same, but this way we utilize less bandwidth.
When we look at this protocol, you might have recognized that there are already two different keys in this chain of signing keys that end up in the final certificate. We first have the cocoon key S, which is generated at the registration authority, and then we have the final butterfly key, U in this case, which is generated by the PCA through adding this random number. So why do we need to have a third key to do response encryption? We actually don't, and this is the approach called unified butterfly key expansion. What we do is, instead of using the response encryption key for encryption of the certificate and the random value, we just use the cocoon key, and that way we save another key that needs to be communicated from the device to the RA and we save one expansion function at the RA. The nice thing about this is also that if we now tried the meddler-in-the-middle attack on this, the device would immediately know, because there is no encryption key that the RA could inject in this process.
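One way to picture why a swapped key becomes visible in the unified variant is a consistency check on the device: the private key it derives must match the public key in the certificate it received. The sketch below is an illustration only, assuming a toy multiplicative group modulo a prime instead of the elliptic curve; `device_accepts` is a hypothetical helper, not a function from any standard or library.

```python
import secrets

# Toy stand-in group (assumption): integers modulo a Mersenne prime.
P = 2**127 - 1
Q = P - 1
G = 3


def device_accepts(cocoon_private: int, c: int, cert_public: int) -> bool:
    """Device side: the derived final private key must match the certificate's public key."""
    u = (cocoon_private + c) % Q
    return pow(G, u, P) == cert_public


# The device's real cocoon key pair; in the unified variant S is also the encryption key.
s = secrets.randbelow(Q - 1) + 1
S = pow(G, s, P)

# Random value chosen by the PCA.
c = secrets.randbelow(Q - 1) + 1

# Honest run: the certificate is built on the device's own cocoon key.
honest_cert_key = (S * pow(G, c, P)) % P

# Meddling RA: forwards its own key instead, so the certificate is built on that key.
s_evil, S_evil = s, S
while S_evil == S:                      # make sure the injected key really differs
    s_evil = secrets.randbelow(Q - 1) + 1
    S_evil = pow(G, s_evil, P)
swapped_cert_key = (S_evil * pow(G, c, P)) % P

print(device_accepts(s, c, honest_cert_key))    # consistent
print(device_accepts(s, c, swapped_cert_key))   # mismatch exposes the swap
```

Because the encryption key and the certificate key are derived from the same cocoon key, the RA cannot replace one without also changing the other.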
If it changed the encryption key from the device in order to learn the certificate, in order to be able to decrypt, the keys wouldn't match at the device level anymore, because the device does exactly the same expansion as the RA does to get the private key, and if it tried to use that decryption key on a package which was encrypted with a different key, this wouldn't work. So the device immediately knows. This is another advantage of this approach, because then we can get rid of the explicit signature on the encrypted package. So we have a more effective, more optimized way of doing butterfly key expansion. The advantages of this are, again, less computational power on the device, because it's just one key that we need to generate, and even less bandwidth usage, because we only have one key that needs to be transferred, not two.
But the issue with that is: what if we have a mixed-mode setup, where we have PCAs, or authorization certificate authorities, that support the traditional butterfly key expansion mechanism, where we use two keys, one for the certificate and one for response encryption, and another set of certificate authorities that provide the UBK protocol? This can easily happen, because the traditional butterfly key expansion mechanism is already deployed in the U.S., for example. And going forward, there will definitely be interest in this more optimized way of doing butterfly keys, and with that we will certainly see CAs providing this more optimized way. So if we have a setup where we have both types of CAs, the meddler in the middle could, again, start decrypting. If we have an RA, an evil RA in this case, which advertises "I support the UBK protocol" but actually works with an authorization CA that provides the traditional butterfly key expansion mechanism, the device wouldn't send any response encryption key.
It would assume that by generating the right private key for decryption of the package it would be able to tell if the RA is in fact an evil RA. But by using a CA which supports the traditional butterfly key expansion and therefore expects a response encryption key, the RA could just generate a key pair for response encryption on its own. And if it does so, it gets a package which is encrypted with that key, and then it could again decrypt those packages before it uses the right key for the device to encrypt them again. And with that, the device wouldn't even know that there was somebody in the middle. That's the issue that I came up with.
This overall mechanism is currently being standardized in IEEE. The only two solutions that we have come up with so far are these. Either we reintroduce the explicit signature on the encrypted package, which gives us the disadvantage that we again need a more powerful HSM at the CA, because we not only have to sign each certificate but also have to sign it again after we have encrypted it. So it's two signatures per certificate each time. The other way that we discussed is that the CA would need to be explicit about which mechanism it supports. This could be, for example, a flag in the CA certificate itself saying "I support the unified butterfly key expansion mechanism" or the traditional way of doing things. And with that the device could determine if there was somebody in the middle trying to learn which certificates belong together. I personally favor the first approach because with that it's explicit to the device. There's no extra check that the device has to execute for that, so nobody can forget to do this check. But the standardization process is not yet done and we will see what we end up with.
That's the end of my talk. I hope it was informative. I hope there is interest in using this in setups other than V2X. And if you have any questions, feel free to reach out to me.
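The second countermeasure, a flag in the CA certificate, boils down to one comparison on the device. The following sketch is purely hypothetical: the field name `bke_mode`, the `BkeMode` values and the function `device_should_accept_batch` are invented here for illustration, since the talk notes that the standardization is not finished and no concrete encoding is given.

```python
from dataclasses import dataclass
from enum import Enum


class BkeMode(Enum):
    TRADITIONAL = "traditional"   # separate response encryption key, signed package
    UNIFIED = "unified"           # cocoon key doubles as the encryption key


@dataclass(frozen=True)
class PcaCertificate:
    name: str
    bke_mode: BkeMode             # hypothetical flag carried in the CA certificate


def device_should_accept_batch(requested_mode: BkeMode, issuer: PcaCertificate) -> bool:
    """Accept certificates only if the issuing CA declares the mode the device requested."""
    return issuer.bke_mode is requested_mode


# The device sent a unified request (no response encryption key).
requested = BkeMode.UNIFIED

honest_issuer = PcaCertificate("PCA-A", BkeMode.UNIFIED)
mixed_mode_issuer = PcaCertificate("PCA-B", BkeMode.TRADITIONAL)

print(device_should_accept_batch(requested, honest_issuer))       # True
print(device_should_accept_batch(requested, mixed_mode_issuer))   # False
```

This is also where the speaker's concern shows up: the protection only exists if every device implementation actually performs this comparison.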
Those are the sources. I will publish the slides later on. I think they will be published here by DEF CON anyway. Feel free to reach out to me. That's my Twitter handle, or just catch me.
We have some time for Q&A, in fact, if you're ready. If anybody has questions, come on up.
The requirement to have multiple certificates and randomized MAC addresses and things like that seems like an unusual amount of attention paid to privacy in the design of a protocol like this. Do you have any background or intuition on how that came to be, why this cares so much about privacy?
Yeah. Regulators are actually looking into regulating this technology or requiring it in devices, or they did so in the past. The USDOT published a notice of proposed rulemaking for this which would require each and every car to be equipped with this technology. Under the current government, nobody knows where this goes, so it's officially stalled right now. And the European Union was about to issue another regulation which would require devices that do V2X to do it in a certain way. So it's an "if equipped" regulation. But this got rejected just a couple of weeks ago. And that is the reason why this technology got invented. Because if you regulate it and each and every car has to have this technology, you need to ensure that certain privacy requirements are met.
Other questions? All right. One big round of applause for Ben.