Tonight we have Matt with Prototyping Cryptographic Protocols in Python with Charm.

Thank you. All right. So, what is Charm? Why do they have to use that word? Charm is actually the name of a Python framework that came out of JHU. It uses a mix of Python and C, and it's used for prototyping cryptographic protocols. In this talk I'm going to go over a number of cryptographic primitives, and I intend to demo how some of them work in Charm so that you can get an idea of how you could play with some of this yourself. Some of these will be ones you're familiar with, some perhaps not so much. So let's get started with that.

Some of the basic primitives in Charm are block ciphers and math. As far as block ciphers are concerned, they include DES, triple DES, and AES. It does also have stream ciphers; I just didn't really look too much into them. It's going to be a lot of back and forth for the microphone. Okay. So I just put together, basically as a scratch pad here, a bunch of Python code that we can use to demo some of this, and I'm going to explain a little bit about what this is. First I'm just going to convince you that it does support AES encryption and decryption. Here I'm just doing an example of CBC mode. When you create a new AES object, the first argument is the key, the second is the mode, and the third is the IV, as applicable. The key, "yellow submarine," is a bit of a shout-out to the Matasano challenges. They like to have a lot of music references in their challenges, and that was their favorite AES key, apparently. So it's become my favorite AES key, not in production, but it's a convenient one for examples. It happens to work out to be 16 characters, so that's nice. All right, so I'm going to jump into the console, and I'm just going to restart it real quick.
Let's push that up. Okay, hold on a moment. Well, I guess that would be nice. It's intended for questions, I'm not sure. All right. Fred, what do you think about this? Okay. Yeah, when I first saw the setup here, I was like, I don't know how to turn these off. So anyway, do you know how to turn these off? Because that one went on over there. Excellent. Yep, that's a lot better. All right.

Okay, so first I'm just going to copy some imports. Or, you know what? It might be just faster for me to type. So the structure of Charm is that some of the core stuff is under core, and then crypto, AES, and we'll import the AES class. Oh, oopsies, what did I miss? Oh, crypto, ah, thank you. Maybe it should be... I swear that this was just working. Huh, let's see. Let me just make sure that, oh no, right, it's not the one. Okay, let me just make sure that we can do some of the other ones too. It'd be a real shame if the other demos don't work. Okay, so at least that one works. All right, so perhaps you'll take my word for it that AES works properly.

As you can see, we also have to import padding, so it's not going to just assume what padding you want. It has PKCS7 padding included. So let me just look at the code. We create a new AES object, we create a PKCS7 padding object, and then we can encrypt sample messages. One thing to note also, and this is fairly common, I've noticed, with most implementations in software, is that anytime you want to decrypt, if you're doing both encryption and decryption in the same process, you need to create a separate instance of the object. So you'll notice that before we decrypt, I create a new one with the same key and IV. And, yeah, trust me, it works. All right, so let's move back to the slides a little bit.
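Since the AES import failed on stage, here is a minimal sketch of just the padding step mentioned above, written in plain Python rather than with Charm's own padding class. The function names `pkcs7_pad` and `pkcs7_unpad` are hypothetical and are not Charm's API, and the capitalization of the key phrase is an assumption. It shows why padding has to be chosen explicitly: a block cipher in CBC mode only accepts input that is a whole number of 16-byte blocks.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes, each of value N, so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS#7 padding and reject malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("length must be a positive multiple of the block size")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


# The 16-character key phrase from the talk (capitalization assumed).
key = b"YELLOW SUBMARINE"
message = b"hello world"        # 11 bytes, so 5 padding bytes are added
padded = pkcs7_pad(message)
```

A message that is already a multiple of the block size still gets a full extra block of padding, which is what makes unpadding unambiguous.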
So as far as math is concerned, the core mathematical objects are integer groups, elliptic curve groups, and pairing groups. For integer groups, there are Schnorr groups and RSA groups. Schnorr groups are groups that are used for discrete-log-based cryptosystems. They're a multiplicative group of large prime order, so a large prime-order subgroup of a prime field. And RSA groups are the groups that you use for RSA. And not just RSA; there are other cryptosystems that we'll see that use this same group. So, yeah, we've already got RSA group here. You can create an instance just by calling RSAGroup, and then it'll allow you to generate a new one. So there's a paramgen method, and say 1024. That's just outputting the prime factors that make up the modulus. The 1024 in this case is the size of the primes, it's not the final modulus, so this will be a 2048-bit modulus. And all the groups support getting random values, so you can do group.random, and you can multiply. Oh, I think it's the other way, other side. No, maybe not. The other groups, they go, okay. Anyway, oops, okay.

So the elliptic curves. Both elliptic curves and pairing groups are, well, pairing groups are elliptic curves. But these elliptic curves are more the standard curves. I think they include the standard NIST curves; I don't recall which other ones they include. These two files that I just mentioned are pointed out here because this is how they split up their elliptic curve implementation. The eccurve.py just specifies which curve it is, so it has just a huge list of variables. For example, prime256v1 is, I believe, the NIST 256-bit prime curve, P-256. And ecgroup will implement all the group operations, or at least provide the interface to the group operations. Everything should be implemented in C. So we'll go to the toolbox.
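To make "group operations" on an elliptic curve concrete, here is a small pure-Python sketch on a toy textbook curve, y^2 = x^3 + 2x + 2 over GF(17), with base point (5, 1) of order 19. This is not prime256v1 and not Charm's implementation (which is in C); the function names `ec_add` and `ec_mul` are hypothetical and only illustrate point addition and scalar multiplication.

```python
# Toy curve y^2 = x^3 + 2x + 2 over GF(17); far too small for real use.
P_MOD, A, B = 17, 2, 2
INFINITY = None          # the identity element ("point at infinity")
G = (5, 1)               # base point; it generates a group of order 19


def ec_add(p1, p2):
    """Add two curve points using the usual chord-and-tangent rule."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INFINITY                      # p2 is the inverse of p1
    if p1 == p2:                             # tangent slope for doubling
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:                                    # chord slope for distinct points
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def ec_mul(k, point):
    """Compute k * point with double-and-add."""
    result, addend = INFINITY, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


Q_POINT = ec_mul(7, G)            # another point in the same group
SUM_POINT = ec_add(G, Q_POINT)    # equals 8 * G
```

The three-argument `pow` with exponent -1 computes a modular inverse and needs Python 3.8 or later.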
So first, import from eccurve, say, prime256v1, and from ecgroup we'll get the ECGroup class. Then we'll have our group equals ECGroup of prime256v1, and we can check things like the group order. So that's how big the curve is. We can get a random point, p, and this one, I know we can multiply. So we can do point-wise multiplication. We could add points, so if we make another random point, q, then we can compute p plus q. I'm talking mostly about these math basics because the whole idea here is that at any point you might find out about some cryptosystem for which you don't know of an existing implementation, and you want to start working with it. This is more the lower-level case, where you want to implement maybe some new scheme that uses elliptic curves. You can, nice, I misspelled pairing. You can use elliptic curves or pairing curves. Was anybody here for the Pythia talk yesterday, by any chance? Yeah, so I haven't actually tested this, but I suspect that it'd be possible to implement that scheme using Charm, because you have access to pairings, and from what I could see, all you need are hashes and elliptic curve pairings. But I could be wrong about that.

Okay, so that's all the basics. Now we're going to get into what they call schemes. We'll talk about a few of these, the first being public key cryptography. So we've got RSA and Paillier. How many of you are familiar with Paillier, anybody? Okay, so Paillier uses an RSA group, but it has a neat property that the private key is a trapdoor for discrete log, so you can have an additively homomorphic scheme and actually get back what the exponent is. When you use the Paillier class, you can encrypt numbers and add them together or multiply them by scalars. So let's take a look at that. Okay, so we'll import from charm.schemes, it's under public key encryption, so pkenc, then pkenc_paillier99, and we'll import Pai99. Okay, it's Pai99, right, I've got it.
We need the RSA group object, so we'll just create a new one here, and then we can do key generation. It'll give us both a public key and a private key, and we'll specify our security parameter, which is just bit size. And so there we have our public key, which you can see. It's a generator g and modulus n, and then it also has n squared, and that's what the n2 is. The private key is this lambda and u value, which allows us to do the trapdoor discrete log. If we want to encrypt a value, let's say we'll do m1 equals... so those are just random digits in the little scratch file up there, so I'm just going to do a few more random digits like that. m2 will be other random digits, m3 is m1 plus m2, and now we'll encrypt each one. The encrypt method requires the public key and the message you want to encrypt. And now I'll show you that we can do c3 equals c1 plus c2, so it lets you add ciphertexts. The ciphertext structure is just a dictionary with the key being c, and the value is just a long integer. And if we then decrypt it, yeah, so pai.decrypt, give it the public key, private key, and c3, I'll show that that's equal to m3. So you can see how this operation really is preserved through the encryption, and this can become quite useful in a number of schemes. I think yesterday there were multiple talks that talked about homomorphic encryption. I plan on talking at the end of this one about a protocol that I'd worked on with another team that did secure text pattern matching using additively homomorphic encryption, and one of the cryptosystems they used was Paillier. Oh right, and I'll also just demonstrate that in addition to adding, you can also multiply by plain numbers. So it will let you do whatever number times c3... oh right, this is the one where I remember you have to put the number on the right side. There you go.
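For readers who want to see why adding Paillier ciphertexts works, here is a toy pure-Python sketch of the scheme with the common choice g = n + 1. It is not Charm's Pai99 class: the function names are hypothetical, the primes 1009 and 1013 are far too small for real use, and ciphertexts here are plain integers rather than the dictionary Charm returns. Under the hood, "adding" two ciphertexts is multiplication modulo n squared, and multiplying by a plain number is exponentiation.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (g = n + 1 variant)."""
    n = p * q
    n2 = n * n
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    g = n + 1
    # u is the inverse of L(g^lam mod n^2) mod n, where L(x) = (x - 1) // n
    u = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return {"n": n, "g": g, "n2": n2}, {"lam": lam, "u": u}


def encrypt(pk, m: int) -> int:
    """Probabilistic encryption: c = g^m * r^n mod n^2 for random r."""
    n, n2 = pk["n"], pk["n2"]
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return pow(pk["g"], m, n2) * pow(r, n, n2) % n2


def decrypt(pk, sk, c: int) -> int:
    """Recover m using the trapdoor (lam, u)."""
    n, n2 = pk["n"], pk["n2"]
    return (pow(c, sk["lam"], n2) - 1) // n * sk["u"] % n


def add(pk, c1: int, c2: int) -> int:
    """Homomorphic addition of plaintexts = multiplication of ciphertexts."""
    return c1 * c2 % pk["n2"]


def scalar_mul(pk, c: int, k: int) -> int:
    """Multiply the hidden plaintext by a public number k."""
    return pow(c, k, pk["n2"])


pk, sk = keygen(1009, 1013)          # toy primes, an assumption for the demo
m1, m2 = 1234, 5678
c1, c2 = encrypt(pk, m1), encrypt(pk, m2)
c3 = add(pk, c1, c2)                 # decrypts to m1 + m2
c4 = scalar_mul(pk, c3, 3)           # decrypts to 3 * (m1 + m2)
```

Results are only correct while the plaintext stays below n, which is why real deployments use a modulus of 2048 bits or more.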
All this really means is that you can do linear algebra: you can do matrix multiplication or vector math using encrypted values and unencrypted values.

Okay, so the next thing I'm going to mention is commitment schemes. Commitment schemes allow for zero-knowledge proofs. A commitment allows you to say, I'm going to commit to this value, but I'm not going to show you what it is at first. So it has two properties, binding and hiding. Binding just means that I can't take back the value once I've committed to it, and hiding means that you won't know what the value was until I open the commitment. And they have support in here for Pedersen commitments. This is just a quick overview of how Pedersen commitments work. Basically, the way a Pedersen commitment works is that you do something like ElGamal encryption, and then when you open the commitment, you just give them the private key and the original message, and that allows them to decrypt it and verify that they're the same thing.

All right, so I'll talk a little bit about the protocol class, and I'm going to go back and actually show a little bit how it's structured. One of the nice things about Charm is that if you have a protocol that involves communication between two parties, then you can just write it in terms of methods that get called and return the values that would get sent back and forth. It maintains the state machine of where you are in the protocol and handles the transport portion for you, so the sending and receiving of data. This can also be used as a subprotocol, so you can have other protocols within your main protocol and refer back to what you already have contained in the bigger protocol. So I'll show the, no, actually, I'm going to get out of the presentation real quick for just a second.
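As a concrete illustration of the commitment scheme mentioned above, here is a toy sketch of a Pedersen commitment in its usual textbook form, C = g^m * h^r mod p. It is plain Python, not Charm's implementation, and the group (p = 23 with a subgroup of order q = 11, generators g = 4 and h = 9) is an assumption chosen only so the numbers stay readable. In a real setup nobody may know the discrete log of h to the base g.

```python
import secrets

# Toy Schnorr-style group: p = 2q + 1, g and h generate the order-q subgroup.
P, Q, G, H = 23, 11, 4, 9


def commit(message: int, randomness: int = None):
    """Return (commitment, randomness); the randomness is kept secret until opening."""
    if randomness is None:
        randomness = secrets.randbelow(Q)
    commitment = pow(G, message % Q, P) * pow(H, randomness, P) % P
    return commitment, randomness


def verify(commitment: int, message: int, randomness: int) -> bool:
    """Check an opening by recomputing the commitment."""
    return commitment == pow(G, message % Q, P) * pow(H, randomness, P) % P


c, r = commit(7)              # the committer publishes c and keeps (7, r)
opened_ok = verify(c, 7, r)   # later, revealing (7, r) opens the commitment
```

Hiding comes from the random r: for any fixed message, c is uniformly distributed over the subgroup. Binding comes from the committer not knowing how h relates to g.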
Okay, so first, the init method just sets up all the common parameters that both sides of the protocol are going to have. In particular, it has the party types, so are you going to refer to them as a sender and receiver or a client and server, and there's a state machine that you set up in your init method. This self.db just allows you to store data between different method calls. This setup is all the information that tells it which instance it is, whether it's a server or a client, and this add instance is just used in the initialization method. And it has a variety of different useful methods that you call to keep track of state, get state at different points, send messages, and receive messages.

This is one thing that was not entirely clear to me when I was first using it, and it takes a bit of work. This is probably the hardest part of actually using this protocol class. When you initialize the class, you can pass it a socket, either a server socket or a client socket, and then it will transmit the data. But Python objects aren't just raw binary; you need to serialize and deserialize them. And it turns out that oftentimes it isn't smart enough to just deserialize all the objects it knows about, so you have to implement some serialization and deserialization to get data back and forth. But once you have the appropriate methods, it just looks like passing data back and forth: one stage in the protocol will give you a return value, and that will end up being the input to the next step in the protocol on the other side, just transparently.

So one of the first types of protocols I'm going to talk about is called Sigma protocols. They're called Sigma protocols because of this three-part structure that, in cryptographers' minds I guess, looks like the lines of a Sigma, a Greek capital Sigma.
There are a number of these proofs where you want to show that you know something about a public value without revealing that information, and they follow this general structure: the prover sends some message a, the verifier sends back some string e, then the prover sends a z value, and the verifier will either accept or reject the proof based on the values of x, a, e, and z. So I'll give you a little bit of an example here. The first one is a discrete-log-based Sigma protocol. How can I prove to you that I know the discrete log of a given group element to a given base without revealing that discrete log? Here what we do is we start with a random exponentiation of our generator g, and that's our a value; we send that over. The verifier just sends back an e value. This two to the t less than the size of G just means it's some random value within the size of the group that we're working with. And then the prover sends back z equals r plus e times w. And if you look, since a is g to the r and h was originally g to the w, and the verifier is going to compute h to the e-th power, when you multiply those together, a times h to the e, that should be the same thing as what's on the left side, that g to the z value. So this allows you to hide the information about that original exponent in the new exponent value.

I'm not going to quite show that, because the Sigma protocols in Charm weren't really implemented with the intention of being independent protocols. It turns out that they use those for a different protocol called oblivious transfer. So I'll talk a little bit about oblivious transfer. The idea here is that, say, a server has multiple messages, and a client might want to know what one of those messages is, but they don't want to tell the server which message that is, and the server doesn't want to reveal any other message besides the one that the client's asking for.
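The three-move discrete-log Sigma protocol described above can be sketched in a few lines of plain Python. This is not Charm's Sigma protocol code; the function names are hypothetical, and the toy group (p = 23, subgroup order q = 11, generator g = 4) is an assumption for readability. The check is exactly the one from the talk: g^z should equal a * h^e.

```python
import secrets

P, Q, G = 23, 11, 4        # toy group: g has prime order q modulo p


def prover_commit():
    """Move 1: pick random r and send a = g^r."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def verifier_challenge():
    """Move 2: the verifier replies with a random challenge e."""
    return secrets.randbelow(Q)


def prover_respond(r: int, e: int, w: int) -> int:
    """Move 3: z = r + e*w, reduced modulo the group order."""
    return (r + e * w) % Q


def verifier_check(h: int, a: int, e: int, z: int) -> bool:
    """Accept if g^z == a * h^e (mod p)."""
    return pow(G, z, P) == a * pow(h, e, P) % P


w = 7                      # the prover's secret discrete log
h = pow(G, w, P)           # the public value h = g^w

r, a = prover_commit()
e = verifier_challenge()
z = prover_respond(r, e, w)
accepted = verifier_check(h, a, e, z)
```

Because r is fresh and uniformly random each run, z on its own reveals nothing about w; reusing r across two runs would let the verifier solve for w.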
And this, I think, is a very interesting privacy-preserving protocol. You can imagine a number of situations where you want to verify some information about, say, some domain, but you don't want to tell everybody that you want to check out this domain. It might be one where people will be like, well, why are you looking there? So using oblivious transfer is at least one way that I thought of that you could keep the privacy of this inquiry while still being able to satisfy the security considerations of doing validation on the domain. This text just basically says what I was trying to say: in the original definition it was just two messages, and it's since been expanded so that you can have n messages and allow for up to k requests. They implemented what's called an adaptive oblivious transfer protocol in here, and it'll demonstrate a little bit of how the protocol class is used, how it's subclassed in Python.

All right, so you can see the first couple of lines under the protocol's init are what I was talking about before, about how the protocol class maintains a state machine. The way this works is just by having a dictionary that maps which state you're in, which is just a number, to the method that you have to call for the next part of the protocol. Typically it just goes in order, one, two, three, four, but as you can see, the sender has the option of going from three back to three, or from three to five. So the dictionary below, these are the transitions. These basically are the nodes in your state machine, and these are the transitions between them. And here's where we add the different types of parties, the receiver and sender. You'll notice a lot of these protocols use pairing groups, so we use a pairing group here. And these methods right here are actually ones I added, because they didn't exist at the time.
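The two dictionaries described above, one mapping state numbers to methods and one listing allowed transitions, can be sketched without Charm at all. The class below is a hypothetical stand-in, not Charm's Protocol class; the state numbers 1, 3, and 5 and the "three back to three, or three to five" loop are taken from the talk, and everything else is an assumption for illustration.

```python
class TinyStateMachine:
    """Minimal sender-side state machine driven by two dictionaries."""

    def __init__(self, requests: int = 2):
        # Nodes: state number -> method to call in that state.
        self.states = {1: self.setup, 3: self.answer_request, 5: self.finish}
        # Edges: state number -> states that may legally follow it.
        self.transitions = {1: [3], 3: [3, 5], 5: []}
        self.current = 1
        self.requests_left = requests
        self.log = []                      # records the states visited

    def setup(self):
        self.log.append(1)
        return 3

    def answer_request(self):
        self.log.append(3)
        self.requests_left -= 1
        # Stay in state 3 while requests remain, otherwise move on to 5.
        return 3 if self.requests_left > 0 else 5

    def finish(self):
        self.log.append(5)
        return None                        # no next state: protocol is done

    def run(self):
        while True:
            next_state = self.states[self.current]()
            if next_state is None:
                break
            if next_state not in self.transitions[self.current]:
                raise RuntimeError("illegal transition")
            self.current = next_state


machine = TinyStateMachine(requests=2)
machine.run()
```

In a real two-party protocol each method would also return the data to send to the other side; here the methods only return the next state to keep the sketch short.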
When I first tried running this, I found that it would just crash, and I was like, why? It was raising a KeyError. What would happen is that, because the serialization methods weren't actually there, it was not actually sending and receiving data, and then it would move on and try to access data that it didn't actually have. So I had to go through and serialize the data. More or less what should work is you can pickle the data, so you can use loads and dumps. However, there are a number of the classes they have in here that pickle can't serialize. One of them being pairing groups; the integer groups are also ones that pickle just can't handle. But those objects have their own serialization methods, so you can just call the serialize method on those, and then it's just a byte string, and you can pickle that.

And then you can see that, in this case, the first step in the protocol for the sender, that's the one that has all the messages, is to set up these values, and when a receiver first initiates contact, it's going to say, okay, let's run Sigma protocol one, that's what it calls it. They group all the sender methods together, so you'll see that there's another one where it's going to do the proving for Sigma protocol one, then Sigma protocol two and three, and so on and so forth. And then down here are the receiver methods. It will create the Sigma protocol class and initialize it, and I had to extract the necessary information for me to actually construct the values that need to go into the Sigma protocol, in this case the public key. And it more or less works that way. At the bottom, they did allow this to be a script, although I couldn't actually get this to run properly here, mostly because in PyCharm I haven't found a way to run the same file with different parameters at the same time. Does anybody know how to do that? No, okay.
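The serialization workaround described above, converting each unpicklable object to a byte string first and then pickling the result, can be sketched as follows. `OpaqueElement` is a hypothetical stand-in for a C-backed group element and is not a Charm class; it holds a lock only so that pickle rejects it the way it rejects Charm's group objects. The helper names `pack` and `unpack` are likewise assumptions.

```python
import pickle
import threading


class OpaqueElement:
    """Hypothetical stand-in for a group element that pickle cannot handle."""

    def __init__(self, value: int):
        self.value = value
        self._handle = threading.Lock()    # unpicklable, like a C pointer

    def serialize(self) -> bytes:
        """The object's own serialization: a plain byte string."""
        return str(self.value).encode("ascii")

    @classmethod
    def deserialize(cls, data: bytes) -> "OpaqueElement":
        return cls(int(data.decode("ascii")))


def pack(message: dict) -> bytes:
    """Serialize each element to bytes, then pickle the dict of byte strings."""
    return pickle.dumps({name: elem.serialize() for name, elem in message.items()})


def unpack(blob: bytes) -> dict:
    """Reverse of pack: unpickle, then rebuild each element."""
    return {name: OpaqueElement.deserialize(data)
            for name, data in pickle.loads(blob).items()}


outgoing = {"pk": OpaqueElement(42), "a": OpaqueElement(7)}
wire_bytes = pack(outgoing)        # what would be written to the socket
incoming = unpack(wire_bytes)      # what the other party reconstructs
```

Only unpickle data from a party you trust, since `pickle.loads` can execute arbitrary code; for a real protocol a plain length-prefixed byte format is a safer choice.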
So I basically created two files that have the same functionality; one gets the one parameter, the other gets the other, and together they implement the protocol. And so, locks? Oh yeah, I mean, I'm not exactly sure, so I didn't end up doing that. What I did was create two separate files, so this one's called oblivious transfer receiver and this one's oblivious transfer sender. In this case, the receiver acts as a server, so let's hope this one actually works; it did work earlier today. So that's running the receiver, and then we'll do the sender, and wow, that went fast. So here we've gone through all the steps, and oh, that's the sender here. The receiver said, yep, it looks like the protocol completed. So with a little bit of work implementing serialization and deserialization methods, this protocol is basically ready out of the box in Charm.

So I'm now going to actually talk about that protocol that I mentioned earlier, mostly because I think it's pretty neat, and I think it would lend itself well to prototyping in Charm. When I actually did this, it was a while ago, about seven years ago, we did the entire implementation in C++, but we looked into a Python-based, well, it was a Python derivative called TASTYL, and TASTY stood for something about multi-party computation, I think. I mentioned that in my talk last year, so I forget what it stands for off the top of my head, but I think we always had problems getting it to actually work and compile. This should actually work; I haven't fully finished getting it to work, but I'm going to just quickly explain how the protocol works. What we're trying to do here is we have a server that has some body of text, and you want to run a query on it, some sort of wildcard-based expression.
You can think of it like a regular expression without allowing for variable-length matches, so you need to know exactly how many characters you're looking for. But you could in principle say, I want a pattern where the first character is A through F and the second character is B through G or something like that, and then maybe the rest of them are all fixed characters, or maybe you'll have an it-can-be-anything wildcard character, and this protocol should work. The idea here is that for each letter in your pattern, you construct what we call a CDV, a character delay vector. The idea is, let's assume that if we find that letter in the text we're searching, it's part of our pattern; then how far would we have to go until we reach the end of the pattern? We go that much further and put a one there, and in the end, for this pattern of T-A-C-T, tact, we'll get a four in the location where the pattern is found. So I'll just go through it real quick. When you construct the CDV, you'll see you first have a T here. That corresponds with this one here, the one at the end, because that's saying that if this is that T, then we have to wait three more characters before we get to the end of our pattern. And then if you get to the A, that means, okay, if that was part of the pattern, then we have to wait two more characters. The C corresponds to this one, and this T corresponds to this one right here; that's saying we're there, right? If the T we're on is that one, we're done. And then what you do is you start off with what we call an activation vector, and we go through each letter in the text, look at the corresponding character delay vector, add that chunk to the vector, and then shift one over and keep moving forward. So G had an all-zero vector, so nothing happens here.
The A has a zero, zero, one, zero, so we get a one over here. There's this T, another T here, A, C, T, okay. And what happens is, see, this one, as I said, corresponds to the end of the pattern, and it landed right here, where the value is the length of the pattern you want. If you want a wildcard, you actually do that by looking for one less. So say instead of looking for tact, I just wanted T, A, and then it could be anything, A, C, G, or T; then instead of looking for a four, I'd look for a three. And this actually comes from what I think are called time-delay neural networks. That's where this terminology of activation vectors comes from. The whole idea being that the activation vector is how much time is delayed between when it sees the input and when the neuron fires. But my colleagues realized we could sort of borrow that concept and apply it here in this way.

So in this protocol, what we do is we use an additively homomorphic encryption scheme. It can be Paillier, additive ElGamal, or elliptic curve ElGamal. One of the things that's nice about this protocol is that we make a few tweaks to allow us to use those bottom two. And that is that instead of actually looking for four, we subtract the four that we're looking for in every single spot, and that means we're really looking for zero. So what happens in our secure version is we do all this homomorphically. The server will generate this initial activation vector by creating encryptions of all zeros, and they'll all look different because it's a probabilistic encryption scheme. The query that you send is going to be encryptions of all these values, so the server won't be able to tell whether this is a zero or a one. But because of the homomorphic nature of it, you can add and effectively subtract. And so we'll get back down here where this one will be zero.
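The unencrypted version of the matching step walked through above can be sketched directly in Python. This is an illustration reconstructed from the description in the talk, not the speaker's code: the function names are hypothetical, the wildcard symbol `*` is an assumption, and the sample text "GATTACT" is chosen to match the letters read out on stage. For the pattern TACT, the T vector comes out as 1, 0, 0, 1 and the A vector as 0, 0, 1, 0, as in the talk.

```python
def build_cdvs(pattern: str, alphabet: str, wildcard: str = "*") -> dict:
    """Character delay vectors: for each pattern letter, put a 1 at the
    number of characters still to wait until the end of the pattern."""
    m = len(pattern)
    cdvs = {ch: [0] * m for ch in alphabet}
    for i, ch in enumerate(pattern):
        if ch != wildcard:                 # wildcards contribute nothing
            cdvs[ch][m - 1 - i] = 1
    return cdvs


def activation_vector(text: str, cdvs: dict, m: int) -> list:
    """Slide over the text, adding each letter's CDV at its own offset."""
    av = [0] * (len(text) + m - 1)
    for j, ch in enumerate(text):
        for delay, bit in enumerate(cdvs[ch]):
            av[j + delay] += bit
    return av


def find_matches(text: str, pattern: str, alphabet: str = "ACGT",
                 wildcard: str = "*") -> list:
    """Return start positions where the activation reaches the target count."""
    m = len(pattern)
    target = sum(1 for ch in pattern if ch != wildcard)   # 4 for TACT, 3 for TA*T
    av = activation_vector(text, build_cdvs(pattern, alphabet, wildcard), m)
    return [end - m + 1 for end in range(m - 1, len(text)) if av[end] == target]


cdvs = build_cdvs("TACT", "ACGT")
exact = find_matches("GATTACT", "TACT")      # pattern starts at index 3
wild = find_matches("GATTACT", "TA*T")       # same hit, looking for 3 instead of 4
```

Each fixed pattern position can add at most one to any given slot, so the target count is reached only when every fixed position matches.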
And then this protocol, secure text pattern matching, adds a few extra steps to make sure that we don't leak any additional information. What we also do is multiply by random values, so everything else becomes random except for the zeros, because anything times zero is zero. And then you can also shuffle the values around, so ultimately what you find out is how many times that pattern showed up in the text, but not where. Without those steps, in the protocol as shown right here, you could get some information about how far off the text was from your pattern at each location. So the subtracting by four is just to allow us to do a blinding step of multiplying, which destroys the information in the other values, and then the shuffling eliminates the information about where you found it.

I was hoping to actually demo this in Charm, but those serialization issues have been dogging me, so I don't quite have it working right now, but I can show you basically the structure of it. Okay, so I wasn't sure if there was any way for me to have just one step on the server, because really there's only one step. Each step is bracketed by communication, so we'll either receive information and then send, or do everything inside that. Since all that happens is the server receives an encrypted request and then returns an encrypted reply, there's really only one step here, but I put basically a dummy step here that just returns None. But okay, let's go back up to the init method. Here I again have my states, my transitions, add parties, do all that kind of stuff.
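The subtract, blind, and shuffle steps described above can be simulated on plain integers to see what the client ends up learning. In the real protocol these operations happen on ciphertexts using the homomorphic properties; this sketch works on plaintext values modulo a prime, and the names `blind_and_shuffle`, `count_matches`, and the choice of modulus are assumptions made for illustration.

```python
import random
import secrets

MODULUS = 2**61 - 1     # a prime, standing in for the plaintext space


def blind_and_shuffle(activation: list, target: int) -> list:
    """Subtract the target, multiply by a random nonzero value, then shuffle."""
    blinded = []
    for value in activation:
        r = secrets.randbelow(MODULUS - 1) + 1        # random in [1, MODULUS - 1]
        blinded.append((value - target) * r % MODULUS)
    random.SystemRandom().shuffle(blinded)            # hide the positions
    return blinded


def count_matches(blinded: list) -> int:
    """The client only learns how many entries are zero."""
    return sum(1 for value in blinded if value == 0)


activation = [0, 1, 1, 2, 0, 1, 4, 1, 0, 4]   # example values, two exact hits
result = blind_and_shuffle(activation, target=4)
hits = count_matches(result)
```

Because the modulus is prime and the random factor is nonzero, an entry becomes zero exactly when the original value equaled the target; every other entry turns into a value that says nothing about how close the text was to the pattern.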
If you noticed, the diagram I had before was using an alphabet of A, C, G, and T, and that was because one of the original applications of this protocol was privacy-preserving searches of DNA. I think one of the applications they suggested was that you might have some researchers who know some gene mutations that make you more prone to cancer. You'd like to know, hey, am I more likely to have cancer, but you don't want other people to know what the answer is, right? That could mess with your insurance or anything like that. So you get your genes sequenced, you encrypt your DNA in this special way, the response gets sent back, and then you can see if the appropriate genes match any of the patterns they have.

In this case I just tried to keep it to a relatively small alphabet of just the ASCII letters, digits, some white space, and punctuation, but I also allow you to specify the alphabet if you want. The init method has two other parameters. One is a text file, so that's the text that's on the server side, and the other is the query. Only the server is going to fill in the text file, and only the client is going to fill in the query parameter. Then I just have a few helper methods here: this one constructs the character delay vectors, this one does the encryption, and this one does the searching on the server side, so it does all the homomorphic operations, and it does the blinding and the shuffling. Then for the actual steps of the protocol, the client first generates a key, constructs the character delay vectors, and sends the character delay vectors, the pattern length, and the public key. What it expects back is an encrypted activation vector, and it just does a search through that to see whether any of those values decrypt to zero. If one does, it'll print pattern found, otherwise pattern not found. Unfortunately my demo's not quite working, so I'll basically leave it there. I think we have a few minutes for questions if anybody has any, but other than that I think that's all I've got, so thank you.