 Alright folks! Let's get started. Happy with breaking Senate 3? Crypto? You have the power now. You can break stuff. Are you becoming dangerous in your own right? Cool. What was that? I said I only broke my own Senate bill. Well, you seem to be here, so that's fine. Unless we're all in here or anything, you're probably okay. Sanity is, I don't know, you can get more. Alright, cool! Let's keep going. So I know John went over his lecture on Thursday. I just posted that recording today, so if you weren't there, you can review it. We got a lot of jam for today, so we can get to the next really fun motorcycle. He said it's pretty good. Okay. So, what properties of crypto systems have we seen so far? What are these encryption algorithms? What do they allow us to do? Okay, so in some sense we're keeping messages secret so that we can transmit them to other people, which gives us the nice property of confidentiality. We saw how, with public key crypto systems, we can get non-repudiation: by signing something with your private key, everyone else can check it with your public key. But we didn't really talk about integrity. So why is integrity important? Why is it important to make sure that the message that was sent is the same message that is received? Sure, but it's confidential, right? So nobody else can read it. So what do I care if somebody else messes it up? So maybe, say that again? Okay, interesting. So it's more the human element: trust in the system, trust in the crypto system. What else could go wrong without, let's say, integrity? So you just have confidentiality, right? We have Alice and Bob exchanging messages, and you have Eve there, who can see all the messages that get sent across. Yes, that's right.
So if you potentially had a banking system where messages got sent with a transfer of funds, and it was encrypted in a way that you couldn't read it, but you knew the structure of it, say, where the amount would be, you could then change those bytes to be a different amount than the originally intended amount. Okay, so let's say it's even using a very weak, terrible cipher that, as you know, you can break: a Vigenère cipher. If you know roughly what the message is, even if you don't know the key, you can flip some bits in the ciphertext. Without actually reading the message, you could change what the decryption result is and what the amount is, so the transfer, instead of $500, becomes $100,000 or something. Yeah, that could be one possible attack. Any other possible attack? Yeah. Maybe I could just turn the real message into garbage. Yeah, maybe I flip a bit and now what the other person receives is garbage, right? So how do they know you didn't actually mean to send them garbage? How is the recipient of the message supposed to know whether it's a garbage file, or maybe it's further encrypted, or I don't know, right? So you could mess with the availability of the message that way. What else? Any other thoughts? Yeah, so maybe they can manipulate bits to randomly change the decryption result. What about, going back to the banking example, what if Eve sees this message from Bob? We can say it's Venmo or whatever, and Bob's paying Alice $500, right? What if Eve takes that message, doesn't know the contents of it, maybe she doesn't know what it's doing, but what if she sends that message again, and then sends it again, and then sends it again? How does Alice know that it wasn't four transfers of $500 and that it was just one? Is there anything we can talk about that can help solve these problems?
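The bit-flipping attack on the bank transfer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the cipher from the lecture: it swaps the Vigenère cipher for a toy XOR stream cipher (where the effect is easiest to see), and the message layout, the `AMOUNT_OFFSET` position, and Eve's guess of the original amount are all hypothetical.

```python
import os

def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    """XOR two equal-length byte strings (a toy stream cipher)."""
    return bytes(d ^ k for d, k in zip(data, keystream))

# Alice and Bob share a random keystream; Eve never sees it.
plaintext = b"PAY ALICE $000500"
keystream = os.urandom(len(plaintext))
ciphertext = xor_bytes(plaintext, keystream)

# Assumption: Eve knows the message layout (the amount starts at byte 11)
# and guesses the original amount. She XORs the difference between the old
# and new amounts into the ciphertext without ever decrypting it.
AMOUNT_OFFSET = 11
old_amount, new_amount = b"000500", b"100000"
tampered = bytearray(ciphertext)
for i, (old, new) in enumerate(zip(old_amount, new_amount)):
    tampered[AMOUNT_OFFSET + i] ^= old ^ new

# Bob decrypts with the real keystream and sees Eve's amount.
received = xor_bytes(bytes(tampered), keystream)
print(received)
```

Confidentiality held the whole time (Eve never learned the keystream), yet the decrypted amount changed. Nothing in the ciphertext lets Bob notice, which is the gap an integrity check is meant to close.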
Maybe, along with the encrypted message, some kind of metadata that they send with it? Yeah, so we need something. We need some additional metadata, and really what we're trying to get at here is that we need some way to verify the integrity of a message, right? We might ask: what if an attacker flips a bit, or what if a bit is corrupted? How can a bit just become corrupted? Packet loss, maybe the packet gets dropped along the network. What was that? Wear and tear on the server. So wear and tear on the server: maybe the memory modules are bad and one randomly flips a bit, or maybe a gamma ray comes in and flips a bit in memory, either before it's sent or while it's in transmission. And so we want some kind of mechanism that allows somebody who receives a message to verify the integrity of the message. And we want to do so in a way that we can actually trust this process. So the idea here, and we'll be very specific when we talk about this, is that we're going to talk specifically about cryptographic hash functions. So, have you talked about hashing yet? In what class? Cool. So the idea behind hashing, you know, it's kind of an interesting problem. How can I have you trust that this message is exactly the message that I think it is? One simple, stupid way would be: I send you the message secretly, and then I transmit the same message, and you can verify that the two are identical. What's the problem with that? Right, if they already knew the message, then there's nothing left to verify. So the idea is we'll use functions that map arbitrary-size data (we don't want any limits on the amount of data that we can verify the integrity of) to a fixed-size bit string as output. Okay, so what does this mean? You're going to say, well, it's always going to output 25 bytes, or whatever number of bytes, and they're going to be based on the original message to some extent, but it's not as efficient.
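To make "arbitrary-size input, fixed-size output" concrete, here is a small sketch using SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 and the sample inputs are assumptions for illustration; the lecture has not named a specific hash function at this point.

```python
import hashlib

# Inputs of very different sizes: empty, 5 bytes, and one million bytes.
inputs = [b"", b"hello", b"x" * 1_000_000]

digest_lengths = []
for data in inputs:
    digest = hashlib.sha256(data).digest()
    digest_lengths.append(len(digest))
    # Print the input size, the digest size, and the start of the digest.
    print(len(data), len(digest), digest.hex()[:16])

# A hash function is a function in the mathematical sense:
# the same input always produces the same output.
is_deterministic = (
    hashlib.sha256(b"hello").digest() == hashlib.sha256(b"hello").digest()
)
print(is_deterministic)
```

Every digest is 32 bytes (256 bits) regardless of how much data went in, and hashing the same bytes twice gives the same digest.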
Sure, so what do we mean by a function? It takes an input and deterministically produces an output. Yes, so like the mathematical concept of a function, right? There are no side effects. You have input that goes in and output that comes out, and if you give it the same input it will always produce the same output. What properties would I want from this hash function? Let's say I use a hash function, and you said, what, 25 bytes? Let's say I have a hash function that takes the first 25 bytes of the file, and that's the hash of the input. It should be what? One-to-one, so that if you map one input, it maps to one specific output, and you know what input you're getting back. Okay, so it should be a one-to-one mapping. What would that mean? Every input maps to a unique output. Okay, so if I have 25 bytes of output, how many inputs can I possibly have? Yeah, not 25 bits, but 2 to the power of 8 times 25, whatever that number is. Does that satisfy my "arbitrary size" definition? What if I feed in more data? What if I want to be able to hash arbitrary data, and the number of possible inputs is larger than 2 to the 8 times 25? Yeah. Right, so I get several properties from this definition of what I'm looking for. It has to be the case that some different inputs will hash, or map, to the same value. If you want to have an arbitrary-size input and a fixed-size output, that is fundamental; it's going to happen. So what's the problem with taking the first 25 bytes of the input and saying that's my hash? Does that violate this property? I can give arbitrary-size data to it. I can also make a hash function, based on this definition, that just outputs 25 zero bytes. Then every input maps to that one value. What are we trying to get from this hash function? Yeah, we want to be able to see if the input has changed. So if I have all of the inputs map to one value, does that help me detect whether something has changed? What's the problem with using the first 25 bytes of the file?
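The two toy "hash functions" from this discussion are easy to write down, which makes their weakness easy to see. The sketch below uses hypothetical names (`first25`, `all_zeros`) and made-up messages; both functions satisfy the bare definition (arbitrary-size input, fixed 25-byte output) and both fail to detect a change.

```python
import hashlib

def first25(data: bytes) -> bytes:
    """Toy hash: just the first 25 bytes of the input."""
    return data[:25]

def all_zeros(data: bytes) -> bytes:
    """Toy hash: 25 zero bytes, no matter what the input is."""
    return bytes(25)

original = b"Transfer $500 to Alice's account."
# The attacker keeps the original text and appends a new instruction.
extended = original + b" Ignore that; transfer $1,000,000 to Eve."

# Both toy hashes report "no change" even though the messages differ.
print(first25(original) == first25(extended))
print(all_zeros(original) == all_zeros(extended))

# A cryptographic hash (SHA-256 here, as an assumed example) does notice.
print(hashlib.sha256(original).digest() == hashlib.sha256(extended).digest())
```

Any change made after byte 25 is invisible to `first25`, and every change is invisible to `all_zeros`, so neither is of any use for integrity checking.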
What if the corruption occurred later on? Yeah, what if the first 25 bytes are the same, and then I just add on to the message: "ignore what I first said, I really want you to transfer a million dollars to this other account." The hash is the same, right? The hash is the same value. We'll see specifically how we use this, but we want to understand some of the properties of this function. You can also look at this as a trapdoor, although I don't know, I'm not very familiar with trapdoors. But we want a one-way function. What does this mean? Difficult to reverse, right? So if I have an input I can easily get an output, but if I have some output, it's difficult to go backwards and derive the input that generated that specific output. You talked about this with the one-to-one mapping, where we could determine the input. And somebody else mentioned this property, right? A small change in an input bit should completely change the output. Why do we want this? For a bit-flipping attack. Right, so with the bit-flipping attack, if that only changes one bit of the output, that probably leaks some information about what the input data is. I was going to say, so it would be hard to spoof part of what you were saying. Yeah, so this is part of it, right? To spoof, or to find a different input that maps to the same output: if changing one input bit caused only a small change in the output, that could make it a lot easier. Okay, so how are we actually going to use this in practice? I think maybe you got the feeling of this while Jan was doing live Python demos, but public key crypto is fairly expensive. Why is that? So for some of the RSA operations, yeah. Just because it takes a long time to build the key from two primes. So that's creating it, right? There's the creation of the public and private keys, and that definitely takes a long time, as you'll find out shortly. But even once you have that, to actually encrypt a message, what do you have to do?
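The "small change in, completely different output" property described above (often called the avalanche effect) can be observed directly. This is a sketch assuming SHA-256 as the hash function; the message and the helper name `bit_difference` are made up for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

m1 = b"Bob pays Alice $500"
m2 = bytearray(m1)
m2[-1] ^= 0x01          # flip a single bit: the final '0' becomes '1'
m2 = bytes(m2)

input_bits_changed = bit_difference(m1, m2)
output_bits_changed = bit_difference(
    hashlib.sha256(m1).digest(), hashlib.sha256(m2).digest()
)

print(m2)
print(input_bits_changed, "input bit changed")
print(output_bits_changed, "of 256 output bits changed")
```

For a well-behaved hash the second number lands near half of the output bits, so the digest of the altered message tells an attacker essentially nothing about how close they are to the original.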
A bunch of math. Specifically, you're doing exponentiation, right? So if you think about a CPU, what operations are fast? Additions, subtractions, binary operations: XORs, ANDs, NOTs. Multiplication is a little bit slower, division is even slower, and exponentiation is slower still, especially when you're exponentiating these very, very large numbers. Because of that, we want to minimize the amount of data we need to run through these public key crypto systems. So let's say that Alice wants to make a statement and have everyone know that it's from Alice, right? We've already talked about how we can do this. What's one way to do this? Use her private key to encrypt the message, and then everyone else can decrypt that message to verify that it's from her. What if this message is, whatever, a video she just took of her trip around the world, and it's four gigabytes? Do you want to do exponentiation across a message that's four gigabytes in size? It would be insane, right? It would be very, very, very slow. So how can hashing maybe help us out here? Another way to think about this: when Alice encrypts the message with her private key, does she have any confidentiality guarantees once she shares that message? No. Why not? Because her public key is public and anyone can decrypt that message. So why go through all the hassle of this expensive encryption operation? What is she actually trying to protect? Yeah. Yes, that this message is exactly what she said it was when she signed the message, when she encrypted it with her private key. Could she use a hash function for that? Could she create the hash on her side? She knows what the hash should be, and then as long as the receiver uses the same function, they should be able to verify it. Cool. Okay, let's go with that. That's good.
So one thing that we can do is Alice computes the signature of M, so she computes a hash of the message, and then what does she do? Let's walk through it. So then she would make that public so that anybody could check it. So she makes the signature public along with the message. And then whoever receives the message can run the message through the same hash function, so they can generate the same signature. Perfect. So then Alice takes her signature of the message and the message itself and passes them to Bob. So what does Bob receive now? The signature and the message. And then what does Bob do? Hash the message. He hashes the message with the same hash function and then verifies what? That the signatures are equal. So now does Bob know that this message came from Alice? Whether that's a video or a message, what could our attacker Eve do in this scenario to alter the message? The hash function. Say it again? Yeah, so she creates some new message. So this signature of M and the message itself get passed from Alice to Bob. Eve gets them, has her own message, which we'll call M prime, and then what does she do? She makes a new hash; she calculates her own signature of M prime. So now we have a signature of M prime, and Eve sends the signature of M prime and the message M prime to Bob. And then what does Bob do? He takes the message M prime, hashes it, compares it with the signature, and says, this is great, this message hasn't been tampered with. So what's the problem? Yeah? So if that hash is a fixed size, can we encrypt that? Yeah, so if the hash is a fixed size, we can encrypt that with what? Alice's private key? Do you agree with this scheme? Or not Alice's: so that nobody can modify it in between, she could encrypt it with the recipient's key. So it depends on our goal, right? We'll go with the second scenario right now: she wants to make a statement and have everyone know it's from Alice. So she wants everyone to be able to check.
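The failure of the "send the message plus its bare hash" scheme can be sketched as follows. SHA-256 stands in for the unnamed hash function, and the function name `bob_accepts` and the messages are hypothetical.

```python
import hashlib

def h(message: bytes) -> bytes:
    """The public hash function that Alice, Bob, and Eve can all compute."""
    return hashlib.sha256(message).digest()

def bob_accepts(message: bytes, tag: bytes) -> bool:
    """Bob's check in the naive scheme: recompute the hash and compare."""
    return h(message) == tag

# Alice sends (message, hash of message).
message = b"I, Alice, owe Bob $5."
tag = h(message)

# Eve intercepts, writes M prime, and simply recomputes the hash herself.
forged_message = b"I, Alice, owe Eve $5,000."
forged_tag = h(forged_message)

print(bob_accepts(message, tag))                # the honest pair passes
print(bob_accepts(forged_message, forged_tag))  # Eve's pair also passes
print(bob_accepts(forged_message, tag))         # only a mismatched pair fails
```

Because no secret is involved in computing the tag, the check only catches accidental corruption. It says nothing about who produced the message.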
So what key would she want to encrypt that hash with? Her private key, right? And let's say the output of this hash function, well, 25 bytes is terrible, but we'll say it's 512 bytes or something. Let's not get hung up on the number of bytes. But now in this scheme, Alice takes her secret key, encrypts the hash of the message, and calls that the signature of the message, and then sends Bob the signature of the message and the message M. What does Bob know? So Bob gets the signature of the message and M, and now what does he do to verify this? He would use Alice's public key to decrypt it and then run the hash and check for equality. Yeah, he takes Alice's public key and decrypts the signature of the message, which gets him the hash of the message, and then he runs the hash function on the message himself and verifies that they match. So if they're equal, then what does Bob know for certain at this point? That the message was signed by Alice, so the message came from Alice, and the message content is exactly what Alice says it should be, because the hashes match. Yeah. So how does this solve the Eve problem? Because Eve does have Alice's public key too. Yes. So Eve could still decrypt this and get the hash back, and I don't see how that changes anything. Perfect. Okay. So now we assume that Eve is between these two people, Alice and Bob. Right? So Eve, again, just like before, can create her own new message M prime, and she can create a hash of M prime, but then what does she do? She can't re-encrypt it with Alice's key. Right. So the key problem for her is that she can't encrypt it with Alice's secret key. Right? She can encrypt it with her own secret key, but then she has to trick Bob into thinking that her key is Alice's key. If Bob uses Alice's public key, the verification will fail on the signature of M prime. Is that your question? Yeah.
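The hash-then-sign scheme described above can be sketched with textbook RSA. This is an illustration only and makes several assumptions: the primes are two small Mersenne primes chosen for convenience, the digest is reduced modulo `N` so that it fits the toy modulus, and there is no padding. Real signature schemes use much larger keys and a padding scheme, so none of this is safe for actual use. `pow(E, -1, ...)` requires Python 3.8 or later.

```python
import hashlib

# Toy key generation (assumption: fixed small primes, for illustration only).
P = 2**61 - 1                      # a Mersenne prime
Q = 2**89 - 1                      # another Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # Alice's private exponent

def digest_as_int(message: bytes) -> int:
    """SHA-256 digest as an integer, reduced mod N to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Alice: 'encrypt' the hash of the message with her private key."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Bob: 'decrypt' with Alice's public key, then compare to his own hash."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"I, Alice, owe Bob $5."
signature = sign(message)

# Eve swaps in M prime but cannot produce a matching signature without D.
forged_message = b"I, Alice, owe Eve $5,000."

print(verify(message, signature))
print(verify(forged_message, signature))
```

The expensive exponentiation runs over a short, fixed-size digest rather than over the whole message, so the cost of signing no longer depends on whether the message is an email or a four-gigabyte video.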
So yeah, and again, the entire security of this model is predicated on it being difficult to create a new message M prime that hashes to the same thing as the original message. Right? If that were trivial, just like we said, then think about this signature: we can't alter the hash inside it, but if we could create a new message M prime that hashes to the same thing, then we could substitute our own message for M. Does that make sense? Yes. Yes. So you can define a hash function very broadly as a function that maps arbitrary-size data to a fixed-size bit string. That's just the hash function part; the cryptographic part adds other properties, which we'll talk about in a second. So for instance, over Ethernet cables, at that layer of the network, there's a check, CRC-32, which is a type of hash function that maps the contents of the packet to a fixed-size value. That's just used to detect flipped bits, because you can trivially find a new message that maps to the same CRC value. So it maps arbitrary-size inputs to a specific small size, I think 32, it must be 32 bits. So here you have a hash function that maps to 32 bits, but it doesn't have any of the nice cryptographic properties. Other hash functions do have those good properties. Does that answer your question? Any other questions? So is Alice sending the plaintext M to Bob? Yes. Because Alice doesn't care. In this scenario Alice wants everyone to know this message and everyone to know that this message comes from her. So she doesn't care about keeping the message secret, necessarily. This actually has a lot of uses in everyday computing. I don't think I'm using it right now, but sometimes I've used email encryption to sign my messages. So here you want to send an email to somebody, and you don't care if anyone can read the email, but you want them to be able to verify that it actually came from you. Files, too. Does anybody use, is it ZFS that uses this a lot?
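To see why a 32-bit checksum like CRC-32 is fine for catching accidental bit flips but useless against an attacker, the sketch below finds two different messages with the same CRC-32 by plain random search. It uses `zlib.crc32` from the standard library; the 8-byte random messages are an arbitrary choice, and smarter, direct constructions exist that this sketch does not attempt.

```python
import os
import zlib

# Remember every CRC value seen so far and the message that produced it.
seen = {}
attempts = 0
while True:
    attempts += 1
    candidate = os.urandom(8)
    crc = zlib.crc32(candidate)
    if crc in seen and seen[crc] != candidate:
        first, second = seen[crc], candidate
        break
    seen[crc] = candidate

print(first.hex(), second.hex())
print(hex(zlib.crc32(first)), hex(zlib.crc32(second)))
print("attempts:", attempts)
```

With only 2 to the 32 possible outputs, a repeat typically shows up after a number of tries on the order of tens of thousands or a few hundred thousand, which takes well under a second on an ordinary machine. A 256-bit cryptographic hash puts the same search far out of reach.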
Anybody use or heard of the ZFS file system? It's a file system that hashes all of the blocks on the file system, so all the parts of your files are actually integrity checked. Why is that important? Because, surprisingly, hard drives can have silent failures. They will just corrupt or alter some chunk of a file and you'll have no way of knowing. But this way you can actually detect that, and if you have redundancy you can recover from it. It will verify the whole file system every week or whatever to detect these things. We'll see how this is used in password verification, so this is a little preview. The key idea is that a hash function is one-way. So if somebody were to steal our database of usernames and passwords, we don't want them to be able to extract everyone's passwords. If we hash the passwords in a secure way, it's not as simple as just reading them out, and that improves the security of stored passwords. The idea is also used in proof of work. So all of the blockchain and consensus algorithms, most of them, do some form of proof of work, and you do that by generating input data such that the output of the hash function on your input has a certain number of leading zeros, and it gets exponentially more difficult to find that input data as that number grows. Also, if I'm going to share a Git commit with somebody, the commit identifier is actually a SHA hash (SHA-1 by default, with SHA-256 as a newer option) of essentially the state of your source control system, the state of that repository. So that's what all of those hashes are. The reason they can use that is because they know the hashes will be unique and that collisions are difficult to find. So, on to the properties that we need from our hash functions in order for them to be cryptographic, so that we can use them to actually verify message integrity and in other ways. One of the basic things is what they call
preimage resistance: if I give you a hash value H, this is the difficulty of going backwards. It should be difficult to find a message such that, when you hash it, it equals H. So if my hash output size is 256 bits, is it impossible to find a message M that hashes to that value? Theoretically, it's not impossible. How do you do it? Just brute force. And how many times do we have to try? It's a 256-bit output; the hash size is 256 bits. Yeah, so 2 to the 256, which is very large. I don't know what it is offhand, but that's a pretty large space you have to search, right? Maybe you'll find it sooner, who knows, but you should expect to try on the order of that many inputs. And then even when you find one, are you guaranteed that this was the message that was actually used to create this hash value? No. Why not? I was going to say, aren't there instances where that doesn't really matter? Like, for example, when you just need something that produces that hash? You still don't want somebody to be able to take the hash and work out the password, because that's the point. Oh no, what I'm saying is, if someone's trying to get access, they could find an equivalent string that produces the same hash, and in that case it doesn't matter whether it's the original one anyway. So yeah, in that sense, as long as they find another string that hashes to that same value, it's fine for the attacker. There are other things that we want from our hash functions. We want that if I give you one message, a message M1, it should be difficult to find an M2 such that the hashes of the two are the same, right? Again, we know that because there's this fixed-size output, there must exist some other message that hashes to the same value. So if the output of my hash function is 256 bits, it had better be the case that I have to try on the order of 2 to the 256 other messages to find a match. Can somebody work out what 2 to the 256 is?
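The size of that search space is easy to compute with Python's arbitrary-precision integers. The guessing rate below (`GUESSES_PER_SECOND`) is a hypothetical figure chosen only to give a sense of scale.

```python
# Number of possible outputs of a 256-bit hash function.
search_space = 2**256
digit_count = len(str(search_space))

print(search_space)
print(digit_count, "decimal digits")
print(f"roughly {search_space:.3e}")

# Assumption: an attacker who can test one trillion candidates per second.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
years_to_exhaust = search_space // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)
print(f"about {years_to_exhaust:.1e} years to try every value")
```

A 78-digit number is a little over 10 to the 77, and even at a trillion guesses per second, exhausting that space would take vastly longer than the age of the universe.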
10 to the 77. 10 to the 77, yeah, that's a lot. And you can use a hash function with a bigger output size, like 512 bits. We also want, and this is actually a subtly different twist here, that it should be difficult to find two messages M1 and M2 that hash to the same value. So what's the difference between second preimage resistance and collision resistance? The other could be more like incidentally generating the same result, I guess. So which gives the attacker more freedom and more power? Yeah, so it's kind of interesting, right? With collision resistance, the attacker gets to choose both M1 and M2, whereas with second preimage resistance you're given M1, so you don't have any freedom in deciding what M1 is. So that actually makes the attacker's job more difficult, because they're trying to hit a specific value, right? So here are some of the interesting ways this has actually come up. Second preimage resistance would be the case in the Eve scenario, where she steals the message and tries to create a new message M prime that hashes to that same value, because she has that signature. So she's trying to create a message that maps there. And the interesting thing about collision resistance: a trick people have done before when they break a hash function is to find two messages that say, let's say, opposite things, like who's going to win the Super Bowl, the Patriots or somebody else. So you can make two different messages that say the Patriots win or the Patriots lose, and if you can make those hash to the same value, you can publish the output of the hash function to the world and say, here's my hash, I'm going to tell you what this means afterwards. Then after the Super Bowl, depending on what the outcome is, you release the matching message, since they both hash to the same value. So people have done this when breaking things like MD5. You can do this with PDFs, two different PDFs that
hash to the same value. That's kind of crazy. So another important place where hashing is really used is, oftentimes, on a website. So our scenario is that the web server wants to recognize every web browser, every person's browser. When you go to a website, it usually asks your browser to store some little chunk of information that allows the website to see who you are later. This is the core idea behind cookies, if you've ever seen cookies. You go to the website and say, hey, I'd like to access your website. They say, great, here it is, and by the way, store this little bit of information, this cookie, and the next time you talk to me, send that back to me. This way the web server can link together all of the requests that you've made to it. And sometimes I want to do things like store "the user ID is 50." So what's the problem here? What's the problem with this scenario? Yeah. Well, I mean, if you're storing it on a user's computer, then you can't trust it. Yeah, so fundamentally the web server is trusted, but your machine is not trusted. And this is actually kind of the exact setting of the last button I'm using, but I think the submission site is a similar style, so if you look at the cookies there, it maybe shows your user ID, I'm not sure. Which seems insane, because you can probably alter your user ID: you have access to your computer and you control all the data that's on there. So you go into your web browser and you change the cookie value to change your user ID to another user, like user one, which is me. Do we want that to happen? No. And this is actually something that's really fun. I think now you can just right click and do "inspect element," and one of the tabs in the developer tools will show you the cookies that are stored for that particular site. So this is actually interesting. What do we want to have? What if I just took this value and hashed it, and said, okay, we store this user
ID equals blah blah, and the hash of that information? If you're using serial user IDs, most likely you don't have that many users, so brute forcing would be easy. Or, more simply: the web server has to verify, yes, this is a valid cookie from me. If all it's doing is hashing that message and comparing it with the hash that you sent, you could very easily create a new user ID, hash it, and send that hash back to me. And for kind of interesting architecture reasons, usually we don't want to have to worry about private keys. We could use a signature; we could sign this value with the private key. Technically you could do that, but there's actually another interesting idea: using just an actual key, a random value, so that we get both authentication, meaning we can see that we actually were the machine that made this value, and integrity from the hashing. So basically the key idea is, how do we combine this key that we have with the actual message that we want to store? What we're after is called a message authentication code. One thing to think of is: we take the key, which is some random value, and the double bars here mean append, so we append the message to the key, hash that, and send the result back to the user along with the message. So now if we use something like this, why can't somebody just change the message and re-run the hash function? Because they won't know the key, right? They won't know the key. The key stays on the server and never leaves the server. So it relies on the secrecy of the key. So what if my key was just a name? You could guess it, right? So yeah, the whole security here depends, and again this comes back to searching, on how random the key is. If it's just in a dictionary, how many words are there in a dictionary? How many people's names are there? Is it 10 to the 77th? No, it's in
the... there's probably, what, 20,000 words, maybe 30,000 words, and then add names, and maybe add different cultures, and you're up to maybe 100,000 guesses, right? Computers are very fast. You can make 100,000 guesses very quickly. You can't make 10 to the 77th guesses very quickly; it's just a huge number. So let's assume, excuse me, let's assume our key is randomly generated. The interesting thing here is that this seems like a scheme that should be secure, because we can calculate the MAC of our message based on a key, we can give somebody the message and the MAC, and then when they give it back to us we can verify that we actually generated it, because we can re-run the hash function with this key and they cannot. It turns out that, depending on the hash function, you can actually extend the message. The way that some hash functions work, like MD5, is, I don't know, it does a complicated thing kind of similar to what we saw with DES, where the input comes in and it propagates, moves bits around, does all this kind of stuff, and what it outputs is the state of the hash function at that point in time. So it feeds all of this input to the hash function, that changes its internal state, and the internal state is the output of the hash function. The interesting thing is that you can load this internal state back into the hash function to recreate it at exactly the point it was at the end of this message. So then you can keep adding data to the end; you can actually extend the message and add more things to it without knowing the key. It's really crazy. I'm telling you this so you can look it up; this is really interesting. So you might say, fine, we'll take the message and then append the key instead. And it turns out that, again due to issues with different hash functions, you can more easily find messages that hash to the same value without knowing the key. So there's an interesting way of doing this properly: a hash-based MAC, an HMAC, so you can think of it as a
function that takes in a key and a message. It either extends or shortens the key based on the block size of the hashing algorithm, takes that, and XORs it with two different paddings. So you're kind of doing both: you hash the key XORed with one pad followed by the message, and then you hash the key XORed with the other pad followed by that result. That actually makes the whole thing secure. So I just kind of want to show you one instance that's actually used in practice all the time. Like every, well, most websites, maybe not most, I'd say a decent number of websites use some form of this to store cookies on your machine such that you can't modify them without knowing the key. So I wanted to show a real-world implementation that achieves interesting security goals, and now I want to turn our discussion to something else. We talked about public key crypto systems, so what are some of the problems? We started with symmetric encryption and said, okay, this has clear weaknesses around exchanging keys. How do you exchange secret keys when you're using symmetric cryptography? It seems like a very difficult problem. So hey, here's this new thing that people created, public key cryptography, where now you never have to share a secret with somebody, but you can still send messages to each other that are encrypted and confidential, and you can get integrity by using hash functions, all of these nice things. So what are some of the problems in this public key crypto system that we talked about? If we step back, what are some of the assumptions anyone makes when talking about public key crypto? Secret keys will remain secret. That's definitely a problem. If the secret keys get out, it's hard to fix that problem, I'll say that. But actually, real-world implementations do have ways of dealing with that, with revocation lists. You can revoke your key and say, don't trust this key, I lost control of this key. What else? I was going to say that from the beginning we're kind of assuming that the key, that's the
public key that everyone can see, we know for sure that it belongs to that person and not some other person. Yeah, so the other assumption we made is that everyone knows everyone's public key. In our message attack we assumed that Bob would use Alice's public key to try to decrypt the hash, right? If Bob uses Eve's public key but thinks that it's Alice's, then we have a huge problem there. So what else? What other problems? Assuming that you can't brute force it, which is like assuming that you can't go backwards from the public key to the private key. Yeah, so those are all assumptions. And even with things like RSA, cryptography that relies on factoring, one of the hopes of quantum computing is that we'll be able to factor keys, factor numbers, without brute force. And so people are already working on, and already have, public key crypto systems that are quantum resistant, so even assuming you have a quantum computer, you won't be able to easily reverse that operation. I think those are things like lattice-based schemes; elliptic curve crypto, I believe, is not quantum resistant, but don't quote me on that. Okay, so you have those problems. What else? What about all the work we have to do to actually perform the cryptography? Yeah, so there are a lot of additional operations that need to be done, right? We talked about the efficiency of crypto systems. We already used hash functions to get around having to encrypt the entire message if all you want is non-repudiation, but that still doesn't help us if we want to confidentially send a file that's 4 gigabytes using public key crypto. We still need to think about that, right? So these are all kind of interesting problems that come up when we start actually looking at, okay, this is a cool idea in theory, right? We started with this idea of the box on the table, but then how do we actually use that idea in various systems? So one of the things we'll look at first is how to trust public keys. And again, we talked about this, so what if Eve replaces all the
public keys with her own? So now everybody's public key is actually Eve's public key. Now what can Eve do? Send messages on behalf of other people. Yeah, and not just send messages on behalf of other people: she can also perform a man-in-the-middle attack and essentially rewrite every message that's being sent. For instance, Alice takes message 1 and encrypts it with Eve's public key, which she thinks is Bob's, and gets some ciphertext. Eve can take that ciphertext, because it's encrypted with her public key, though Alice doesn't know that, decrypt it with her secret key, and get the message. She then uses Bob's real public key to encrypt a new message into ciphertext 2 and sends it to Bob. Bob decrypts it with his real private key and gets this different message. So we can see that all of the mechanisms we've been creating were predicated on this: can people know exactly whose keys are whose? So how do you do that? You're smart people, solve this problem.
One person that everybody trusts to know every single person's key. You have one person you trust to know everybody's. Who's that person going to be? Jesus. Mozilla. That's interesting, more than some others. Okay, so you could maybe centralize things. You could have one person that knows the identity of all of you in this class, which in some sense we're doing already, not with public or private keys, but you all have ASU IDs, and I can map and link who you are based on that information. What else? [inaudible] What's that? If you use a specific person's public key to encrypt something, they've got to, like, reply to the email. Right, so maybe out-of-band communication, where I can verify what your public key is. What else? We could have stored hashes of those public keys, and that way you can verify. Sure, but we're assuming that if she can change the public key, she can change the hash as well. Yeah, so we
can look at public keys and say, okay, we still need some way of securely transmitting them, right? What else do we actually need? [inaudible] Yes, but you don't have access to the private key. All you ever see is their public key. So think about me. Let's say you took this class and you want to send a secure message to me. How would you find my public key? Yeah, so it's kind of a bootstrapping problem, because Eve can always substitute her public key for what you think is my public key.
What if, in order to check that it was valid, you use the public key that you used to decrypt their message to encrypt a new message, then use your private key to encrypt the message again, then send it back to the original owner? The only way they would be able to decrypt it is if both keys match. So if they can decrypt the message you send back to them, you have the correct public key that matches their private key. I'd have to work through that math, but I think Eve could strip off what you signed with your private key, using your public key, and then substitute her own message that she signed with her private key, and fake you into believing her public key is your public key. That's fundamentally it, right? If you wanted to send me an encrypted message, you could go to my website and download my GPG key, but how do you actually know that's my key?
I was going to say, can we have like central hubs in the middle? I think this kind of exists somewhere. Basically there are these central hubs, say eight of them, and everybody only has to have the public keys for those eight. If we've all agreed that those eight are trustworthy, then those are the only ones where you have to worry about the keys not being spoofed. If you're sending to a central hub, they're just forwarding the message along, but everything that's signed by that hub is then known by
everybody. So it's interesting, it's kind of taking the centralization idea, where we have one trusted entity, and extending it a little bit: let's have N trusted entities and have them do stuff. I was thinking kind of the same thing about centralization, but instead of being a person, just a completely open list where everybody posts their public key, and you can't ever have any duplicates. You just keep populating the list. And Eve is real, real good: what she does is get in between your connection to the central service, so what you see is actually unique keys that she's generated for all of these people on their behalf. It's got to go a little deeper. It's too much work.
Yeah, I guess kind of on the centralization thing you were saying, only trusting the one who would know what the information should be. It's kind of like how you were saying, how do we get your information: we'd all trust ASU to relay your information securely to us. So what would these people have to do? Let's say we have this central organization. What do they actually check? You're trusting what they have, but how do they know? You could make the keys one-time use. Tricky. Yeah, I don't know, you still have the problem of how you then share them. Even if the public keys are only being used once, how do I know that you own that key? [inaudible] Yeah, so in some ways it makes it harder. For instance, if you get man-in-the-middled with a fake key, your next one will at least hopefully be good, assuming you're not being intercepted there too.
You could also just meet the person in person. You can meet the person in person, and then do what? Let's say you meet someone, they say they're [inaudible], and then they give you the public key. Assuming you already know them. Okay, so you can assume that you know them. Then what else? Government ID, maybe: you can check their driver's license. You can create a fake driver's license if
you're, well, nobody in here has experience with that, and be done. So the other interesting thing to think about is, let's say you meet me, and you've done your work to verify that I am who I say I am. So anyways, these are all different approaches that we can take, as we'll see in a second. There's this idea of delegation and centralization: having some central authority that you trust that's going to do this identity verification, this mapping. What does this key represent? Does it represent a person, a domain name, a company? What does that actually mean? And then somebody needs to do that verification. For instance, getting an SSL certificate, which we'll see in a second, requires you to prove that you own a domain, because you're getting a certificate that says it is valid for this specific domain. So you generate a public-private key pair, and somebody else certifies that you own this key.
Another approach: a lot of you are very distrustful. You don't want to put your trust in other organizations; you want more of a decentralized approach. This brings up a model called the web of trust, where essentially you can use the signing that we talked about, using hashing: I can sign your public key with my private key, and then you can use that to prove to other people that I trust you. So you can build this web: you verify their public key, and I trust you, therefore I trust that you verified this person's public key. Then, depending on how far that trust goes, you can actually start trusting people.
First, looking at public key infrastructure. The way this works on the web is you have a certificate authority that's responsible for verifying the identity of people who ask for certificates. We're talking specifically here about HTTPS, so when you get that nice green lock icon. Anyone can generate a public or
private key. I can generate a certificate that says I am Google.com. But the problem is again the exact same problem: when you go to Google.com and Google says, here's my public key, how do you actually know that it's Google and not me pretending to be Google? The way this happens is again with signatures. Google says, hey, here's my public key, and by the way, this has been signed cryptographically by a certificate authority that you trust. So if you trust that certificate authority, you'll trust that my public key is what I say it is. I, as an attacker, shouldn't be able to convince this certificate authority to give me a certificate on behalf of Google.com. You get this crazy hierarchy scenario where you have a couple of trusted entities that trust other entities, so you can have this hierarchy of keys being signed, and you can go up it to what's called the root of trust. Ultimately, though, the interesting thing is that your browser is the one that decides what to trust or not to trust.
So are public keys pretty static, or do they change every so often? Most of them will have an expiration date on them. Just in case of compromise, you want them to be refreshed every so often. So that's the basic idea there. And again there's the problem of revocation lists: what happens when somebody's private key is compromised? There's actually a really big problem that Google faces, because of the set of certificate authorities you trust. I think we talked about this a little bit. Who here uses a corporate laptop that's not theirs? Yeah, so some people. On your corporate laptop, your company can install their own certificate authority on your operating system and say this certificate authority is trusted, which gives them the ability to sign for any website you visit, which gives them the ability to man-in-the-middle all traffic that you're making out to the internet. Google doesn't want this, because they know exactly what key they should be using.
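To make the trust-store point concrete, here is a minimal sketch in Python. It is a toy and not how browsers or real certificates work: the RSA keys are built from small, well-known Mersenne primes with no padding, and every name in it (`Cert`, `issue`, `browser_accepts`, `PublicRootCA`, `CorpCA`) is made up for illustration. All it tries to show is that a browser accepts whatever a trusted authority has signed, so adding a company CA to the trust store makes a substituted key for google.com validate.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by every toy key below


def make_key(p, q):
    """Toy RSA key from two primes: returns (n, d). Needs Python 3.8+. NOT secure."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(data, n):
    """SHA-256 of the data, reduced so it fits under the modulus n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Cert:
    subject: str      # who the certificate is for, e.g. a domain name
    subject_key: int  # the subject's public key (here just a modulus)
    issuer: str       # name of the authority that signed it
    signature: int    # issuer's signature over subject and key


def cert_body(subject, subject_key):
    return f"{subject}|{subject_key}".encode()


def issue(subject, subject_key, issuer, issuer_key):
    """The issuer signs the hash of (subject, key) with its private exponent."""
    n, d = issuer_key
    sig = pow(digest(cert_body(subject, subject_key), n), d, n)
    return Cert(subject, subject_key, issuer, sig)


def browser_accepts(cert, trust_store):
    """Accept only if the issuer is in the trust store and the signature checks out."""
    n = trust_store.get(cert.issuer)
    if n is None:
        return False
    return pow(cert.signature, E, n) == digest(cert_body(cert.subject, cert.subject_key), n)


# Hypothetical authorities and keys, built from Mersenne primes.
PUBLIC_CA = make_key(2**61 - 1, 2**89 - 1)
CORP_CA = make_key(2**107 - 1, 2**127 - 1)
REAL_SITE_KEY = (2**31 - 1) * (2**19 - 1)
MITM_KEY = (2**17 - 1) * (2**13 - 1)

real_cert = issue("google.com", REAL_SITE_KEY, "PublicRootCA", PUBLIC_CA)
forged_cert = issue("google.com", MITM_KEY, "CorpCA", CORP_CA)

browser = {"PublicRootCA": PUBLIC_CA[0]}
print(browser_accepts(real_cert, browser))    # True
print(browser_accepts(forged_cert, browser))  # False: issuer is not trusted
browser["CorpCA"] = CORP_CA[0]                # the company installs its own CA
print(browser_accepts(forged_cert, browser))  # True: substituted key now validates
```

Nothing about the forged certificate changed between the last two checks; only the set of authorities the browser trusts did.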
They know exactly what their public key is, so they use what's called certificate pinning. One way to do this is through a header: the very first time you visit a website, it tells your browser, this is the hash of my public key, never accept anything else, it will not change. The problem is you can really break your website if you mess that up. So you need to manage a lot of different things: issuing certificates, managing revocation. You can also get into weird political problems with a certificate authority in a certain country, because browsers are mostly made in the US and other countries have their own certificate authorities, and maybe we trust or don't trust the governments of those countries not to create certificates that they're not supposed to.
The interesting thing here, as you've seen, is that there are actually different types of certificates depending on how much the certificate authority has validated you and verified your identity. The best improvement I've seen in this area: if anyone runs a website, check out Let's Encrypt. This is an organization that's jointly run by the EFF, where they will give you free certificates for your domain. Oftentimes these companies want to be paid to do this work of verification, and Let's Encrypt figured out a way to verify that you own a domain so they can generate a certificate for you. You'll get different visual indicators of your status, and these are constantly changing. You can see this from my website: it's just a Let's Encrypt certificate, just basic HTTPS. If you go to apple.com, you actually see a green lock with this "Apple Inc.", which tells you a little bit about the identity of that company. The idea is this company proved to their certificate authority that they are actually Apple Incorporated. You can see this in different browsers here. [inaudible] There's a problem somewhere in there. So I don't really understand the context of a website
having a public key. What does that mean? So this is what keeps your communication with a website confidential. Fundamentally they have the same problem. You go to a website and say, I want to see a page, and it's a negotiation process. You first say, hey, I want to see something on your page, and I support HTTPS. They say, okay, great, let's figure out what protocols we're using, and by the way, here's my public key that you can use to talk to me. But of course that public key could have been intercepted in the middle, and somebody else could be substituting their public key for Google's. So you want to make sure that all this communication you're doing with Google is actually with the real Google.com. If they man-in-the-middle your connection, they can impersonate you, they can alter the content that comes back from Google to make it appear like it came from Google, they can steal your cookies and log in as you, they can perform actions on your behalf on the website, all kinds of bad stuff. So fundamentally, when you see the lock icon, this HTTPS means that it's a confidential connection. The only thing somebody observing sees is that you're talking to Apple.com; they won't know the content of what you asked Apple.com.
There are interesting things that I'll just briefly mention. Like we talked about with encryption, the attacker actually does know the size of the message that you're sending, because that isn't encrypted. So you get interesting research that looks at, say, Netflix: they can figure out, just based on the size of your data from Netflix, what movie you're watching, even though it's 100% encrypted. So there's interesting stuff there. And the other approach to this, when we, yeah, please. So if there's a little lock, does it have that visual indicator for sure? Yep. So, sorry, say it again? Even if there is a visual indicator? I mean,
theoretically, yes, it's possible, if either the certificate's been stolen or there's a rogue certificate authority that generated that certificate on their behalf. If it's just regular HTTP, then anybody in the middle could have injected any content. If the certificate's been messed with, though, your browser will show a crazy red warning that says this is not the right certificate, and you have to click like five times to actually go through. If you ever set the clock wrong on your computer, this can really mess you up, because it uses the expiry date of the cert, and you can get weird stuff happening.
The idea behind the web of trust is a completely different model. Instead of having these trusted entities that identify users, we allow users to decide who to trust, and we force the users to do the verifying. The cool thing here is you can propagate trust. I can say, yeah, I trust you about 50%, and so the people that you trust, I trust less. You can get this really interesting graph of who trusts who.
So, interesting areas of crypto research. There are very interesting aspects of breaking crypto, like cryptanalysis. There's breaking the theory. This is what's behind MD5: the first attacks on MD5 reduced it from, I don't know, I'm just making numbers up, like 2 to the 64 brute force to 2 to the 63, and that is the stage where a hash function is considered broken and you need to move to a new hash function. Implementations always have problems. If you're interested in this area, I highly recommend learning more about breaking real crypto systems. There's a website, cryptopals.com, you can go visit. They have a series of challenges that take you through it: you actually develop the crypto system itself and then develop the attack on that crypto system. So you can see how, like I mentioned, you can brute force keys, or sorry, not brute force but
steal keys when you have a padding oracle attack. Super cool stuff. You can also come up with new types of crypto. There's all kinds of crazy things. Take homomorphic encryption. The idea is, let's say you want Google to store all of your emails encrypted, but we also like that we can search through all of our emails, which is very difficult to do if the emails are encrypted, because Google fundamentally should not be able to see the contents of those emails. Homomorphic encryption is a way to allow another party to compute on encrypted content, performing operations like search or indexing, without being able to read it.
Secure multi-party computation is a super interesting area. Let's say the two of us are health organizations, and we each have a bunch of health data about patients with a certain type of illness. We're rivals, so we don't want to share data, and we may be prevented from sharing data, but we want to compute some function across all of our data, like the average length of the illness. With secure multi-party computation we can actually do that and get a result, and the only thing that either of us learns is this result. There's all kinds of crazy cool crypto stuff. Any questions on this? [inaudible] Do those certificate pins also have expirations?
The pinning has an expiration. So you say, trust this for the next year, or whatever. Alright. So for this assignment you're going to go deep. What you've been doing up to this point is writing programs, breaking programs, breaking crypto, and now you're going to break each other. The whole idea is you will learn how the web of trust actually works by creating your own web of trust. You will learn about public key cryptography; about GPG specifically, which is a specific public key crypto system; about verifying identities; and about the web of trust. At a high level, the goals are these. Create a public-private key pair, that's what we've been talking about. You will then register your public key with the submission server. There's an upload form where you submit it, and that way everyone knows what keys are fair game. You will have to create a key with your real name. You'll upload it to the submission server, where it will be signed by our class key. You'll get that back so that you can prove to everyone else in class that this is your real key, because you need to have your key signed by at least 20 of your fellow students. There's 200 of you; this should not be a problem. And the trick is you need to avoid signing any fake keys. When each of you submits your key to be signed, we will generate you a fake key. This is called your adversarial key. Your goal is to trick people into signing your fake key, for extra credit.
Very important note. I need to make this 100% clear, and I put it in bold over and over again. As we talked about with public key crypto, keeping your secret key secret is important. Also, keeping your secret key at all is important, specifically because we signed your key. The course validates that you are actually a student in this course and that your key has the right name on it. This is how you can actually use that
as a bootstrap to verify identity, all that fun stuff. So the name part needs to be exact. And this means: do not lose your key pair. If you lose your key pair, you'll be stuck wherever you are in this assignment; you won't be able to progress any further. We can't have you regenerate keys and then re-sign those keys, because your keys were already signed by other people in the class. It's just a negative. So this is 100% on you. There are a lot of guides out there for using GPG to create public and private keys. Create a public and private key pair with the name exactly as it is in the ASU system. We actually already have this for all of you, so when you go to this page, oh sorry, I already uploaded mine, it will tell you exactly what your name is, which you use in the name part when you generate your key. The key also has an email; it doesn't matter what the email is, so you can choose whatever you want. And GPG keys have a comment field, so have no comment. Upload it to the submission site. I should have shown you how to do this. And an important point: do not lose your key pair. Back it up. Take whatever steps you need to secure and back up your stuff. Consider this the most precious thing you'll have for the next one and a half weeks. Just do not lose this key pair. I don't really care how you accomplish that.
Okay, another thing you'll need to figure out. This is the course's public key. Every key that's fair game in this class has been signed by the course's key, both real keys and fake keys. You can basically ignore everything that wasn't signed by this key. And you can verify that it has this exact fingerprint; this is the hash of the key, so you can see that it's actually this key. Now, as you can see here, once you upload your key, you can download your signed public key. So now that you have that signature that says my key was signed by the course submission server,
then you can start signing each other's keys. You also have your adversary public key and adversary private key that you can download. It'll have a random name and the same email as your key, and you'll have the public and private key, which we're storing for you. Okay, so your goal: 45 points for being signed by at least 20 fellow students. There's a lot of you. Talk to each other, sign keys. There are many approaches to this, and there's a lot of information online. Part of this assignment is going out and learning how to use these tools. Walk through the steps: create fake keys, sign your fake keys, so you understand how this process works. And the key part here: this needs to be 20 real keys. You don't necessarily know which are real keys, so it's up to you to decide, how do I know that the signatures on my key are from real keys? Finally, sign 20 of your fellow students' public keys. Again, this should be easy, signing each other's keys. Get your key signed at least 20 times, I would suggest more, and sign at least 20 people's keys. The key is that 10 points in this assignment are for not signing adversarial keys. So again, the web of trust model is actually verifying that people are who they say they are. I'm going to try to prevent myself from giving too many hints. I've run this assignment multiple times and I've seen the cool stuff that people do. So 10 keys will be 10 points, right? I'll figure out the exact function, roughly. It's usually pretty easy.
If you trick people into signing your adversarial key, you will earn extra credit. The amount of extra credit will be determined exactly at the end, because I don't know: if one of you gets 200, if you all decide to turn somebody into a super person and get 200 signatures on their key, I'm not going to give them plus 200 points on this assignment. It'll be roughly something that is fair based on how many people are signing adversarial keys. Finally, and this assignment is not actually up yet,
but if you submit your public key and your public adversarial key, you will see how many signatures you have on there. From there, it should be very fun.
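As a rough sketch of the bookkeeping in this assignment, the Python below counts which signatures on a key come from fair-game keys, that is, keys that themselves carry the class key's signature. Everything in it is hypothetical: the `CLASS_KEY` label, the short strings standing in for fingerprints, and the `signatures` mapping, which in practice you would have to build yourself from whatever GPG reports. Note that it cannot tell a real key from an adversarial one, since both are signed by the class key; that part is still on you.

```python
CLASS_KEY = "CLASS"  # stand-in label for the course key's fingerprint
REQUIRED = 20        # the assignment asks for at least 20 signatures


def fair_game(signatures):
    """Keys that carry the class key's signature (real and fake alike)."""
    return {key for key, signers in signatures.items() if CLASS_KEY in signers}


def counted_signers(key, signatures):
    """Signers of `key` that are themselves fair game, ignoring self-signatures."""
    valid = fair_game(signatures)
    return {s for s in signatures.get(key, set()) if s in valid and s != key}


def still_needed(key, signatures, required=REQUIRED):
    """How many more fair-game signatures `key` needs to reach the target."""
    return max(0, required - len(counted_signers(key, signatures)))


# Made-up example: each key maps to the set of keys that signed it.
signatures = {
    "alice": {"alice", CLASS_KEY, "bob", "mallory"},
    "bob": {"bob", CLASS_KEY, "alice"},
    "mallory": {"mallory"},  # never signed by the class key, so ignored
}

print(sorted(counted_signers("alice", signatures)))  # ['bob']
print(still_needed("alice", signatures))             # 19
```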