Let's get started today. The first thing we have to do: one of your fellow students wanted to invite you to a certificate authority, or some kind of scheme, to assist you in homework two, part three. They'll explain in a second. But before we do that, know that them doing this does not mean that I endorse, have reviewed, or otherwise am placing my trust in what they're doing. And since I'm giving these students this opportunity, if anybody else would like to come up with something similar, for whatever reasons they see fit, you can definitely have time to advertise your, let's call it a scheme. I like schemes; you can present your scheme on Tuesday. So just come up to me at the beginning of class and we can make this happen. So, you want to use the mic that way. So now that our scheme has been kind of introduced, I'd like to mention an opportunity that we thought of. So, Ken and I were talking about Web of Trust over the weekend and just thought, wouldn't it be nice to have a list of people in the class, and then, how do we get that, and how do we know to trust it? And we thought, well, we put our heads together and we actually drafted up a charter, and we have some policies. We said we should, as a community, provide the list. So basically, we have a thing called key confidence. We're saying that after class, a bunch of us should get together and just kind of do a public key signing. And also, you'll be able to get access to the list. And just as a community, we can all build a list of known trusted keys. All the list would do is give you the names of the people in the class. You need two forms of identification that we'll be checking, an ASU ID and a government ID. That way we know that you're really from this class.
And that way, when someone else sends you an email in the future and says, hey, sign my key, and you go, I don't know if this person's in the class, you can look it up, since you have read access to a list on Google Drive of authorized people. Or at the very least, we get 20 people up there and now we can get our 20 keys signed, and that's 50%. So it's kind of in our hands if we want to create a little authority of ourselves. But anyway, we have a charter, we have policies drafted up, and I'd love to show them to you after class. We're going to hopefully get one of those up there or maybe on the stairs. We'll see in a little bit. Oh, the third thing we need is social security and a bank account. So this is a very good opportunity for learning how to trust, or learning how to distrust, or learning how to verify and trust people. Cool. Any questions, not on that specifically, but on the homework? It's pretty straightforward, we're getting all the things worked out, should be good. I would highly recommend you start generating your keys and doing that now rather than the day before the assignment is due. I can already envision a mass of emails flying about. Verifying trust in a short timeframe is always difficult. Cool. So we've finished talking about cryptography. I guess I should say we've really just scratched the basics of cryptography, right? We've learned about different types and what we use cryptography for, but we didn't really get super deep into the weeds there, and that's deliberate. We will not be doing that, because this is an overview class and not a cryptography class. So the next topic: now that we've studied cryptography, we're going to actually circle back and study authentication. So the first question I have for you: what's the difference between authentication and authorization? Authentication is, like, letting me know... I'm saying I'm Richard, and am I Richard?
I claim to be Richard, that's authentication. Authorization is like the level of privilege that I have, right? Am I authorized to view this class? Right. So authorization would be basically everything we looked at and studied in access control, right? What can a person do? Authentication is: how do you verify that this person is who they say they are, or that this person is who they say they are on this website or on this system? So I think of it like: authentication is who are you, right? Trying to prove who you are. And then authorization is what can you do, right? So it's one of these tricky things. We talked about this slightly backwards, because in order to know what you can do, I first need to know who you are, right? But to get to authentication, we needed to do the crypto stuff first, I think, so that's why I organized it that way. Anyways, this is kind of circling back; we reuse a lot of these terms from authorization and access control. So the idea is there's a principal, so there is some unique entity, and the identity specifies that principal and is an internal representation of the entity. So you would be the principal, and the identity would be, let's say, your account on the submission server, right? That name is tied to your specific entity. And then a subject acts on behalf of an identity. We talked about this with processes: a process usually runs with the permissions of the user. And finally, using these terms, authentication really is, a little bit more formally, binding some identity to some subject. So we can say that yes, this person is associated with this login information. So how do we do this? How do we do this in the real world? And this gets back to kind of identity: how do you actually know that people are who they say they are? People have IDs. People have IDs, so what's that?
Driver's license, school ID, just a normal ID. So what does it have on it, though? How do you actually use that to try to verify that a person is who they say they are? Some kind of issuing authority. Does what? I mean, what's on an ID card that you look at? A picture and a name? What else do you look at? Say I just hand you a piece of cardboard or paper where I pasted a photo and I've written my name on it. Is that an ID? What's that? It depends on the institution. So what actually are you trying to validate there? I mean, is it just the fact that there's a picture and a name there? So one thing you're doing in any case is you're trying to see: does this picture match you, right? The person you see in front of you. There's also a third party that has verified more information about you and issued the ID, so that you can generally assume it's a valid ID. Right, so the reason why we use some ID rather than just a random piece of cardboard or paper, right, is the fact that we trust that some other entity has done this verification process of identifying you, the person, I don't know, what you look like, with your name. What else? I'm just going to say that it kind of depends on who the other people vouching for you are. It's like an American passport: it's the Secretary of State saying, well, this is who it is, versus a British passport. It's just different entities vouching for you at different levels. Interesting. So how do you get your driver's license or your ID? What do you need? Yeah, often another form of ID. It kind of depends; you may look this up at some point. Okay. So you can verify that somebody is who they say they are based on, let's say, a photo ID. But let's say I grow my hair out a lot and then I grow a beard, which I can't actually grow, but let's say I can grow a huge beard and I don't look anything like my picture. How would you know that it's me?
So I have to do that, which has one. I don't know that. I have to take my fingerprints and they'll be through a card. So how can you verify that, though? Well, let's say you wanted to use my fingerprints. How would you verify that I'm the same person? Eye color and stuff. If nothing too wild has gone on, you can generally verify one or two other aspects of the person. So you mean... okay, so you're talking about not necessarily the fingerprint, but back to the ID case. Well, wasn't the question, if you grow out your hair and grow a beard, how do you tell it's still you? Yes. Well, on a state ID, you have approximate height and also approximate hair color. So you could use a couple of other aspects, plus the possession of the ID, which you assume wasn't stolen, but it might be. Right. So yeah, okay, it's interesting. So there could be other metadata on this card that maybe helps us try to identify you even if the picture doesn't match 100%. Let's say, yeah, I mean, a common tactic is to then ask the date of birth or issue date, so it will require you to regurgitate some information from the ID. From the card that they just handed over. No, because that's why I think issue date is the one, but I was like, I don't know, I never know my issue date. So I had never asked that. Is there any other... I mean, think about, like, a crazy soap opera or something, right? Somebody comes back after 10 years. They don't have an ID or anything. How would you be able to verify that it's actually them? How would you verify me if I show up and claim to be Adam? I don't have any documents. I'm all disfigured, my face, from an accident or whatever. So you basically couldn't, probably, unless you have, maybe, like, fingerprints in a government database, but otherwise... Your fingerprints got burned off in the accident. Yeah.
So otherwise you'd have to, like, try and just find the preponderance of evidence. So, I don't know, like if you could name some facts about your childhood or whatever, basically check that your story lines up. Okay. So you had your thoughts over here. Oh, just DNA maybe. Interesting, DNA. Yeah. I was thinking the same thing as him, like a question, like the security questions you have online. Yeah. So what is that actually trying to prove? That you know something about this person, that you have some information that only you know. So what would you ask? Do you ask me what my favorite college football team is? Why or why not? Questions that are verifiable, where the answers have been established before. Verifiable. So that somebody can actually verify that. So somebody has to be able to say whether that was a correct answer or not. Yeah. And like anything like DNA, you would have had to submit your DNA prior to this. Yeah. And you would have had to, like, if there are other questions, they can't be verified by your own means. Right. Yeah. So those are definitely good points, right? So you need to have actually had DNA either stored somewhere, or there has to be some easily recoverable place that an expert can actually extract some usable DNA from. I don't know all the places where they could or could not extract that from. And then all these security questions, that's kind of how we think of them now. But even, you know, that's kind of just quizzing you on yourself. So, yeah. Why? I mean, what are we trying to do, authorize you to work for the DOD, or are we trying to give you a driver's license or anything? I mean, because depending on the context, it depends on how sure or not sure you want to be. I mean, there's no way to be absolutely sure.
I could take my friend's driver's license, a friend that I look kind of like, and go to the DMV and get a new driver's license issued in his name and sent to my address. There's no guarantee there. Yeah. Yeah. Two comments. I had an acquaintance in college that would buy booze with her sister's driver's license all the time. She was about 21. And we don't even have to go to the extreme of the soap opera. I'm sure that there are lots of transient people, or people currently experiencing homelessness, who maybe have varying degrees of paranoid schizophrenia, that live out of a duffel bag and don't have identification. Yeah. And when you think about that problem, where they need food, they need money, they need a job, how do you get a job? You have to file an I-9 to pay taxes and you have to have identification to do that. So there are constantly people, even in this city and this town, you can probably see them on the street, that are in this state of: I have no identification, and I think I know my name, and I need to collect the social security that I qualify for because I have a mental disability, and they can't because they have no identification. So it's a very real problem. Definitely. Yeah. So that's a great point. Yeah. I'm going to add that not all Americans have social security numbers. So not everyone has a social security number? I think... It's the Amish and whatnot, but it used to be the Amish, the New York Police Department and the New York firefighters, and, I remember, there are unions that said we don't want social security numbers, we don't want to deal with that because we don't believe in things like that. Yeah. So yeah, this is definitely fraught with all kinds of issues. Let's go back... I want to go back to asking you something that you, let's say, knew in childhood or something, right? So if you ask me, what's my favorite... I'll go back to the example, college football team.
So I guess the other thing is, when you're verifying somebody, you're trying to verify: are they actually who they say they are, or are they an imposter pretending to be somebody else? Right? So if you were to ask somebody who claimed to be me, hey, what's your favorite collegiate football team? If they're an imposter, what would they probably say? I would first guess UC Santa Barbara. They don't have a football team, so... Then I would guess whoever won the district championship the year that you graduated, around there, and then probably Google what the most popular team was, and that would be my third guess. You're forgetting, like... Yeah, ASU, right? Because I don't have any other teams. Yeah, I work here, but... You could say, like you're saying, none. You could say none? So you could say I don't have one? Yeah, you could... Or you could try to learn more about me. You could stalk me on different social networks and try to figure that out beforehand, before trying to be me. You could try to hack into my Gmail or my Dropbox to learn more information about me in order to get those questions and those answers. These are all really good. So now let's take it to kind of the computer realm. Not to pick on any site in particular, so we'll use the submission site, because I made it. So how does that submission site verify you? How does it actually tie that account back to you, the person? It asks you for an ASU ID number? It asks you for an ASU ID number before you register, right? So this isn't a foolproof system. Those are not public knowledge and they are considered sensitive information, so it's acceptable to use that. And nobody's complaining that somebody else signed up as them, so it's also fine. They also have to be unique. You would be able to tell if somebody else signed up with yours.
So after the ASU ID, that ties what to what? The username to the individual. The username to... ASU's record of the individual. ASU's record of the individual, yeah. What I would say is you can't really even tie it back, like, who actually verified it? You could be assuming somebody's identity or whatever. It's all kind of crazy. So that ties it there. So you register, you get an account, you have a username. And now you go to log in again because you need to submit homework two. But how does the website know that it's you? Does it remember you and go, hey, good to see you, I remember you from two months ago when you first signed in? Cool, but what does it do? I mean, what is the server doing? Right? So hey, most of the time I can't actually... Honestly, I do not remember how long the sessions last, but it should log you out after that interval. Is that correct? No. Okay, it didn't. Okay, let's assume you've logged out for security purposes and you go to the website, or you're on a different computer. Let's say that it's on a different computer. How do you prove to the site that you are that user account? Input your username and password, which does what? Where did that password come from? When? When you created the account. So that password is a piece of information that links that account to you, the person who created the account. Right? So you have these two pieces of information. The password links the person who created the account to the account, and then the ASU ID links that account with your ASU information. Right? And actually, this is kind of similar in some way to what we talked about with using maybe something from the past. Right? A password should be something that only you know. Right? You created that account. You're sharing some piece of information with the server.
And so the server basically says, hey, whenever you come back here, tell me this thing again and I will know that it's actually you, and not somebody pretending to be you. So this is one of the main authentication mechanisms that we deal with now. There are, like, four categories we'll talk about. The first, very broadly, is what you know. So I can try to authenticate you based on what you know. If I want to authenticate you as a student in the class, I would probably start, like we said, asking questions, right, interrogating you about what the homeworks were, what things we've talked about, and try to see if you're actually in the class or you're just somebody pretending, saying, yes, I am in this class, when you've never actually been to a class or watched any of the lectures. An ID card, is that something you know? It's something that you have that shows that you are... So if you've ever... That's not a good example. So yeah, the second category is something that you possess, that you have, right? So this is the ID card, right? You can always lose your ID card, you can get issued new ID cards, but when you have this ID card, you can give that to somebody and they can try to authenticate that, yes, you are the name on this card, and they'll do that through all the methods we talked about, right? Look at the picture, compare the picture with you, look at the other metadata on the card to try to compare that with the person they're talking to. They may quiz you with questions about the card. They may even try to swipe the card to see what data is there, or try to contact the card issuer to see if it's a valid card, which may get around the look-alike sister attack. They may ask you for other cards that you have that have the same name, right? So show me a credit card, even if it doesn't have a picture, with that same name on it, right?
Which can then try to verify that, yeah, you're showing essentially multiple things that you possess to try to prove that you are this person, right? Yeah. But it depends on where the card is from too. Yep. Because, like, a school ID, you can't use it for, like, going to the airport. Absolutely. Why not? Just because they can't verify that it's a source that can be trusted. Exactly. So yeah. And so that's definitely something to think about when you're doing the verification, right? Just because somebody hands you a card doesn't mean you automatically assume they are who they say they are, right? That goes back to the example of a cardboard ID card you make yourself, right? You can hand that to anybody, but whether they actually believe it or not is up to them, right? That's a good point. So are fingerprints either of these two? Or DNA, is DNA something that you know? Are your fingerprints something that you know? You can look at them, yes. I agree. Do you possess them? They're a trait. What was that? They're a trait. They're a trait? I think that's part of it. You don't really possess them, they're a part of you. Right. So, exactly. Yeah, so they're not necessarily something you possess. They're not something that you can just lose. You can't go get a new set of fingerprints or new DNA, at least not yet. So, yeah, the third category is essentially what you are, and things like fingerprint scanners, facial recognition, what we were talking about, DNA, or retina scanning, right? These are all things that are what you are. Voice recognition, I think, would probably fall under that, too. So again, just like the previous example, let's say they match. Does that mean that they definitely are that person? No? So let's think about voice, right?
If you're authenticating somebody because you know their voice, and somebody comes up to you and you can't see them, or whatever, it's dark, and they start speaking in my voice, would you know that that's definitely me? Why not? People can mimic other people's voices. They mimic other people's voices. You can record the voice. You could probably string together convincing sentences from everything I've said in my lectures on YouTube. You could probably take that and use that to create convincing sentences, which is scary now that I think about it, but whatever. So yeah, what you know, what you possess, what you are, these are the three main categories of things we think about that go into authentication. There's a fourth one that's actually really interesting. And that's... So I'm trying to think of a good example to give you. Okay, so... Is there a difference? I'm just thinking about, like, a CEO attack. So there's basically this type of fraud where people will either call or email, usually, like, a CFO or somebody in the financial department of an organization, and they will say: this is the CEO, I just closed this deal, this $2 million deal with this other company, this is going to be huge for us, it's going to make us tons of money. I'm jumping on a plane right now, but I need you to wire $2 million to this account so that we can close this deal, otherwise our company's going to go under. And so they'll either call or they'll send an email. And so one of the things that goes in there, right, is that now the person on the receiving end essentially has to try to authenticate this email or phone call or something, right? So is there any difference, let's say it's exactly the same voice and content, if it's somebody in person talking with you versus on the phone? Let's use a different example. Yeah, okay, this is too difficult, sorry. Where you are. So I was trying to get at, like, in person versus somebody who's calling you on a phone from far away.
I think another example would be, maybe... the caller ID is also tricky. So you would think of, like, a local number, like maybe it's the CEO's number that's calling you, versus an external foreign phone call coming in. That would tell you that they're out of the country. But the caller ID can be trivially spoofed, so this is not a good example. It'd be like giving your credit card number over the phone versus to somebody in person, that could be a good example. Yes, that is a good example. I'm a little more hesitant to give somebody a credit card over the phone than in person. That's a good example, that's great. And also Visa and MasterCard and the credit card companies feel the same way, because they will charge the merchants different overhead rates depending on whether they have in-person credit card swiping and/or signing versus doing it over the phone or just over internet transactions. Because they know that the fraud rate could be higher. So these are basically the ways we think about different types of authentication mechanisms. Can anybody come up with one that doesn't fit into one of these categories? Yeah, it's similar. I'd say, though, that from the website's perspective, it doesn't care or know that you're using a password manager, right? All it knows is that it gets the correct password back that it got before, so you're proving that you know something. You're using that website to prove to someone else... Right, right. So you can prove that I'm the person who owns avenuepay.com because I can change the content of the home page to be whatever I want. That would be... Yeah, I think that it's kind of a combination there, because you need some of what you know in order to access the site individually, but yeah, that's an interesting one.
I know there was something with keyboard rate, like how fast you typed and how you typed the passwords or the information in, so that would be kind of something between what you know and none of those categories. It's trying to get at what you are. So there's also some work in doing continuous authentication on your phone. So you take your phone, type in either your PIN or use your fingerprint to log into your phone, and then while you're using the phone it's actually keeping track of your gait, and it's been trained in the past on your gait, so that way if somebody else steals your phone and uses it, even while it's unlocked, it can lock itself because it doesn't think it's actually you. Various types of things. It's trying to get at what you are. It's trying to develop some unique signal just from you that... But it is tricky because, yeah, you can always change your gait, maybe, consciously, but unconsciously you shouldn't be able to do that. Similarly with typing: you can measure the time in between key presses on the keyboard, and that's roughly unique. Anything else?
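As a rough illustration of the keystroke-timing idea mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the mean-absolute-difference comparison, and the 0.05-second tolerance are hypothetical and not from the lecture, and real keystroke-dynamics systems use far more robust statistics.

```python
from statistics import mean


def inter_key_intervals(timestamps):
    """Return the gaps (in seconds) between consecutive key presses."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def timing_distance(enrolled, sample):
    """Mean absolute difference between two equal-length interval lists."""
    if len(enrolled) != len(sample):
        raise ValueError("interval lists must have the same length")
    return mean(abs(e - s) for e, s in zip(enrolled, sample))


def matches_profile(enrolled_timestamps, new_timestamps, tolerance=0.05):
    """Hypothetical check: accept if the typing rhythm is close to the enrolled one.

    The tolerance is an arbitrary illustrative value, not a recommended setting.
    """
    enrolled = inter_key_intervals(enrolled_timestamps)
    sample = inter_key_intervals(new_timestamps)
    return timing_distance(enrolled, sample) <= tolerance
```

For example, `matches_profile([0.00, 0.12, 0.30, 0.41], [0.00, 0.13, 0.29, 0.42])` accepts a rhythm close to the enrolled one, while a sample with very different gaps is rejected. This is a "what you are" style signal layered on top of the "what you know" password.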
So just like when we looked at cryptosystems, we're going to be thinking about authentication systems in kind of a more abstract manner, and this will actually help us then think about: how do you attack these? What's actually the attacker's goal? How do the different mechanisms that we're going to talk about, salting, hashing, whatever, actually change the authentication system? How can we think about this in a more rigorous way? A, C, F, L, S. I wish this was a good acronym, that would be cool. I think we'll think about it later. Probably not good to just shout out random acronyms. Okay, so A is a set of authentication information that proves the identity. This is similar to, let's say, the plaintext space or the key space, right? This information may be restricted in some way. Some systems will only accept, let's say, lowercase characters as a password. Or if you're setting up, let's say, a PIN code for your credit card or your debit card, that can only be numeric, so A, the set of authentication information, is all four-digit numeric PIN codes. Some authentication systems are super broken and, no matter how long a password you type in, will only use the first eight characters, so the space of A's you could possibly use there is a lot smaller too. C is the complementary information. This is some information that's stored by the system that's used to validate the authentication information. For now we can think of it as... you know, someone brought up a DNA example, so A would be your DNA, and C would be... Is anybody a biologist or in a bio program and taking this class, and knows how much data is stored in, like, a DNA sample from one person? I know it's a lot. I think it's a lot. It's not a small amount of information even when it's compressed, because you only have the four, whatever, I can't remember what they are off the top of my head, for DNA. Anyways, rather than storing all that information, what I may want to do is only store a subset of that and check that
against new samples, that that subset matches. I want to make sure that the subsets are relatively unique among people. But the idea is that, for C, I could just store the entire DNA sequence and then match that with whoever comes to me who claims that they're you and wants a DNA test. So C and A could be the same, or they can be different, or C is a restricted set. So F: F is a set of functions that can turn A's into C's. In this DNA case it would be the function that selects which gene sequences, I think is the correct term, which sequences it actually stores. I should not use examples where I don't actually know what I'm talking about, so I apologize to any biologists who are listening to this. And so L, then, are functions that verify the identity. These are functions that take in A and C and return either true or false. This is like the login function, or in our DNA case this would be the DNA testing lab: you give them some new A. Do you give them C? No, C comes from the system, right? C is stored on the system, so you only externally get A. The system takes C, the complementary information that you've stored, and will tell you yes, this is you, or no, it's not you. And I think you could make this a little bit more complicated, especially in the case of DNA, where you can have percentage matching, so you could change that, but for now we'll think of it in binary terms, either true or false. And S is really just for completeness: these are ways that you can either create a new A or create or alter your stored C, so this would be changing your password or something like that. Any questions here? So we'll go with a super simple password system. These are just examples, so we'll see kind of how this framework works. So, passwords stored in plaintext. What does that mean? So the authentication system here would be what? What would A be? A is a set, so describe the set A. It's a system-wide thing, in the sense that: what does the
system allow? Not a specific instantiation of the system. So if we go back, these are all sets. A is the set of possible authentication information that proves an identity. You as a user select some little a in A, and that is your authentication information. In the case of DNA, you have some little a in this big A, and that generates, with F... so f takes in that little a and produces some little c, which is an element in C, and that's your actual complementary information. And then L can take lowercase a and c and return either true or false. So what would be the set A in this case? Alphanumeric. Eight-character alphanumeric. Cool. So what about the set of complementary information? Are those the specific passwords? That's the set of little c's that have been chosen? No, that could be chosen. Could be chosen, yes. The same as A. So if you're storing the password, at most eight-character alphanumerics, then the sets obviously have to be the same. Cool. And maybe I think I just asked this, so we're talking about it in this sense, but couldn't we also talk about A in terms of, like, this system that we're talking about? We haven't talked about a username, but I would think that C, as the information stored on the system, would be a strict subset of A, the particular values that would actually let you into the system. Why is C not the actual bag of passwords, the subset of A that's actually used? So we're describing essentially all possible instantiations of a password system that uses plaintext passwords that are eight characters long, alphanumeric. This is all possible instances. So basically we're saying A is one through eight alphanumeric characters, and C is exactly the same, so those are what passwords and our complementary information are drawn from. If on the other hand we said we're using SHA-256 as our function that, as we'll see in a second, turns passwords into complementary information, then C is not all alphanumeric
characters; it's actually limited to 256-bit numbers, or whatever that is in hexadecimal. I don't know, but it's a hexadecimal hash. So it just defines kind of the space there. It's not talking about a specific system and saying, okay, we have this complementary information and these passwords and this is how we do this. But it's nice because then we can use, for instance, F here. So F is a singleton set, so there's only one element in that set. That's the function f, in this case the lowercase f, and that's the identity function, so it returns whatever the argument is. So this just maps every a to the exact same element in C. And then L, to test, is very easy: just do an equals. You test, is the a that you're given equal to the c that you have, and if so, then return true. And S is, in this case, just whatever functions set or change the password and create new users in your system. So this is a good model. So let's think about what we want from an authentication system. See, well, I mean, at a high level, what do we actually want, right? We're talking specifically now about something that you know, at least in this case, right? Or I guess it doesn't matter. But what do we actually want? Let's go back. What do we want from this authentication system? It's not access, there's no access, right? Access implies access control. Yeah, that it actually authenticates the correct person that's trying to authenticate. Right, so how does it do that, using some of these terms? Making sure that the information, right. So definitely. So one thing we want, as I think both of you are getting at a little bit, is more just functionality, right? We want to make sure that when I use the system, if I provide the same a that I originally enrolled in the system, or that I created a user with, that L will return true, right? If I come with the same a, it should authenticate me to the system. But what if I have an L
function that just always returns true? Which, you laugh, there's actually, I'll try to find this bug, but there was a bug in Dropbox, I think it's probably back in '08 or '09 or something like that. No, it doesn't matter. But they pushed some changes that caused all password checks to return true. So you could log in to anybody's Dropbox account just by putting in a different email address and whatever you wanted for the password, and it would return true, right? Which is just fine for a lot of people. Like, using a password manager that automatically injects your password, you would never know that it's actually doing this, or even if you're just logging in correctly, right, you'd never know this. I think somebody found it, because I think it was live for like a half hour or a couple of hours or something, until somebody figured it out, or knew they mistyped their password and then logged in and realized it still logged them in. And so then they logged out and started testing, and then told Dropbox, who fixed it immediately. So why don't we just take this L function and say we'll always return true? Because it's still functional, right? If I come back with a, I can log in, I'm in, right? L returns true, I'm good. It defeats the purpose. It defeats the what? It defeats the purpose. What kind of purpose?
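As a rough illustration of the framework so far, here is a minimal sketch of the plain-text password system, with the identity function as the only element of F and an equality test as L, next to a broken L that always returns true. All names here (`in_A`, `enroll`, `login`, the user `alice`, the sample passwords) are hypothetical and chosen only for this example.

```python
import string

# Assumption: A and C are both "1 to 8 alphanumeric characters",
# as in the plain-text example from the lecture.
ALPHANUMERIC = set(string.ascii_letters + string.digits)


def in_A(a: str) -> bool:
    """Membership test for the set A of allowed passwords."""
    return 1 <= len(a) <= 8 and set(a) <= ALPHANUMERIC


def f_identity(a: str) -> str:
    """The single element of F: store the password unchanged."""
    return a


def l_equals(a: str, c: str) -> bool:
    """L for the plain-text system: compare the given a with the stored c."""
    return a == c


def l_always_true(a: str, c: str) -> bool:
    """A broken L: 'functional' for the real user, but it accepts anything."""
    return True


def enroll(store: dict, user: str, a: str) -> None:
    """One element of S: create a user and store c = f(a)."""
    if not in_A(a):
        raise ValueError("password is not an element of A")
    store[user] = f_identity(a)


def login(store: dict, user: str, a: str, l) -> bool:
    """Look up the stored c for the user and ask L for a verdict."""
    return user in store and l(a, store[user])


store = {}
enroll(store, "alice", "hunter2")
print(login(store, "alice", "hunter2", l_equals))        # True
print(login(store, "alice", "wrong123", l_equals))       # False
print(login(store, "alice", "wrong123", l_always_true))  # True
```

Both versions of L let the real user in with the enrolled password; only the last call shows what the always-true version gives away.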
The purpose of the system. And everyone is authenticated all the time and everyone's happy, right? Yeah, so it's more about, like, a security, or at least authentication, requirement: that we want to eliminate a group of people that don't have the necessary information. Correct. So how would we then describe that requirement here in our system? L is non-trivial? No, you could write an L that's complicated but that still is not secure. L could randomly say true or false, right? That's not trivial. We want L to evaluate to true only when that specific a cross c happens, right, and to make sure that, I don't think one-to-one is the right word, but yeah. So it's what we've been talking about. Basically what we want is, when you register and you provide us with a lowercase a, that later on you're providing us with that same lowercase a, right? But we don't actually always store a, which is what the system says: we're storing c. So we're storing some transformation, let's say, of a to some other thing, and so we check if those things match, and then we say yes. So then, oh sorry, L returns true if and only if f of little a1 is equal to little c1. So if we run the f function on a little a, it produces as output the corresponding little c. Yeah, that's the way I would think about that. There's a whole bunch of other ways to formulate that. You could say, like, if L returns true for some a and c1, then it must be the case that that a is equal to a1, right? So that's saying, if you can get in with that a, that must have been the original a. If it's anything else, then you're violating that clause, right? If you could just type in gibberish and get in, then that would violate that. Cool. All right, so does this get us what we want from those cases? The two properties we just derived, you literally gave us the second one. So, functionality and security. So is it the case of the
system that if I walk in with an a, an a1, or I create an account with a1, and then I come back with that same a1 later, L will return true? Is it the case that if somebody else comes with some a prime that's not the same as my a1, L will return true? Tell me. I just said a prime is specifically anything, any element, that is not a1. And no. So it does satisfy our two basic requirements, right? It satisfies functionality and the basic, let's say, authentication requirement. Awesome. Cool. Let's look at one other one. We did some of them, and then we're going to talk about attacks, and then we're going to circle back to think about why these are actually different methods. Unix originally was one of the first systems to have this kind of standard hash function. And the idea was, just like what we talked about, the set of all passwords you could use for Unix was eight characters or less. I think the only other restriction was no null characters, so the null byte was not acceptable, but I believe everything else technically is. How you actually type those is interesting. And so the idea here is, rather than store the passwords in plain text, we're going to run some hash function on the password and store that as our c. So we actually are not storing the passwords themselves. We'll talk about that. We have this ID concatenated with an 11-character hash. So why do we want to use a hash, now that we've studied crypto? Or, let's say, what does using a hash mean? Yes. So assuming that we have a good, as we talked about, cryptographically secure hash function, if you're able to take some little c in there, you shouldn't be able to go backwards to find that a, right? That's kind of the premise of hash functions. That was one of the main properties of a hash. We'll talk about what this ID does, but you can basically think, and this is why we think about these things in
terms of sets, rather than f being a single function that does the hash, right? f is something that takes in a lowercase a and returns a c. In F there'll actually be 4096 different versions, and this two-character hash ID will tell us which version to use. L is the things that actually do the checking: the login function, su, any of these programs that will try to verify your password. They basically will take the a, run it through the algorithm f, and then verify whether the c that's stored is the same as the one that was generated from running on that new input. And S, changing that, is basically, you have all kinds of ways to change your password. So does this scheme satisfy the two requirements that we talked about? Does it satisfy the functionality requirement? Our other one, our more basic case, we said met the two requirements, right? So this is just a restricted version of our basic case, right? I don't know that I would argue that way, because just because something is based on something basic that you proved or demonstrated yourself doesn't mean you didn't introduce changes that completely break things. But we can use the same reasoning there, right? You can say, okay, if I register with some password a, which is a string of eight characters or less, the server will choose one of these 4096 functions f, and will then generate a c, which it will store. And then later on, if I give that same password a to the login program, it will look up c, c tells it what f to use, and it will run the algorithm and check that that c matches the one that came out of running f again. So it would say that yes, these two match. So that would verify that yes, when I actually use it with the same password, I get in. So what if I put in a different password a1 that is not a? Yes, exactly. So why is it possible? So it actually does fail this test that we just came up with. It is more or less that hash functions can map multiple passwords to the same, it has
to. It's only 11 characters, so you have a limited amount. It's just what we talked about, right? A hash takes an infinite-size space and shrinks it down to a fixed-size output. I don't know for certain whether, within the set of all strings you can make in 8 characters, there would be a collision, but you can assume that there will be some at some point. So it actually fails this basic test case. But how would you find one? So let's say I do find some a prime that hashes to that same value. Will the server let me in with something that is not the original password? It definitely will, because all it does is take whatever I give it, run it through the hash function, and then compare whether that hashes to the same thing as the original a. It actually doesn't matter that I don't know the original password. What matters is that the hashes match, and that's what the server uses. Does it use the c? Does it use the 2-char hash ID to look up which modified DES to use? Yes. Yeah, that's the only thing, so it can use that later. Or you can just think of it as, it stores, the modified DES is a hash, it's not symmetric encryption. The important thing here is that it's a hash, not that it's DES. They took that as something that can generate random stuff. This was way back in the day; we now have much better hash functions that we would use here. So it's not about the modified weirdness of DES. So let's say, it doesn't actually say, but let's say the hash is 11 bytes. How many passwords would you have to try to find that new a prime that hashes to the same value? 256 to the 8? Yeah, because, uh, for an 11-byte hash, would we do the 11?
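The hashed scheme discussed above can be sketched roughly as follows. This is not the historical modified-DES algorithm: SHA-256 from Python's `hashlib` stands in for the hash, the two-character ID is drawn from an assumed 64-symbol alphabet (64 squared is 4096 versions of f), and the names `f`, `l`, `new_id` and `find_other_password` are hypothetical. The digest is truncated to 11 bytes (22 hex characters) to mirror the 11-byte hash in the discussion, and to a single byte to show how quickly a different accepted password turns up when the output space is tiny.

```python
import hashlib
import itertools
import secrets
import string

# Assumption: 64 possible symbols per ID character, so 64**2 = 4096 functions in F.
ID_CHARS = string.ascii_letters + string.digits + "./"


def new_id() -> str:
    """Pick one of the 4096 versions of f at random."""
    return "".join(secrets.choice(ID_CHARS) for _ in range(2))


def f(hash_id: str, a: str, hex_len: int = 22) -> str:
    """One member of F: the two-character ID followed by a truncated hash of the password."""
    digest = hashlib.sha256((hash_id + a).encode()).hexdigest()[:hex_len]
    return hash_id + digest


def l(a: str, c: str, hex_len: int = 22) -> bool:
    """L: read the ID out of the stored c, rerun f on the given a, compare."""
    return f(c[:2], a, hex_len) == c


def find_other_password(c: str, original: str, hex_len: int):
    """Brute-force search for some a' != original that L still accepts."""
    alphabet = string.ascii_lowercase + string.digits
    for length in range(1, 9):
        for chars in itertools.product(alphabet, repeat=length):
            guess = "".join(chars)
            if guess != original and l(guess, c, hex_len):
                return guess
    return None


# Size of the space an attacker faces with an 11-byte hash.
print(256 ** 11 == 2 ** 88)

c_full = f(new_id(), "password")        # 11-byte hash, as in the lecture
print(l("password", c_full), l("passw0rd", c_full))

c_tiny = f(new_id(), "password", 2)     # 1-byte hash: only 256 possible values
other = find_other_password(c_tiny, "password", 2)
print(other, l(other, c_tiny, 2))
```

With the one-byte hash the search ends after a few hundred guesses on average, and the server accepts the result even though it is not the enrolled password.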
Oh, sorry, sorry, I was thinking the 8. Yeah, with the hash, so the length of the hash, so it's 256 to the 11, which is a fairly large value. And so this is kind of, then, when you analyze the security of this, you say, okay, so we know it doesn't provide us with perfect security, where the password definitely is the original, because we do know that an adversary could try 256 to the 11th tries to get something that hashes to that same hash. Of course, the trick is it has to be something in A, so that's another part there. But they could find something. So it doesn't provide this kind of perfect secrecy, but we know, if we're using a secure hash function, that they would likely have to try all of these possibilities before they get a collision, right? They would have to do, what we just said, 256 to the 11th. So at least then we know that's what the security of this is predicated on. So if we used a 2-byte hash, that would not be good, because they could do that easily. Or even a 1-byte hash value: it's highly likely that they'll find something very quickly. So how this works in practice is, the subject says, hey, I want to, sorry, it uses an S, this is the function to create a password, that says, hey, I would like to register with the system, I want the user account Alice, and the password is password. This is a bad example; we'll talk about why in a second. So what does the service provider then do? Store the password. So this is an example of the Unix standard hash function. Yeah, so it uses something from F. It then generates the name, so it always has to associate back to the identity, so there needs to be some way to map back there. And then it calculates the Unix hash of this password, which is here. So this is the lowercase a, and the lowercase c that gets generated from it. It stores this. Yeah, so that's very easy. So then it stores this in a file, which we'll see in a second, which I think we've already looked at a little bit. And then, hey, this is
Alice, right? She'll provide the password, and so the service provider will take the given password, the new a, hash it with the function, and verify that outcome against the stored c that it has on the server. If it's correct, return true; if not, return false. So I'm going to break this. I now want to think about this like an attacker. What is our goal as an attacker? Access the system as another user, right. So this, I think, is an important distinction, because as long as we can get onto the system as the other user, I actually don't care if I know their password or not. And we can express that in our language by just saying that we want to find some a, so some password in the password space, such that, for the f that's used, when we run f on that a, we get the same c, and c is associated with the entity that we want. There are usually two types of ways we can go about this. So think about this now from an attacker's perspective. What information does the attacker actually have? Probably the username. We'll touch on this a little bit, but yeah, definitely, we want to think, okay, the attacker knows the identity of, let's say, the person that they're trying to attack. But what else do they know? Think about the different situations. Maybe you use this, I have at least a vague idea what this is. They probably know, I would say, F. They would know F, right? They would know the set of functions that are used. Do they know the given c's? I guess that's a good question. Do they know the c's? Yes? When? Right, so let's say you're trying to break into my account on the submission server. Do you know the list of c's? No. Nobody's found one yet through four security courses, I don't think so. Not to say there's not. Do you know your hash, or any fellow students' hashes, from anyone else? Do you know what algorithm I'm using to hash your passwords? No. What do you have access to? How would you
know, how do you know when it matches? Right, so you have access to L, right? You have access to the login function, and that will take whatever password you give it and say yes, you're allowed, or no, you're not allowed. So there's two ways to think about this. Direct approach, indirect approach, online attack, offline attack: they're all different names for similar things. One way is the direct attack, which is the offline approach. So the attacker has the c value. So you have, like we said, the hash. You broke into my SQL database, you've dumped all the information, so you know all the usernames and all the hashes. Or, like what actually used to happen in Unix, all the hashes were stored in /etc/passwd. That's literally where the name came from, it's the password file. And so anyone could see those hashes; they weren't secret information. So in this approach the attacker has some c, and they want to find any a where running f on a gives you c. That's usually what the attacker's goal is. So there should be an indirect approach here. The indirect approach is, the attacker does not have c, but we know that they have access to L. They should always be able to try to log on, otherwise your users who are using the system can't actually log on. And the idea is, if you could tell whether it was an attacker, an imposter, or an authenticated user when they're trying to log in, then you've literally already authenticated them. So you can't make that determination, because that is what your authentication method is. This is actually a super important distinction to keep in mind, so it's not purely hypothetical. We'll talk about why that's important in a second. So these are kind of our high-level attacking goals, right? We think about the attacker, this is what the attacker wants to do, and we always need to consider these two cases. So how do we prevent attacks in general? That's the attacker's
goal. So maybe it'll help to think about, well, how would you do it? So let's say we have a direct approach. You're on an old-school Unix system in the 70s or 80s. It has an /etc/passwd file that has everyone's hash in it. How do you break that? That doesn't just give you the password. How do you do it? Brute force. Well, we have a hash, right? So that's where we just start running through every possible combination and seeing what that hashes to, and then when we find a match. How would you brute force it? You just try it. So let's say we're going to guess it's all alphanumeric characters. You start with a string of length, let's say, 5, or whatever length you want to do, and then you start with all lowercase a's. You try that: hash it, compare. Then you change the last one to b: hash it, compare. Change the last one to c: hash it, compare. For a strict brute force, yeah, that works. But a better, I don't know if you'd call it more of an intelligent brute force, is to try to find common passwords, or like what we thought of earlier with what interests you had. Try typing in "password" or "admin", hashed. So that could be our way. That's not the best way, I'm sure. Yeah. So, yes, one thing would be, if you were able to actually break the underlying hash function, then you could maybe try it that way. A much easier approach: you always need to remember these are humans creating passwords, most times, unless you're using a password generator to automatically generate that password. But fundamentally there must be some password that a user created, because they need to remember that when they come back. This isn't something we necessarily talk about here. So for a given c, we could try, let's say, the top 10,000 or 100,000 most common passwords. Yeah, that would definitely work. So let's simplify this version. Actually, the original version, I believe, did not have this two-character hash ID, so it just used,
let's say, MD5 or some hash function. So now the idea is, you have not only just some c that you're interested in, you actually have a list of all the c's on the system. You have every single hash on the system. So if you just want to break into any account, all you need to do is run your password cracker, run your brute-force algorithm, generating common passwords, maybe try English words, hash them, and compare those hashes with the list of all the users on the system. When you get a match, that's when you know you've found something and you've broken that password. So how can we prevent this? Is it impossible? Limit the amount of password attempts with, like, a time variable. So where would you do that in our system? In our system, it would be at the access function. Would that be S, the manipulation function, or the login, L? So we could have L be such that, when you have an unsuccessful login, you actually add some delay, right? So would this help in the case that we actually have access, so that you can see the hashes? It would help in the case of the websites. So you've probably noticed this before. I think on most systems, when you type in your password incorrectly to log in to your computer, it'll usually delay, and I think some systems will have an exponentially longer delay. How do I tell this story about the phone? So you can also set that up with your phone, so that when you put in an incorrect passcode, it will lock you out for, I think, first 30 seconds and then a minute. So I had a friend who thought it was hilarious to do this to people, to lock them out of their own phones. So he was doing this on somebody's phone, putting in the incorrect PIN code, locked it out for 5 minutes, and then did it one more time, and then the group policy kicked in to delete the entire contents of the phone. So the phone started wiping, and he felt really bad. Luckily the person had everything backed up, I think. So I think it was a good story, but maybe I should
check. I don't know, I think they're still friends. But yeah, he no longer does that to people, because it's not funny, because you can actually set up policies like that on your phone and wipe the phone after a certain number of tries. So that way, if somebody wants to try to brute force it, they have 5 tries. Oh, I should bring my device. So I actually bought a device for older iPhones to try to crack the PIN using that interface. You also hook it up to something, I think it's on the battery or on the reset or something, and so before the phone actually registers that you input the PIN incorrectly, it resets the phone, so it doesn't actually store the fact that you made an attempt. And so it just keeps doing this. I think it does a 4-digit PIN in like 3 or 4 days or something. Anyway, it's really cool. So when we come back, we'll talk about preventing attacks. If anybody, once again, maybe wants to make any announcements, for good or evil, at the class on Tuesday, feel free.
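The offline attack described earlier in this lecture, where the attacker holds every stored hash and tries common passwords against the whole list at once, might look roughly like the sketch below. It assumes the simplified scheme without the two-character ID, uses SHA-256 from `hashlib` as a stand-in for "some hash function", and the user names, passwords and the short candidate list are made up for illustration.

```python
import hashlib


def h(a: str) -> str:
    """Stand-in for the single hash function f in the simplified scheme."""
    return hashlib.sha256(a.encode()).hexdigest()


def dictionary_attack(stored: dict, candidates) -> dict:
    """Hash each candidate once and compare it against every stored c.

    `stored` maps user name to hash, like a dumped password file.
    Returns a mapping of user name to a password that L would accept.
    """
    users_by_hash = {}
    for user, c in stored.items():
        users_by_hash.setdefault(c, []).append(user)

    cracked = {}
    for guess in candidates:
        # One hash computation is checked against all accounts at once.
        for user in users_by_hash.get(h(guess), []):
            cracked[user] = guess
    return cracked


# Hypothetical dumped password file.
stored = {
    "alice": h("password"),
    "bob": h("x9Qv2LrT"),
    "carol": h("admin"),
}

# Hypothetical, very short list of common passwords.
common = ["123456", "password", "qwerty", "admin"]

print(dictionary_attack(stored, common))
```

Because the attacker never calls the login function here, a delay or lockout added to L does not slow this loop down; it only limits guesses made through L itself.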