 All right cool. So we are carrying on a little bit of logistic updates. In a week there's the midterm. Everyone knows you talked about it last week. And on Tuesday Ferris is going to do a review session. So we'll release a practice midterm and Ferris is going to go over in class on Tuesday. Then you have the midterm on Thursday. Yes. When will the practice midterm be released before Tuesday so we can work on it before the review session or we'll be released on Tuesday. Well I was going to let you decide. No way you can have the wrath depending on your answer. Okay we'll decide. I don't know. Any other questions? Yeah. Everything that we're covering right now. That's till today. End of today. The beginning of the semester our very first day together. Can we expect the review exam to be similar to the real one or is there no guarantee of that. Similar in what aspect. I mean it'll be in English. Topics covered. No I would not say that. Anything that we've talked about from the beginning. So you guys got to think like me. What's the point of having midterms to make you suffer. Yes. Let's go read. To just if you know we're following along. Not even the answer. Less of knowing if you follow along. To me it's more about have you like do you understand the content in the course that we've covered right. That's the whole point we meet here every week. You're paying for class. Like the whole point is to understand the content right. Ideally I could just pick your brain and point a little laser and figure out that yeah you know everything in the course. You're good to go. But I can't realistically if you think about as a set of all topics that we've covered. I can't realistically an hour and 15 minutes test you in depth on every single topic. Right. So I have to limit in some sense what I test on. But if I tell you what I'm going to test on then you'll only study those things and they'll be huge goals in your knowledge. 
So the goal is that I don't tell you what I'm going to test on, so that you study everything. You will come super prepared, ready for any question, and then I will test on certain things, and that's how it's going to go. Cool. All right, on to authentication. So can someone remind us: what is authentication, and when did we talk about it before? Access control lists. In what sense? What does it mean? What does authentication mean? Are you who you say you are. And how does that differ from the other concept that's related to access control? Yeah. So authentication says who are you, and authorization says what are you allowed to do on the system. So is authentication important? Can you authorize people if you don't know who they are? Yeah, what's the point, right? You're ultimately just making stuff up. Or if you just trust that they are who they say they are, say, the administrator on the system, then you might as well not have any authorization rules in the first place. Cool. So clearly access control and authentication are really linked. I like to talk about it after we talk about crypto, because a lot of the concepts of how you do authentication rely on crypto concepts, which is why we're talking about it now. Cool. There you go. Boom. You all just did that. OK. So we're going to define some terms. When we talk about authentication, what are the important parts? We have a high-level goal of "who are you," but what's actually important? Clarification? And what's that? For me, ease of use. Why is that important? I don't want to provide three forms of documentation just to log into my computer. Let's think about it this way: in terms of identity, what are we actually talking about? When you authenticate to, let's say, a computer system, what are you actually proving or demonstrating? Yeah. Possibly. I mean, maybe, but is it...
I've forgotten all your names, sorry. I'll use Ferris; I know him. When Ferris logs onto a website, do they say: you're Ferris, you're from Turkey, you are a PhD student at ASU? Do they know his identity, like his physical identity? Maybe, but not necessarily. Maybe, but not necessarily. Why? Because they only know the data that's tied to his identity, right? Like, relevant to that site, to that specific system. Exactly, right. So when we talk about identity and identifying and asking the question "who are you," it's not in terms of a philosophical or personal who you are, but who you are in terms of the system. And why is that important to understand? It's what you're authorized to do on that system. Right, because that's why they're tied together. Your identity on the system defines what you can do in terms of that authorization. And so authenticating to the system is all on the system's terms. They don't care about your background. They just want to say who you are in some sense that the system actually cares about. So we'll go over these at kind of a high level. But I guess another question to pull us back a little bit. A principal, we'll call it, is a unique entity. So I'm a principal, you're all principals in terms of authentication. Does each of us have a unique identity on every system? Hopefully. Hopefully? Why? Do we need one on every system? No. Why? You're just logging on, or you just go on a clip to play some video games. You don't care that your scores are saved. You don't need it. They don't need to know who you are. Right. So in that sense, maybe there's no concept of authentication because we don't even care about authorization at all. What else? On every system you ever use, does every person or principal have a unique identity? Shared accounts. You guys never use shared accounts? So everyone definitely has their own Netflix account.
And even more so, think about systems where we talked about role-based access control. Maybe for certain systems the admins share the same password. So if you think about the root password, maybe every admin shares that password. So they're all different principals, but they all have the same identity on the system. So when we talk about identity, we're going to talk about it in terms of the system. And the subject. This just helps us think about it in abstract terms: something that acts on behalf of an entity. So on your computer system, what things act on behalf of you? Autocorrect. Okay, yes, that's good. What else? A system process. Yeah, a system process acts on behalf of you, your user, to do something. And then authentication really just binds an identity to a subject, saying that this subject can act on behalf of this identity. So I think I've alluded to this before, but now we're going to have an in-depth discussion, and this actually ties in very well with your homework assignment. How do you know people are who they say they are? This is the core problem. DNA sequencing. How do you actually do that, though? Can you tell who I am just from my DNA? I could. It's not practical at all. But how could you? So let's say I gave you a sample of my DNA. Do you know that that's my DNA? How do you know that? If I collected it myself. If you collected it yourself? I'm going to pump somebody else's blood in me. Usually a direct interface, like a direct exchange with a person, or you go through a trusted third party. So a direct exchange with the person. So define that precisely. Like in the real world? Yes. So like, I watch you draw your blood and give it to me. Yeah, but it just came from my body. You don't know that it actually was my blood. Yeah. You're going to blindly trust me? Thank you, I'll see you later. If you repeatedly show up and provide a DNA sample every single day, at the end of every single day...
The trust we have in you. And the fact that it's your DNA. Okay. So maybe you have some tests and the trust in me over time. You've seen the movie. This is the exact concept here. It's why I'm thinking the opposite approach. Yeah. You could have like some sort of like government like database to like match with. Like all of these guys. If these guys say they're good then we can say that they're good. They're good in what sense. We're not talking about good. We're talking about who they are. Good is in like. They're up in. Like. I mean there's always going to be some. Percent chance that you don't know who that someone is who they say they are even like. You know. Obviously with digital transactions or with something that you've never met. That's pretty high. But even in person like. It's not impossible for like. I don't know. Someone replace Jason and like got plastic surgery to look like him and like. Fall break is a perfect opportunity. Yeah. So. You know there's things like oh we're going to meet in person and exchange our keys is like pretty good. And it definitely decreases your like. Margin of mistrust but I would say don't resort to plastic surgery to win the Web of Trust Assignments. Advise against. Okay. So bring it down though. So how. So okay. So one thing you said is DNA. Right. What are other ways you could identify me and how do you actually. But so what I was trying to do that. If you just look a piece of my DNA. Does my DNA actually say this is Adam DuPay. No. What does it say. Says this is the DNA that I plucked from your body on this day at this time in this place. Right. And then how do you then later authenticate me that I was that same person that you took that DNA from more DNA. You have to take more DNA and then compare it with a database of what you've already known to see that it's me. What other aspects of this come up that doesn't involve taking something from me. A password. A password. Yes. So in what sense. So elaborate. 
I mean it's kind of the same idea and like we all do it whenever we create accounts. Like you say hey I'm me. Here's my username. Here's my password. That's like the initial DNA sample. And then every time you go to log on after you provide another sample which would be the password. All right. Good. Sort of like this in a second. Yeah. What are some other ways. If I knew Ferris before this class I trusted Ferris. You know I could turn to him and go like is that really Adam standing up there. Is it just a random guy. I see. So you can use like the web of trust kind of thing and ask people you trust to see if they trust me. So what else. How do you get into your cell phones. Password. Fingerprint. Fingerprint. So it's a fingerprint like a fingerprint or non unique. Or with current databases. Right. So fingerprints. What else. Which is. Face is face recognition. What's the difference between face recognition fingerprints DNA and a password. You can change the password. So you can change the password. Have you changed the password before. Yes. Have you changed your DNA recently. Yeah. In the back. So a password is something you know. Those other things are something that you are. Yeah. So a password would be something that you know. So some kind of you talk to somebody. You either give them a secret phrase to give you a secret phrase and later you could verify that they're the same person because they tell you that. Yeah. The first two are actually like reliably verifiable while the second two aren't in which can you define which two I've lost. So like fingerprints with the systems we have are unique. There's like a case where a guy got accused of bombing a train in Spain because his fingerprint happened to match and he was like I've been in. I think it was like New Hampshire for like the last five years or something. And their fingerprints were a perfect match in the system because of how that works. 
And then everyone reads the articles about how much face recognition screws up. That's not perfect. The first time they came out with that, especially on computers, all you needed to beat the face recognition was a picture of the person. Then they tried to do liveness detection, to see whether it was somebody alive or a piece of paper. And specifically what they looked for is whether the eyes move. So then what people did is they took a piece of paper, cut out the eyes, and made little slits so their own eyes could move behind it, and it would log them into the system. So for all these methods, we can think about them in different ways: what are their innate characteristics, but also how useful are they for authenticating a person? So what's a false positive in this case? Well, a false negative would be the system saying that you're not who you say you are when you actually are, and so denying you access. On old iPhones, I think, if your fingers were sweaty, the fingerprint reader would not work very well and would not let you in. The other way, a false positive, would be if they let you in when you're not that identity on the system. So that would be any of you, if somebody in here had the same thumbprint as me and was able to log into my phone. I don't know what the odds of that are. But fundamentally, when we think about authentication mechanisms, we want to separate them into different categories, at least at a high level. And this goes back to: a password is something that you know. It's a unique piece of information that you know. Then your DNA, your fingerprints, and, I've actually never had it done, but retinal scans. Those are actually real things, where they can scan the inside of your retina and use that as an authentication mechanism.
And geometry, was it geometry as well, geometry like of your API, like you put your hand into a scanner and you get hand geometry. Oh, interesting. So they can do some kind of hand print in some sense. Voice recognition systems? What's like the most common voice recognition systems? Yeah, you have been in your homes on your phones, right? Siri, a Google all that stuff. It's supposed to not work. I mean, it's supposed to be trained on your voice. So other people can't use it. That does not work very well. As I'm sure you're all right. So these are all things that you are. And so the key difference here is you can't change something that you are, I mean, ideally, right? Beyond what we talked about of crazy measures to actually change all these features. The only way that you've ever identified or authorized people by a secret password or testing their DNA, an ID. So an ID, what's an ID useful? Yeah, so that you're verifying that. So if I showed you my ID, that would mean that what would that mean? I showed you my ID. So you've met the picture with me, the person in person, and you probably match the name with what you think my name is. Does that mean I'm definitely by saying? Yeah, it also depends on like what ID you're showing us for. Like if you walked in here, and you just pulled out your driver's license and said, I'm having to pay, that doesn't matter because it's your driver's license, not your ASU ID. You're still here to teach a class not to drive us around. Yes, very true. So yeah, so then maybe the context is important there. What IDs you're willing to trust? Is anybody from a state that doesn't have the whatever national ID requirements or something? No, there's some states where their ID systems don't match federal requirements. So if you're going to TSA or going to a government agency, you need to have your passport to authenticate to them because they don't trust your state issued ID license. So think about another way. 
So one way to authenticate somebody when you meet a person would be, like we talked about on the web, some kind of password, which is some kind of knowledge. But what if I gave you a unique object, let's say? That way, every time you saw me: "I am so-and-so, and here's my object." And I would say, "Oh yeah, that's the thing that I gave you earlier." The analogy is a little weird. Another thing would be a cell phone, if I can somehow uniquely identify your cell phone, and I can say, "Oh yeah, that's your cell phone." So these are the three main categories we think about. What you know: some information that you know that supposedly nobody else knows. What you possess: so this would be something like, does anybody use, students don't have to use it yet, two-factor authentication for ASU? Any employees? Yeah, so you have to. And what does that verify? It's a factor showing that you own the device associated with that two-factor authentication, right? That you have the device that is used in that authentication mechanism. So does that mean you don't have to provide a password? No, but if somebody steals your password or guesses it, they don't possess your phone, and so they shouldn't be able to access your systems. Cool. And the third category is what you are. So that's biometrics, those kinds of things that should be very difficult to change. Any other examples of these three that we didn't talk about? Face ID. Face ID, which of these would that be? Kind of like biometrics, I guess. Yeah, so it'd be what you are. Yeah? Challenge questions, like first pet's name. Yeah, I think I would agree those are mostly what-you-knows. It depends, I guess, on whether you were the one to originate the answer to that question and then later on gave it back. In that case, it is a what-you-know. Anybody done a credit report or anything, and gotten authenticated that way?
What do they do? Right, exactly. That's why in that sense, I'd say it's a little bit trickier to precisely categorize that because you can't go back and change those things. But they are should be something that you know. But you can't change your past addresses. That's like the hardest one. I would have to go to Amazon and look at where I've shipped things to see if I've lived in these places. Yeah, so like, would a driver's license fit into what you possess? Because it's technically yours, but the government issues it to you. So you're kind of authenticating by other government. What do you all think? Yeah, it's what you possess, because it's like the government is giving you this object. And then, and then you can give that to somebody else, and they should be able to verify that you are that. Yeah, I probably put that there. Yeah. So back to the like, credit card thing that you gave an example of. So wouldn't that be like what you know, because like someone else could, like, theoretically, just use that same information? Because like what you are kind of saying, like, something that only you could provide? Yeah, it's very tricky. It's more like a but the other key difference between the two is change of like, can you change that thing? And you can't change what addresses you've lived out of the past, right? Those are a fixed quantity. It's more like almost assessing what you were, which is weird. That's when we can add. Yeah, maybe there you go. Let's write a paper on it. Yeah, what other people about you? What other people know about you? In what sense? So going back to the earlier thing, kind of also the government but also like other people around five really trust Ferris, I can go back to him and say, you know, is this person is that true? Is that true? Right. So then you could get other people to vouch for you. So you could. Yeah, so I think this happened to me in the brickyard at least once where I got locked. I didn't have my ASU ID card. 
So I couldn't get in the building after hours. But I'm like, I'm a professor here. I have an office. And so they escorted me up to my office. And I was like, look, my name, my like, I am, this is my office. So yeah, they led me in because of that. And that was kind of more of a using different objects to try to demonstrate who you are. Yeah, kind of on that topic. So like my friend went into the army or something. And they did like a huge background check. And so some guy like contacted me. I had no idea who he was. But he was like, do you know this guy? Like, what has he done? Like, who is he? So that's kind of like, what, like what other people know, right about the person. And they're matching that with what they told them to see if it's if it's the same thing got a number of background checks for students getting clear. So I have like that. So when you lose that number that it's like to ask you to look for your friend. There's another option of getting like sending to your friend. So they they input that number, send it to your account and then that will unlock your account. Interesting. So that would maybe be, yeah, I think that would be like a web of trust style of what other people know. It's you also weird one Facebook used to do this where if it saw us with suspicious login, it would show you pictures and say which ones are your friends like match them up to the friends, presumably because you would be able to do that, but a random person wouldn't be able to. What other things? Maybe ever mark street signs in a picture? Yeah, what's that for? Not robot. Yeah, so you're authenticating yourself as a human in some sense. You, they have turned the question of are you a robot or not into a way to help them. But yes, fundamentally, you fail that they'll say you're an automated go away. So they're actually are authorizing authenticating you in the sense of human or not. Yeah. What? Why can't robots do that? Like, we have robots that can recognize street signs. 
I think that's Google's purpose there: it's learning how to recognize things. There was definitely a paper that used Google's own image recognition software to break their own CAPTCHA system. It's more about the effort level, I think, in terms of making it more difficult for people to do that. I mean, we'll talk about CAPTCHAs later. That's a whole subject, whether they're good or not. Yes, so there are CAPTCHA-solving services. So everyone's familiar with the CAPTCHA; that's the thing that asks whether you're automated or a person. So how do you solve that? Say you're a bad guy who wants to bypass this for whatever reason. What are your options? Write a program to solve it: you could write some complex AI that solves it, or use some deep learning. Yeah? There's also the CAPTCHA where you literally just check a box that says you're not a robot. I don't know the software in place behind that, like what prevents other software from just checking that box. So it's running complicated checks of your browser environment to see if you're a real browser, or headless Selenium, or just an automated script, or whatever. So it does some stuff. But that's not to say that it couldn't be bypassed. What's your other option? You could write a program to do it, or you could pay someone to do it, right? And this is actually how a lot of the CAPTCHA-breaking services work: they just farm it out to people who will do it for pennies. Or the other way is, let's say you go to a site that has content that you really want to see, say, streaming some video of something that's not available in your area, and they pop up CAPTCHAs to "make sure that you're a human." But you don't know that that's actually a CAPTCHA from somebody else's site that they're trying to break or automate. So basically they're farming the CAPTCHAs out to other people and then feeding the results back into the automated system.
So yeah, those are ways that you can break that. Cool. Okay, so, oh, another one we didn't talk about is where you are. So what would that be? Yeah. So if I check my credit card history and there's an awful lot of charges for gas up in Oregon, and I've never been to Oregon. That would kind of be a flag. Or you didn't have any flights purchased to Oregon, and there was no rental car purchase in Oregon, right? So they can try to infer your location based on that. Yeah. This happens all the time, right? When you're traveling and try to access Gmail, it's like, ah, that's a little sketchy; this is somebody now in a completely different location. Yeah? Work network. Or, I mean, there's maybe "when are you" as well: they can say, are you accessing this at 3 a.m. your time? Okay. So a lot of companies restrict IP access to make sure that only certain IP ranges can access certain things. Yeah. So this is actually the main way that public Wi-Fi networks that require registration authenticate you: with your MAC address. And, you know, you can change your MAC address. So if you want internet on your phone and your computer, and you change their MAC addresses to one that you've already paid for, then you'll have internet on both, but you can't use them at the same time; otherwise that would cause massive havoc. But you can definitely do that. Cool. All right. So again, we're going to think about an authentication system conceptually, so that we can then dig in and be able to discuss what we mean by an authentication system. So we'll call A some kind of information that proves identity. So what were some examples of information that proves identity in what we've talked about so far? Passwords, government ID, retinal scan, knowledge of your past. Yeah, all these kinds of things. Objects. Objects.
Yeah, so objects that we have and show. The next one we'll call C, and this actually makes sense when we think about the retinal scan we talked about. Can a retinal scan authenticate you just on your retina alone, the scan of your retina? What else does it need? It needs to match it with something else, right? It needs to have some complementary data, collected in the past, about who you are. Maybe you get that from a government system. Maybe it adds you to that information the first time that it sees you, whatever. So we'll call it C: some complementary information that's used to identify who you are. Then F will be a function that maps the information you give to the complementary information. So for a retinal scan, this would be the database lookup: looking up the complementary information in the database based on the retinal scan. L will actually do the authentication. So you need some function to actually give you a yes or no: is this person authenticated? What should the type of this function be, based on the types we've already discussed? That's too low level. This is a high-level authentication system that can model both password-based hashing and retina scans. I was gonna say it should be like a one-to-one function, a one-to-one function that maps... Yeah, anyways, that's probably more of what I would do. But, well, I guess we don't have any information here about what we're trying to authenticate. We'll say that A kind of has that claim in there. So we'll say it returns true or false. Yeah? It's kind of a question, it's not super related. But for biometric authentication systems, is all of that data still hashed to be compared, or is it stored raw? Usually, I wouldn't call it hashed.
Like, you can't just get on their database and get raw retinal data from everyone in their system, can you? Well, if you're in their system and you wanted that, I would wait and capture all of the authentication scans and store them, for everyone who's authenticating to the system. You're right, though. I wouldn't call it a hash. I definitely wouldn't call it a cryptographic hash, because I don't know that it's completely irreversible and has all the properties we want. I'd call it more of a fingerprint, where they take the retinal scan, figure out what the important points are using some algorithm, and store that information. And then when a new one comes in, they do the same transformation and compare the features against each other. Is there a reason they don't use a cryptographic hash for it? Because of the noise. With a cryptographic hash, if you have a one-bit difference in the input, you should get a completely different hash. So that's the key problem. Yes, the function F simply maps the information we give it to the complementary information that they have. So given, let's say, a retinal scan, it would return, let's say C in this case is the features that are stored for the identity that's trying to log in. And then L would look and say: okay, this user with this retinal scan had this stored fingerprint; do these match, yes or no? Any other questions? And then basically you need some functions to be able to add information to C or alter information. What are some cases where you would want to remove information or alter information? Yeah, you'd want to alter your password if it's been compromised. Yeah, alter your password, right? So change your password. So you want the identity, once they've authenticated to the system, to be able to change that password that's stored. What else? When somebody leaves, you may want to delete their account completely.
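The point above about noise can be made concrete with a small sketch. It is only an illustration: the feature vectors, the `tolerance` value, and the helper names are all made up, and real biometric systems use far more elaborate feature extraction and matching. It shows why two slightly different scans of the same eye cannot be compared by cryptographic hash, while a tolerance-based comparison of stored features still works.

```python
import hashlib

# Hypothetical feature vectors (made-up numbers, not real retinal data).
enrolled = [0.52, 0.17, 0.88, 0.34]   # stored when the user first registered
fresh = [0.53, 0.16, 0.88, 0.35]      # same eye, scanned again with sensor noise
stranger = [0.10, 0.90, 0.25, 0.70]   # a different person's eye


def feature_bytes(features):
    """Serialize a feature vector so it can be fed to a hash function."""
    return ",".join(f"{v:.2f}" for v in features).encode()


def sha256_hex(data):
    """Cryptographic hash: any change in the input gives an unrelated digest."""
    return hashlib.sha256(data).hexdigest()


def matches(template, sample, tolerance=0.05):
    """Feature comparison: accept when every feature is within the tolerance."""
    return all(abs(t - s) <= tolerance for t, s in zip(template, sample))


# The two scans of the same eye hash to different digests, so comparing
# hashes would reject the legitimate user.
same_hash = sha256_hex(feature_bytes(enrolled)) == sha256_hex(feature_bytes(fresh))
print("hashes equal:", same_hash)
print("features match (same eye):", matches(enrolled, fresh))
print("features match (stranger):", matches(enrolled, stranger))
```

The tolerance is what trades off the false negatives and false positives discussed earlier: a tighter tolerance rejects more legitimate scans, a looser one accepts more strangers.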
So is the difference between A and C that A is really only captured once? A is the information provided by the user that wants to authenticate. So you can think of it as the identity that they're trying to authenticate as, along with any other information that they have to try to prove they are who they say they are. So in the case of a username and password, A would be the username and password. C, if we're talking about a stored salted-hash database, would be the database of all passwords. And then F would map the user they're trying to authenticate as to the stored salted password. And then L would compare those two and say true or false: yes, the authentication succeeds, this person is who they say they are. And then S would be the ability to add your password or update your password. I mean, S is just a set of functions that basically take an A, along with whatever other information you need, and return your new A or C. So if I'm trying to authenticate as a user that doesn't exist, how does F map A to C with that information, since the result would not be in C? You could hand-wave that away by saying that C includes the empty set. And then if you get the empty set, you would say, okay, nothing, no information stored in here. Or maybe there's a guest user in C that it would return, and then L would use that to check: hey, guest user, go away. Because maybe you are allowing guest users in your system, right? So this captures that behavior as well. Again, just like when we looked at cryptosystems, this just gives us a high-level way to think: given an authentication system, you can say, okay, what are the different components here? How do they work together in the overall authentication system? So how would a password-based system be represented by this? Let's switch to handwriting mode. Okay, for a super simple password system, what would it look like here?
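As a compact summary, the five components described so far can be written down as follows. The typing of f and l is one reasonable reading of the discussion, not something written on the board, and the symbol for "no stored entry" is an added assumption covering the nonexistent-user case.

```latex
\begin{align*}
A &: \text{the set of authentication information a principal presents} \\
C &: \text{the set of complementary information the system stores} \\
F &: \text{complementation functions } f : A \to C \cup \{\varnothing\} \\
L &: \text{authentication functions } l : A \times C \to \{\text{true}, \text{false}\} \\
S &: \text{selection functions that add, alter, or delete entries in } A \text{ or } C
\end{align*}
% A principal presenting a \in A is authenticated when l(a, f(a)) = \text{true}.
```

Under this reading, f implicitly consults whatever the system has stored, and a lookup that finds nothing returns the empty entry, which l then rejects.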
Why can't I make this smaller? Okay, so we want a password system. So somebody describe to me, at a high level, what a password system does. They give you a password, and then you compare it with some stored version of that password. And then if they're the same, then they're the user; if they're not the same, then they're not the user. Yeah. So what else do they give you in addition to the password? The username? Right. Yeah, so username. So that's the authentication information. How would you describe that set? Yeah, so a tuple of username and password, I would say, of strings. We'll just call them, as much as I love writing, I'm gonna say uname and pass. We'll say all possible string combinations. So then what's C? Yeah, so right now we'll keep it simple. We'll say it's actually the same set as A. So we'll just store all the usernames and passwords in our database. And then F becomes really simple. I mean, we're not gonna write it out, but: take the username from a, look it up in C, and return that password, right? Because F is only mapping. F takes in an a and C, and you could easily write a function that basically, I'll use C like a terrible hash table. This is the great thing about actually coding: you can do whatever you want. So look up in C, based on whatever a's username is, and return that tuple. So then how do we actually authenticate this? We're given, we'll call them little a and little c. So what's our authentication function? The passwords are the same? Yeah, so let's write a.pass equals c.pass. I may need to also check the usernames. I think the lookup already guarantees that the usernames are the same, but I don't think there's a lot of harm in checking that too. Anything else? Good. Cool. And then what kind of S functions do we want? Change password.
So it takes in a C and a username and password, and it would find the user in there, update the password, and return a new C with that new password. Yeah. When people change their passwords, is it best practice to generate a new salt for that password? Or does it not matter at all? We have not talked about salts or hashing at all. So what is this? If you were implementing this, it just stores, in your complementary information C, the password for all users. So for right now, we would not do any of that. Which one? At the very top? The L. And what should it be? So if I just give you the correct username, should it be let in? What's the S? S. So this would be a change. So this is like, what kind of functions do we want? We want something to change passwords. What else do we want? Delete? Maybe reset. Add. Yeah, so very important, right? We start out with empty complementary data, so we need to add a new user to the system, or new authentication information to the system. And you could write these very similarly to how we did here, in English. Good stuff. Yeah, just write it. Can I ask a question? This username and password are what? The username and password are what? A tuple. I'm kind of abusing notation here. So the set A is a set of tuples where the first element is the username, which we're going to call uname, and the second one is the password, such that username and password are all possible strings. So A is a set containing tuples. The first element of the tuple is going to be any possible string, and the second element of the tuple is any possible string. If we had requirements on usernames or whatever, we could define them in here. Any other questions? So we just developed a super simple password-based authentication system. We'll see another way of doing that later. So yeah, basically, you could do this by just storing passwords without the user information. But I think it makes more sense to think about the user information.
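The super simple password system built up on the board can be sketched in a few lines of Python. This is only an illustration under assumptions: the lecture gives A, C, F, L, and S in words, so the function names `add_user`, `change_password`, `delete_user`, and `authenticate`, and the choice of a dictionary for C, are hypothetical. Passwords are stored in plaintext on purpose, because that is the naive scheme being described here.

```python
from typing import Dict, Optional, Tuple

AuthInfo = Tuple[str, str]   # an element of A: (uname, pass), any two strings
Store = Dict[str, str]       # C: uname -> stored password (plaintext, naive scheme)


def F(a: AuthInfo, C: Store) -> Optional[AuthInfo]:
    """Look up a's username in C and return the stored tuple (None if unknown)."""
    uname, _ = a
    if uname not in C:
        return None          # plays the role of the "empty" entry for unknown users
    return (uname, C[uname])


def L(a: AuthInfo, c: Optional[AuthInfo]) -> bool:
    """Authentication function: true only if usernames and passwords match."""
    if c is None:
        return False
    return a[0] == c[0] and a[1] == c[1]


# S: selection functions. Each one returns a new C and leaves the old one alone.
def add_user(C: Store, uname: str, password: str) -> Store:
    if uname in C:
        raise ValueError(f"user {uname!r} already exists")
    new_C = dict(C)
    new_C[uname] = password
    return new_C


def change_password(C: Store, uname: str, new_password: str) -> Store:
    if uname not in C:
        raise KeyError(uname)
    new_C = dict(C)
    new_C[uname] = new_password
    return new_C


def delete_user(C: Store, uname: str) -> Store:
    new_C = dict(C)
    new_C.pop(uname, None)
    return new_C


def authenticate(a: AuthInfo, C: Store) -> bool:
    """Full login check: map a to its stored entry with F, then compare with L."""
    return L(a, F(a, C))
```

Starting from an empty C, `add_user` creates the first entry, and `authenticate(("alice", "hunter2"), C)` then succeeds only when the supplied tuple equals the stored one.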
Cool. Okay, so what's the problem with this authentication system? All these usernames and passwords are stored together. Yeah, but it's my site. I own this site. And these are my users. Except are you allowing other people on your site? Sure. I mean, it's a site, there are accounts, people can sign up for accounts. That's my site, right? They're my users. So I know their usernames and passwords. That's their problem. Well, if they are storing sensitive data on that site, maybe you shouldn't be allowed to see those. If they're storing sensitive data on that site, but I own the site, I also own all the data that they're uploading. So why can't I see the passwords? Yeah, you should be able to. But if somebody pretends to be you, it's like a single point of failure. So if somebody pretends to be me, then what could they do? Right. So if anyone breaks into my system, they get the list of all the usernames and passwords in my system, right, with this C. Or if they trick my application into somehow divulging C, now everyone knows everyone's usernames and passwords. And then what can that person do? Log in as any user, right? They can authenticate to the system as any user, because they now know all of the passwords. I don't know, should I really care about that? I feel like that's your problem. I build really safe software. Wouldn't your information and credentials also be stored in C? I don't use the system. I have a problem. Well, ultimately, you're running the site probably to make money. That is a true statement, depending on your business; some businesses put making money ahead of security. Yes, there was. What's the company, what's the name of the... oh, I think it's Equifax. It's now back to exactly where it was before the breach was announced. 40 or 50%, and now, maybe even better. Yeah, companies turn around and sell to all the users that they exploited, or well, they didn't exploit them, but they're the users that were exploited. Yeah. Yeah.
Okay, so I don't know, what do you think? Is this a problem or isn't it a problem? At the end of the day, you may be a developer who's implementing an authentication system. Yeah, so maybe I'm trying to talk about me as the organization, but an organization doesn't actually exist. It's composed of different people, right? So you may have the database administrator who has access to maybe some of this database, and you may have the developers who have access to some things. But they don't just have the ability to go log in as any user. But if they can see all these passwords, you have to trust that no employee is ever going to use this information to log in as a user. You also have to worry about whether C, your database of usernames and passwords, is ever released to the public. If it is and you know about it, I mean, that's good, because you can use your reset-password functionality to forcefully change everybody's password. If you don't know about it, what happens? Yeah, someone else could use the leaked passwords to log in. Yeah, as your users, right, potentially causing harm to your users, and you're probably going to get inundated with calls, depending on how they do it, from people whose money has been transferred and who didn't authorize it, and you now have a huge problem. And then we get to the problem of password reuse. So I said that's not my problem. Is that true? So what do I care about as a person who is, let's say, running a password-based system? What are the password-based systems that you use? Email, bank, literally almost everything, right? And they're all websites that, at the end of the day, are usually trying to make money. Right. So one way of thinking about it is saying, well, we don't want our users being harmed, right, either by us accidentally, or by rogue employees, or by people hacking into us and stealing this database. But let's say that a user has used their password on multiple sites.
Somebody give me the odds that one website gets hacked and all their passwords are released. High? I should, you know... no, I think it's relatively low. Say it again? If it's any one website, it's relatively low. Any one website, yeah, I'd say it's relatively low, but also over time, probably almost certain; I don't know how to phrase that. But even if it's 10%, now, if you've used the same password on two different websites, what are your chances of your password being leaked? Roughly double, because now each of the two sites has a 10% chance of getting hacked, and either one means that an attacker can now log into both sites. Now, what about three sites? Four sites, five sites, six sites, the hundreds of sites that you use daily, including your bank, your email, along with your stock company, and the company you use because you do want to save your high scores on your video game. Now any of those gets compromised, and now an attacker has access to all of your other accounts. Is that a problem? Yeah, I mean, I would think so. Does this do anything to combat that problem? No, not really. That is actually more of a recommendation for everyone here not to reuse passwords across sites. We'll get into it more later, but my recommendation is to use a password manager. It has its own issues. But I use LastPass. I like it a lot. It works on mobile; the new iPhone update is awesome, because you can put your passwords right in from LastPass into the website. It's great. So how do we solve this problem? What's the key thing that we need? Okay, let's think about it in terms of requirements. What don't we want to do? We don't want to store the password. We don't want to store the password. So why don't we use encryption, right? We have this great encryption stuff. Isn't that the entire purpose of encryption, to take something from plaintext and turn it into something that people can't decipher?
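The password-reuse argument can be made concrete with a little arithmetic. The sketch below assumes each site is breached independently with the same probability, which is a simplification and not something the lecture states; the 10% figure is the one used in class, and the function name `p_any_breach` is made up for illustration.

```python
def p_any_breach(p_single: float, n_sites: int) -> float:
    """Chance that at least one of n_sites leaks the shared password.

    Assumes breaches are independent and equally likely at every site.
    """
    return 1.0 - (1.0 - p_single) ** n_sites


if __name__ == "__main__":
    # Print the exposure for a password reused on 1, 2, 5, 10, and 100 sites.
    for n in (1, 2, 5, 10, 100):
        print(f"{n:3d} sites: {p_any_breach(0.10, n):.4f}")
```

Under those assumptions two sites give 1 - 0.9^2 = 0.19, slightly less than double the single-site figure, and the probability keeps climbing toward 1 as the same password is reused on more sites.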
So what if now I generate a super long random 256-bit AES key, K, and then every password that comes in, before I store it, I encrypt it with K. And so now what's going to be in C? A username and not a password, but what? Yeah, so we won't do the whole thing, but basically it'll be uname and the encryption of pass with K. Now, if somebody steals C, what do they get? A list of usernames and encrypted passwords. A list of usernames and encrypted passwords. And did we just solve all of our problems? Why not? Yeah, you get the key, you get everyone's password. Why is that a problem? It's a super long random 256-bit key. It also semi-leaves you open: if they know a password that someone reused on your site from another site, they can use that to try to attack the key on your site as well. So modern symmetric cryptosystems are designed to resist chosen-plaintext attacks, so you should not be able to do that. Yeah, if they hacked your site already to get C, then it's likely that they might have also been able to hack it and get K. Yeah, so where is K going to be in this abstract system, on the website or on your server? So where is K? We just put it here. But where else? Who else needs to know about K? L needs to know about K. What else? F maybe needs to know about... actually, F just needs to look up the user in the database, right? It doesn't need to know about K. S: everything in S probably needs to know about K. So here you have this bit of information, that 256-bit key, that 90% of your system needs to know about. So if somebody's able to hack in... and the question is, where does that key live? Do you put it in the database? Along with C? Yeah, it has to be separate, but still accessible by all of these components. So this is a dangerous game to play. Because you think you're being super clever, you're using modern encryption that is very good encryption.
And so, fundamentally, right, once they get this, if they have everything that's been encrypted with K, so they get the whole database and they get K, what do they then do? Just decrypt everything. Everything. They have all the plaintext passwords. And there are companies, you can look them up, I don't want to accidentally shame a company. There's a company whose data was breached that was basically doing this. And it's one of those things where, I guess, technically, it is slightly better than doing nothing, assuming you're doing it properly and being careful about where you store your keys, but basically every server that you're running needs to know the key. And so keeping that data confidential is very, very difficult. And as soon as that's leaked, then the whole game is over. Right? So you have this key that, rather than being a secret, actually needs to be on every piece of your infrastructure. And now your attack surface is: steal the database and steal these 256 bits from the machine, which, if they can get to your database, odds are they'll likely be able to do somehow. So scrap that. Yeah, question? So public-private key crypto. So what do you want to do with your scheme? I guess, like, when you create a username, you encrypt it with the public key, or something, and then it gets stored in the database with your public key. And then if you want to change or access it, you decrypt it with your own private key that the user has. Yeah, okay. So that way each user has their own decryption key. But I know public-private key crypto takes a long time. Yes. So the short answer is, for something like this, logging into a website, yes, this is usually overkill. There are a lot of other challenges to actually getting that system working, because if they have their secret key, the only thing they can send you is things signed or encrypted with their secret key.
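To see why storing passwords encrypted under one server-side key K falls apart once K leaks, here is a small sketch. Big caveat: Python's standard library has no AES, so `toy_encrypt` and `toy_decrypt` below are a made-up, hash-based XOR stream cipher standing in for "encrypt with K". It is not AES and should not be used for anything real. The sample users and passwords are hypothetical too.

```python
import hashlib
import secrets
from typing import Dict


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256(key || nonce || counter) blocks. NOT real AES."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, plaintext: str) -> bytes:
    """Stand-in for E_K(pass): a fresh random nonce followed by the XOR ciphertext."""
    nonce = secrets.token_bytes(16)
    data = plaintext.encode()
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(key: bytes, blob: bytes) -> str:
    """Inverse of toy_encrypt: split off the nonce, regenerate the stream, XOR."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode()


K = secrets.token_bytes(32)  # the "super long random 256-bit key"
C: Dict[str, bytes] = {
    "alice": toy_encrypt(K, "hunter2"),
    "bob": toy_encrypt(K, "password"),
}


def L(uname: str, password: str) -> bool:
    """L needs K: it has to decrypt the stored value to compare passwords."""
    return uname in C and toy_decrypt(K, C[uname]) == password


def attacker_with_C_and_K(stolen_C: Dict[str, bytes], stolen_K: bytes) -> Dict[str, str]:
    """With both the database and the key, every password falls out at once."""
    return {uname: toy_decrypt(stolen_K, blob) for uname, blob in stolen_C.items()}
```

The point of the sketch is structural: L (and anything in S) has to hold K to do its job, so an attacker who reaches the same machines gets C and K together and recovers every plaintext password in one pass.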
So you would have to generate some challenge for them, where they would then take that, sign it with their secret key, and send it back to you. And then you would need to verify with their public key that it was signed correctly and that it was the right thing. And then you need to worry about time, so that somebody else can't steal that communication and just log in as you. And then the other thing is, if you think about a web browser, you would either need the same secret key on every single device that you use, which is difficult. How do you do that? Or you need to have multiple public keys. And now you have the bootstrap problem: how do you log in from a new machine the site has never seen before? So it is done, and this is how a lot of infrastructure that talks to each other works, but for something like HTTP, which was never actually intended for something like this, doing that is difficult. So what else? That was good. Good. Yeah. Encrypt... wait, again? Okay, so then we do E of uname with pass, and then E of pass with uname. Cool. Okay, so the username is encrypted with the password as the key. The password is encrypted with the username as the key. So somebody steals this database, what do they get? A bunch of ciphertexts. Can they go backwards? Hopefully not. Hopefully not. What do they do? Are they stuck? Have we foiled them? Feels like there's an attack there. Yeah, I wouldn't know what it is. Yeah, it'd be a dictionary attack, and then you do what? So you could go... so just try every password. What would your first password be? Password. "password". So you try that on every one of the encrypted usernames. As soon as that pops out a username, you now know the password for that username. And then you try probably password1 or QWERTY or, I'm not up to date on all the common password schemes. Yeah. Also, for a lot of sites, the username has a form.
So you could just take common names like John and try them. Yeah, so then in this scheme, now you could attack the username, right? So you could try taking a username that you guessed and decrypting the password. If you get something that looks like a password, you're correct. And then you can verify it on this one by using that password to decrypt this thing and popping out the same username that you started with. Are usernames secret? Generally not. Generally not, right? So think about if you did this kind of scheme: now the username becomes a secret piece of information that you need to hide. So you couldn't actually have usernames anywhere on your website, because an adversary could just crawl for usernames, use them all, try each one on every single entry, and then break basically all of your passwords and all the usernames. But if it's emails, I just get email lists; you can get lists of emails and figure out who your users are. That's slightly better. But yeah, I think the key thing is that the whole point of the username-password approach is that the username should be available or knowable to everyone and only the password is secret. All right, so what else? A hash. Why is a hash useful here? What are the properties of a hash that make it amenable to this? So different passwords are going to generate drastically different hashes. Right, so different passwords generate different hashes, and you can't reverse engineer it. It's a one-way function, right? Let's assume we're using a cryptographically secure hash function. So take some hash function of the pass. So we'd store the username and a hash. Well, that's ugly. So what does this get us then? So now let's go through an attack scenario. Just like before, we said C leaks. Now what? So they get all the usernames, which is fine; we'll assume those are public anyways.
Can they take that hash and go backwards to the passwords? Yes, in what sense? Can they take a hash and go backwards? Yeah, without any other information. Okay, that's what I wanted to talk about first. So just getting a hash, without any outside information or any other knowledge, going backwards should be incredibly difficult. But that is only true in what case? When the password is random or large enough that we can't guess it. So now you want to go? No, no, we're there. I was setting it up. So it's still vulnerable to a dictionary attack, because assuming the hash function is public, you can just try common passwords, hash them, and see if the hash is the same. Yeah, so actually, even before that, would you know which users have the same passwords? Yes. Why? It's what? Well, it's N-to-one. All right, the same password will hash to the same value. Let's be precise about what that means, right: it now means that every password that is "password" you can identify from the hash, right? And you can even just easily do statistics and say which users are using the same passwords. And then if you needed to figure out those passwords, you could brute force it in terms of length, or however you want to do it in terms of most-likely-password lists: you take that password and hash it and see if any of the hashes match. And now you know the password for all those users. Do you actually know for a fact that it's definitely their password? No. No, but what do you know? It hashes to the same value. So it doesn't matter. I mean, it's theoretically possible for two ASCII strings to hash to the same value. And so it's possible that you didn't find their actual password, but it doesn't matter, because you're still into the system. Cool. So what were the problems with this approach? That sounds pretty clean, right? And actually, to go back, we'll do some history.
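Both attacks on the unsalted username-plus-hash scheme fit in a short sketch. SHA-256 from Python's standard `hashlib` is used as the "cryptographically secure hash function"; the lecture does not name one, so that choice, the sample users, and the function names are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def h(password: str) -> str:
    """Unsalted hash of the password, as stored in C in this scheme."""
    return hashlib.sha256(password.encode()).hexdigest()


# A leaked C: uname -> h(pass). Usernames are assumed public anyway.
leaked_C: Dict[str, str] = {
    "alice": h("password"),
    "bob": h("qwerty"),
    "carol": h("password"),
    "dave": h("x9!fQ2#long-random"),
}


def users_sharing_hashes(C: Dict[str, str]) -> List[List[str]]:
    """Equal hashes mean equal passwords, so just group users by hash value."""
    groups = defaultdict(list)
    for uname, digest in C.items():
        groups[digest].append(uname)
    return [sorted(unames) for unames in groups.values() if len(unames) > 1]


def dictionary_attack(C: Dict[str, str], wordlist: List[str]) -> Dict[str, str]:
    """One hash per guess checks the whole database via a hash-table lookup."""
    by_hash = defaultdict(list)
    for uname, digest in C.items():
        by_hash[digest].append(uname)
    cracked = {}
    for guess in wordlist:
        for uname in by_hash.get(h(guess), []):
            cracked[uname] = guess
    return cracked
```

With a wordlist like `["password", "password1", "qwerty"]` the attacker computes three hashes in total and recovers alice, carol, and bob; dave's long random password is not in the list and stays safe.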
Remember, when we talked about Unix, where does Unix store the list of users' passwords? /etc/passwd, P-A-S-S-W-D. Were there any passwords in there when we looked at it? No. No. But it used to be that, yes, they didn't store passwords in there, but they stored hashes of passwords in there. And so your password had to be, here, we'll only focus on passwords, not usernames. So here, your password was a string of eight characters or less. And this was dictated by the systems at the time. Oh, so wait, okay, more history first. Okay, so you have a Unix system. Now you have a password file where everyone can see the hashes of everybody's password. Why is that a problem? You've just given them C. You've just given them what? You've given everyone on the system who can view that file C. Okay, what's that last one you said? C? Oh, C. Yes. That's why we're talking about this. Yeah, so you've given them C. So they can see all the hashes, which has all of the problems. Now, can they change it? Can an average user change that file? No. No, it's world-readable but writable only by root. Everyone can read that file, though, because they need to be able to see the other users on the system and what their user IDs are and all. So what's the problem with that? Why is that an issue? Yeah, the same attacks as before are available to people who can view that password file, and they can also see users that have the same passwords. Yeah, or even as you: I mean, you don't have to change it, but you can easily just view the password file and see which users are using the same password, possibly the same as you. If your password is "password", you can see every other user that's using that. You can calculate the hashes, you can do all the attacks we just talked about. And so what's the key problem here? That the same password always hashes to the same value, right?
Because without any other information, we can't go backwards from the hash. But every hash is deterministic, and we definitely don't want nondeterminism in here. So what can we do? Somebody who doesn't already know. We could use, like, some other information about the user, like not their username. Why can't we use their username? So say we use their username. Then what would happen? So two people have the same password, "password". And the same username? Yeah. No, they can't have the same password and the same username, otherwise they're the same user according to our system. So let's say they're different usernames, same password. So we'll just combine them together, concatenate them, and hash it. Will the hashes be the same? No, no. So we can change our scheme slightly. We'll use vertical bars to denote concatenation. So we can't make the username the password, or anybody else can do that? So they could. So we've at least gotten rid of the same password hashing to the same thing. So what somebody would need to do is, let's say they wanted to break my users' passwords. So I guess, question: before, what did somebody have to do to find every user in the system with the password "password"? Just hash "password" and then look through all the hashes and see who has that, right? You create a hash table out of all the hashes. So it's basically one operation. It's trivial. Now in this scheme, can I do the same thing? No, it's not as easy. Usernames are public. But how many hashes do I need to compute to figure out which user accounts have the password "password"? Before, how many hashing operations did I have to do? One. One, and I can test the entire database. Now I have to do individual hashes against the individual usernames, which increases the attacker's workload. We won't actually use the username, though, since it can vary in length and in terms of randomness.
So we're going to create a salt and add that to the password before we hash it, but the salt specifically will be public. And so, anyways, we will come back to this after the midterm. You'll forget, but you won't study this. How are we doing? Is it considered best practice to re-salt if they change their password?