All right, folks, let's get started. Hopefully, everyone had a good fall break. We're going to keep going with authentication. You all have been signing each other's keys and trying to verify each other's identities, and hopefully you're learning a lot about trust. That's amazing. Okay, so to refresh everyone, since it's been a week since we met and talked about this: we are talking about authentication. So what is an authentication system trying to do in the larger context of security? Who somebody is, right? So whereas authorization says what you're allowed to do, authentication asks the question of who you are. So what are some of the ways that we can do authentication? Username and password. And what is that actually authenticating? Why is that authenticating you? Yeah, so the user that created the account should be the only one that knows the password to that account. What are some of the types of authentication mechanisms that we talked about? Yeah, physical location. Physical location, what else? Using your phone to verify. Using your phone or a secondary device in order to verify, so you're verifying that somebody has control of that device. What else? Yeah, oh, sorry, a sensor, like fingerprint scanners? Biometrics generally. So what's a biometric that's not a fingerprint? Iris, yeah, eye scanners, all this kind of stuff, facial recognition. All right, so we talked about an authentication system abstractly as having different types of functions, and so we can think of an authentication system as trying to map who the user is to their identity in the system.
And so we have a number of components here we can think about: a set of authentication information, a set of complementary information, functions that turn authentication information into complementary information, and a function that verifies somebody's identity and says yes, they are who they say they are. So we looked at classic Unix authentication: the authentication information is passwords, strings of eight characters or less, and the complementary information is the stored hash along with the two-character hash ID, which, as we talked about, essentially acts as our salt. The function that translates somebody's password into the complementary information is a version of DES that's specific to the hash ID, and there are different ways to log in. So we looked at it and we saw the whole process of an external entity logging in and authenticating to a server using their password, and how the server is actually able to verify their information. So here in this example, why are we storing with Alice this thing that's not Alice's password? The hash of her password, why are we storing that? If the server's database leaks, you don't want everyone to see plaintext passwords. Yeah, so a hash is a one-way function, right? Meaning that it's easy to compute: give it a password, compute the hash. It's difficult to go backwards, to go from the hash and find a password that hashes to that value. So now if we put on our attacker's hat, what's our goal as an attacker? How can we attack an authentication system? Find whatever the hash function is. How does that help us? Okay, so I guess that goes to threat modeling here. So what's our model, right? Does the attacker have knowledge of the system or not, right?
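As a rough illustration of the model just recapped, here is a minimal Python sketch. It is not the classic Unix implementation: salted SHA-256 stands in for the DES-based function, and the names `database`, `f`, `register`, and `login` are made up for this example.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-in for the password file:
# username -> (salt, complementary information C)
database = {}

def f(password: str, salt: bytes) -> bytes:
    """Map authentication information A (the password) to complementary information C."""
    # Assumption: salted SHA-256 replaces the DES-based function from the lecture.
    return hashlib.sha256(salt + password.encode("utf-8")).digest()

def register(username: str, password: str) -> None:
    # Two random bytes, loosely echoing the two-character salt.
    salt = secrets.token_bytes(2)
    database[username] = (salt, f(password, salt))

def login(username: str, password: str) -> bool:
    """The login function L: recompute C from the supplied A and compare to the stored C."""
    entry = database.get(username)
    if entry is None:
        return False
    salt, stored_c = entry
    return hmac.compare_digest(f(password, salt), stored_c)

register("alice", "correct horse")
print(login("alice", "correct horse"))  # True
print(login("alice", "wrong guess"))    # False
```

Only the salt and the hash are stored, so a leaked `database` does not directly reveal Alice's password. A fast hash like SHA-256 is used here only to keep the sketch short.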
We'll take a similar approach to crypto, where we assume the attacker knows the system, because otherwise we're banking on the fact that they don't understand how the system works, right? And that's not smart. We want the system to still be secure even if the attacker knows how the system works. So that's a good point. What other goals do we have? Yeah. So, if I remember correctly, hashing is about integrity as opposed to necessarily secrecy. So if you're an attacker and you know how the system works, you want to get the message and then find another, different message that has the same hash. There are a couple of hash functions where that's possible, so you'd want to find a way to do that, and that way you match the hash even though the actual message is very different. Okay, so we're talking more about how to break this specific hash function, right? One of the ways is to find a different password that has that same hash value. That's as good as having the password, because the hash is all the system checks to verify. But at a high level, when we think in terms of this authentication system, what is the attacker trying to do, or what would the attacker like to do? What are even the goals of the attacker? What's the attacker trying to do here, yeah? You want to convince the system that you're someone you're not. Yeah, so you want to impersonate or spoof, to pretend to the system that we're somebody we're not, right? Yeah. If you have a list of precomputed hashes, you can look up your hash in it. Forget hashes. I'm not thinking about hashes anymore, right? Let's think about what the goal is. Okay, we want to impersonate somebody. So how can we do that? What do we need to do that? We need somebody to impersonate, so maybe we need to know about a user account or a username. What else? We need to know what attributes that user has to access any data that we're trying to download.
Yeah, so what does that user use to authenticate to the system? Their username and their authentication information, right, that's that A. Maybe it's their fingerprint. We don't actually care, right? This model doesn't actually specify what that authentication method is. What if we steal that information? Right? Is that good? Can we impersonate the user at this point? Potentially. Why? Why not? There could be some other factors, like whether I'm in my office or something. But this is everything in our model, right? Our model says that the user inputs some authentication information A, so something from our authentication set, there's a function that then maps that to the complementary information, and that complementary information is then matched against whatever's in the database. If it's in there, we're good, right? So at a high level, this is all we care about: can we find an A such that, with the specific function that maps authentication information to complementary information, the result is correct for that user? So this is our goal, and we can think about different ways to do this. And this is what we've been talking about and focusing on, right? So if the attacker has the complementary information, and this would be the case where the hashes on the server are leaked, then the attacker's goal becomes: find some A such that when you pass it through this function, the hash function, you get C. Yeah. So you're saying find some A such that the hash of A equals some C in the C set, just the set of all possible values, rather than actually the password? Let's say that it's specific, that this C is given for a certain user. So for this attacker goal, that C is associated with some entity. So, yeah, you need another clause on there that says the login function returns true.
So then, if the hashes and the passwords are the same, is this a difficult goal for the attacker or an easy goal for the attacker? So they've stolen the complementary information from the server? I mean, I feel like it's going to be kind of hard to brute force a bunch of plaintext passwords. But, okay, so let's say the complementary information is the same as the password. Oh, okay. Right, so plaintext, we've picked up plaintext passwords. Yeah, easy. Right, easy, because this F is the identity function, right? Pass in something, do nothing to it, and return it. So the attacker can easily log in, right? But now, if this F is a true cryptographic hash function, what does that mean if you have C? Preimage resistance, so it's pretty much impossible to get A. Yeah, so you should have to brute force all 2 to the, whatever the strength of the hash function is, say 2 to the 256, possible values to find one that maps to the same value. What would you rather brute force instead? What would be another technique? Because this would be breaking the hash function. Maybe just try a bunch of, I mean, I guess if it's one user, I don't know, but like a dictionary attack or a bunch of common passwords? Yeah, so what if I tried the password "password"? Right, put that through the hash function, see if that's the value. What would you try after "password"? "password1". "password1"? What else, what are other common passwords? Just don't say your actual password. And if you're using any of these passwords that we're talking about, you should change that. 123456. Yeah, 123456. Yeah. A birth date or a loved one's name. A birth date, or maybe the name of somebody that they know. asdf, qwerty. What's that? asdf, qwerty. Oh yeah, asdf, the first characters, or qwerty. Yeah. Or your username.
Yeah, the user's name, or their username. Yeah. Let's go back here. Anything like a dictionary type of attack? Yeah, we could try all the words in the dictionary, right? There's probably 30,000 of them. I don't know the exact number. But that would be pretty easy to check, even with a good hash function. Cool. Okay, so what if we don't have the complementary information? So this would be the case where, let's say, you're trying to break into some web service that has a username and password. So what are you trying to do in that sense? What's the goal? You just need any valid username and password. So you need any valid username and password, right? So you're saying we need to find some username and password such that when I pass them to the login function, it returns true. Right. So what's the key difference between these two approaches? Mentally, what's the actual difference here? I mean, some difference is clear, right? Here we have the complementary information, here we don't. Do you need to find the complementary information? The hash, you don't need to figure out the hash here. Great, so let's say there's a website. What's the difference between having access to the complementary information or not? So think about the situation with the website, yeah. It looks like in one you can keep brute forcing. Yeah, okay, so two separate things, right? In either case, we still have to guess A, so we can't really, in some sense, backtrack, because if it's a good hash function, we can't go backwards, right? So we can't go backwards, but if you have multiple hashes, can you determine what the rest of the passwords are? Okay, so yeah, if you have multiple hashes, you could guess against them in parallel in some sense. You could also, yeah, I'm just saying, in both you're guessing, but in one you check the guess against the value you already have, and in the other one you have to hit the web server every single time.
Yeah, so here you never have to talk to the website at all. You can just compute this locally, right, to try these guesses, whereas here you have to actually try logging into the website. If you have a smartphone, what happens when you mistype your PIN or passcode five or six times in a row? You get locked out. You get locked out. Why does it do that? So you can't just keep guessing, yeah. To prevent you from easily brute forcing the PIN code, right? You get five or six guesses, and when they're all incorrect in a row, it locks your phone so you can't try again for a minute. Where is that enforced in this model? The second one, right? It has to be in this login function, because, I mean, the phone is literally doing exactly this, right: taking a passcode and checking it against something. If you had the hash that it's checking against, you could easily brute force and crack this. But you don't, and you're trying to brute force it without the complementary information, so you're trying to brute force PINs, and because the system knows you're going through the login function, it can make that process slower and put up barriers in there so you can't easily guess your way through. So how do we prevent attacks? Let's look at some things. We just talked about some of them, right, so we know what the attacker wants to do, yeah. Have nothing worth stealing. Have nothing worth stealing. Have a very boring existence, I guess, or web service? I'd say even if you think you have nothing worth stealing, there probably is something worth stealing on there. Yeah. What about two-factor here? So maybe using multiple of those factors, so maybe increasing the difficulty for an attacker. Does that fundamentally prevent them? Not quite, yeah. Could you lock an account after five attempts, five failed attempts? So we could, like we said, lock an account, and then what happens? How do they unlock their account?
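The point that the lockout has to live inside the login function can be sketched like this. It is an illustration under assumptions, not how any particular phone or service does it: the threshold of five failures and the 60-second lock are taken loosely from the discussion, and `ThrottledLogin` and `check_password` are hypothetical names.

```python
import time

MAX_FAILURES = 5       # assumed threshold, matching "five or six guesses"
LOCKOUT_SECONDS = 60   # assumed lock duration, matching "for a minute"

class ThrottledLogin:
    """Wraps a password check so that online guessing is slowed down."""

    def __init__(self, check_password, clock=time.monotonic):
        self._check = check_password   # callable (username, password) -> bool
        self._clock = clock            # injectable so the behavior can be tested
        self._failures = {}            # username -> consecutive failed attempts
        self._locked_until = {}        # username -> time when the lock expires

    def login(self, username: str, password: str) -> bool:
        now = self._clock()
        if now < self._locked_until.get(username, 0.0):
            return False  # locked: the guess is not even evaluated
        if self._check(username, password):
            self._failures[username] = 0
            return True
        count = self._failures.get(username, 0) + 1
        self._failures[username] = count
        if count >= MAX_FAILURES:
            self._locked_until[username] = now + LOCKOUT_SECONDS
            self._failures[username] = 0
        return False
```

While the lock is active, even the correct passcode is rejected, which is what makes brute forcing through the login function slow. The usability cost of that choice comes up in the discussion that follows.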
Are they just completely hosed, never able to access your service ever again? Email and reset, probably just send an email. Maybe, so force them to use an email to reset their password to get access to their account, so then the attacker would have to control their email. Another way to do it would be to force them to call customer support to reset and unlock their account. Yeah, could you just force them to wait before they try again? Yeah, sure, you could force them to wait a day or an hour. Is that great for usability? I mean, people frequently, I mean, I just did this the other day, almost locking myself out of, I forget if it was an iPhone or an iPad. I put in a wrong password and it locked up for five minutes, and I had to make sure my next one was good, otherwise it was gonna be locked for like an hour or something, which is very, very annoying. What else? So maybe, and we talked about this a little bit, we can try to hide some information, right? We can try to hide the authentication information, we can hide the function that we're using, we can hide the complementary information. So we talked about this on Unix, right? We looked at how passwords and hashes are done, but where are passwords actually stored on a Unix system? When you SSH to general, well, I don't know that general is a great example, but when you SSH to, like, a standard Unix server and you type your username and password, where are the passwords stored? In the shadow file. In the shadow file, so in /etc/shadow, right? So it's in a special file that's only readable by root, so you as a user can't access it. And again, the alternative being, if your passwords, your A, are just open to everyone, you've basically made the attacker's life very easy. So we definitely don't want to give passwords away, and we also want to try to hide the complementary information and prevent people from learning it.
Can we hide the login function? Right, so we looked at the other one: if we prevent people from getting the complementary information, then we can say at least they can't just try to brute force or guess things offline. Right, but we have this login function. Can we just hide that and prevent the attacker from accessing it? Obfuscate it, like at least how it operates on the two inputs? You can obfuscate how it works, sure, but what's that gonna prevent the attacker from doing? Just figuring out how it processes the data. Does the attacker care how it processes the data? I feel like if you get access to C, you can still limit which hash function it is, just from the size of C, like if it's 256 bits. Yeah, so that goes for even trying to hide the hash function, and it's why we want to use the same argument as in cryptography, where we say, assume that the adversary knows how the system works, right? The security you get through obscurity is low. Say they just get one piece of complementary information, but if you have an account on there, so you know what your own password is, you could try a bunch of hash functions until you find the correct one. What about the login function? Can we just hide that? Why not? Somebody needs to log in, right? You need somebody to be able to log in. So the way to think about this, it's like, I don't know, making a house with no doors on it, right? Going back to the house example. Like sure, nobody can break into your house through your front door or windows or whatever, but now nobody else can access that house either. So similar to this, the way to think about it is: what information can we hide? So again, what information does an attacker know? Or what do they need to do? So let's go back to the attacker goals. What does the attacker need to be successful in attacking an authentication system? The user's credentials? They need to know the password, so the user's credentials. What else?
At this point, if I know your password, I don't need to know the hash function, right? I don't care, the system will take care of that for me. What do I need to know, though? Your username, right? So what information can the login service provide that makes the attacker's job easier? Yeah, you guys have used websites before, right? What are some of the error messages you've gotten back when you've tried to log in? Yeah, so, right, rather than the login function just saying true, you're logged in, or false, you're not logged in, you can think of other information that sites give you, right, like that the username doesn't exist. And why do they say that? For user experience, right? To help the actual user that's logging in: maybe they accidentally mistyped their email address or whatever the username is, or they entered a username that they use on a different site, right? So if it's giving you that information, then what other information is it giving you? Yeah, so it tells you who doesn't exist, and then if you type in a real user ID with the wrong password, what does it tell you? That it's a valid name. Yeah, so if the user doesn't exist, it tells you the user does not exist. But if the user does exist and the password is wrong, it tells you the password is wrong. You can use this information to figure out what user accounts exist on the system. What are some popular online services that you use where you don't have to guess what people's usernames are? Twitter? Yeah, why Twitter? It's built into the service, right? You can go look up every single person's user ID, and you can use that user ID to log in. What else? Instagram. Instagram? What else? Yeah, most social media. So if it's not your username, then what do you use to log in? What is that username? Email. Email?
Are your emails private? Your email addresses? Especially if you're using your ASU email, as you're well aware, you can search the directory and look that up if you're a student. What about if you're not a student? You still need to worry about that. Guess why? Where would you have used that email such that maybe it's public? Yeah, this one is not public, thankfully, but maybe yours happens to be public. Yeah. You sign up for a newsletter. Yeah, so maybe you sign up for a newsletter, and you don't know where that email propagates from there. What else? GitHub. GitHub, oh yeah, that's right. When you make a commit on GitHub, your email address is usually in the commit. What else? Oh, your LinkedIn account. LinkedIn usually has your email address. What else? Yeah. Oh, you're on Piazza, right? But publicly, if you have a website with your contact information, that'll have your email address on it. If you've ever made any post to a public mailing list, those are usually archived and publicly available, so those email addresses are there, right? So we can't hide L completely, this is kind of the idea here. You need to actually be able to log in, but one of the good techniques is to reveal as little information as possible about whether an attack is succeeding or not, right? But again, it's a user experience trade-off, so this is kind of a business decision that different services need to take into consideration, whether they should tell you that the username doesn't exist. Because if I know the user IDs are email addresses, right, I could actually just harvest a bunch of email addresses and start trying to log in as them, and I'll know which email addresses have accounts and which don't. Are there web services where even the fact that you're a user of that service would be sensitive information? Suggest an example. One that's appropriate for the class.
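To make the error-message point concrete, here is a small Python sketch contrasting a login that reveals which usernames exist with one that gives the same answer for every failure. The user table, the messages, and the function names are all invented for illustration, and unsalted SHA-256 is used only to keep the example short.

```python
import hashlib
import hmac

# Hypothetical user table: email -> hex digest of the password.
USERS = {"alice@example.com": hashlib.sha256(b"correct horse").hexdigest()}

GENERIC_ERROR = "Invalid username or password."
_DUMMY = hashlib.sha256(b"").hexdigest()  # compared against when the user is unknown

def _matches(stored_hex: str, password: str) -> bool:
    candidate = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(stored_hex, candidate)

def leaky_login(username: str, password: str) -> str:
    """Distinct messages let an attacker test which accounts exist."""
    if username not in USERS:
        return "User does not exist."
    if not _matches(USERS[username], password):
        return "Wrong password."
    return "OK"

def quiet_login(username: str, password: str) -> str:
    """Same message for unknown users and wrong passwords."""
    stored = USERS.get(username, _DUMMY)
    ok = _matches(stored, password) and username in USERS
    return "OK" if ok else GENERIC_ERROR
```

With `leaky_login`, a harvested list of email addresses can be sorted into accounts that exist and accounts that don't. With `quiet_login`, both cases look identical to the caller, at the cost of the user experience trade-off mentioned above.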
I think I know a website where people would go to have affairs. Adult Friend Finder, that kind of site, yeah. Yeah, right? So, we won't get into it, but they actually had a big data breach, so usernames, email addresses, and I think password hashes got leaked, and then people actually used that information to go extort members of the site. So they would figure out from the email address who the person was and whether they were married, and then send them emails threatening to tell their family that they were on this site. So there's real-world impact, right, from just even knowing that somebody has an account on this site. Does hiding your email put pressure on you to use a different one for every site? I know you're supposed to use different ones for multiple things, but I am lazy. It's a choice, right? So I think one good principle is to not leak information to the attacker about what accounts are valid or not. For instance, if you knew there was an admin user on the system, that would give you a high-value target, and you'd try to break that specific password. Whereas maybe for other sites it doesn't matter, right? And as we'll see, you should choose a password such that it doesn't matter that somebody knows your user ID, right? So that's more on the user. So it's not necessarily about obfuscating your user ID and using a separate user ID per service. It's more about what an attacker can easily learn. So let's say, I mean, if I was gonna try to break into a company's, maybe not an internal web portal, but some web portal for their company, I would try different emails. So I'd harvest all the emails at that company and I'd try those emails. Oh, actually, GitHub's a great example, right? So I'd try to harvest all those emails from that company.
I would try logging in, like logging into GitHub, to shrink that list down to email addresses that I know exist on GitHub and are associated with this company. And then that would give me a list of users to brute force from there. There are other things we can do. We can prevent access to L in some sense, right? So for instance, MyASU, right? There's a username and password login that you all use to access MyASU. Can anyone try to guess your username and password on MyASU? But you could see ASU restricting access to only when you're inside the ASU network. Why would they wanna do that? So this way people certainly can't just randomly guess usernames and passwords, and maybe get in because the password is "password", right? But what's the downside there? It would be pretty annoying to have to be on campus to sign up for classes or access, it's not Blackboard anymore, what is it, something else? Yeah, to access that stuff when you're not at ASU. If you're traveling or whatever, you'd just be completely without access. So this happens a lot: a company or a corporation could do things like only allow access to a certain system from specific IP addresses, either internal to the company or whitelisted addresses from people's homes or something like that. So this is, yeah. So how do you whitelist an IP address? Don't they change? When you connect to a new network? What IP address are you talking about specifically? So I guess, as an example with that company, like you're working from home, and you reset your modem or router. Yes, and you would not have access to whatever system it is, and you'd have to tell somebody to whitelist the new IP address. Yeah, so IP address restriction, it's not great, it's not very user-friendly at all, but you're restricting the space of people who can try active password guessing attacks against your service.
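The IP whitelisting idea can be sketched with Python's standard `ipaddress` module. The networks below are placeholders (a private range standing in for an internal network and a documentation address standing in for one employee's home), and `may_reach_login` is a hypothetical name for a check that would run before the login function is ever reached.

```python
import ipaddress

# Assumed allowlist: an internal network plus one whitelisted home address.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.7/32"),
]

def may_reach_login(client_ip: str) -> bool:
    """Return True only if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in ALLOWED_NETWORKS)

print(may_reach_login("10.1.2.3"))       # True: inside the internal range
print(may_reach_login("198.51.100.20"))  # False: never gets to guess a password
```

If the home address changes, for example after a modem or router reset, that user is locked out until someone updates `ALLOWED_NETWORKS`, which is the usability cost raised in the discussion.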
Okay, so password-based authentication. This is the one we've been focusing on the most. In your computing, using computers, managing usernames, passwords, accounts, would you agree that this is the most common form of authentication that you come across as somebody who's used this for, I don't know, X number of years? It's the most widely used, and also the easiest for people to have; to change the system would be more complicated than just using password-based. Right, so let's break it down a little bit more, right? So you said there are flaws, but what are the flaws? Yeah. Oh, I wasn't going to list flaws, but it's easy to implement. Is that okay? It's easy to implement. Oh, it's easy to implement. Is that a good thing or a bad thing? Yeah, interesting, we'll look at some of the implementations there. So okay, easy to implement, a pro, yeah. Isn't all the research usually about how to attack passwords, how to break your passwords? Yeah, so there's a lot of that. I mean, in some sense you have the problem that Windows had back in the day, where it had something like 90% market share, so all of the malicious software was written for and targeted towards Windows. So people who ran Mac software and Apple software thought that their system was safe and secure, and it was really just because an attacker was not going to put the effort in to target their specific system. It wasn't until Macs got more popular that there started to be Mac viruses and that type of malicious software. So yeah, it's interesting that in some sense, because passwords are so dominant, they get attacked and defended a lot. But that definitely is something to worry about. Let's go to some other pros and cons. Yeah. I think passwords are a hassle. Passwords are a hassle? Yeah, it depends. Every site wants a different one. There are so many, you can forget them.
How many passwords do you have personally, if you're willing to share? I have like three or four passwords that I use, and I do a little variation for each of them, like adding things or whatever. Yeah, you don't have to tell us your specific algorithm. That's okay. But, okay, so yeah. Yeah, so a pro would be that it's convenient to share accounts. Does anybody actually do that? It's clear that that can happen very easily with an online service where the account is not necessarily tied to you, especially if it's a service where you create different profiles to use. I think a con is that, most of the time, passwords are human generated, so sometimes they can be predictable, or people just don't make good decisions. Right, so a con in terms of password strength, right? Because, let's say an adversary wants to, well, we talked about this, right? They want to break into maybe your account specifically, right? They could just try all the common passwords. They can check whether your password is just a single word; that's easily guessable with a dictionary. There are also lists, we'll talk about them in a second, of frequently used passwords, and so you can go in order from the most frequently to the least frequently used passwords and just guess those. So from a usability perspective, this puts a lot of burden on the users to generate a password that's difficult to guess, but that is also unique and different for every site. How many websites do you have passwords for? Over a hundred? Does anybody think they have significantly less than that? Yeah? 30? You have 30 web accounts? Only? I think most numbers are believable, but I probably wouldn't believe two. You've got your MyASU account and your CSE account, so you'd start from there. So what other things about passwords?
Okay, so does that mean people actually have a hundred different unique passwords that they've memorized, that are difficult to guess, for all their websites? Memorized? Memorized. Password managers? Yeah, okay, so we'll get there in a second, but, right? So that seems insane. What's the problem with reusing passwords, based on what we've seen and what we've talked about? Why do we give the advice that you should have a different password for all these accounts? Yeah, so reusing passwords, especially with the same associated account, means that if I ever guess your password for one site, now I've leveraged that to a hundred different websites. Right, and think about it: if you have accounts across the broad internet on a hundred different sites, do they all have good levels of security? I mean, probably not, right? I mean, yeah, exactly, a shrug. Some of them, hopefully, but even big companies get breached. Probably the worst part about a small website is that not only is it likely to get breached, but nobody's ever likely to notice or tell you about it, right? And if you're using the same password on all of them, now your attack surface on the internet is huge. All I need to do is break into one site that I know has an account for you and then steal that password. Now I can log into any website. Or, again, I think we've talked about this before, but what's the ultimate one? If you could only steal one of my accounts or websites, what would you want to steal? Think about that. Your email. Why email? Reset password. Reset password, right? With the reset password functionality, all the password reset links get sent to your email, and you click on them to create a new password on that website, right? So just by leveraging access to my email, you could then get access to probably 85 to 90% of those other websites.
The only ones that probably wouldn't work, I think, are the financial services. They don't let you change the password like that; you probably have to call and talk to somebody. But what kind of things are they gonna ask you when you call to change your password? So how does that work? Sorry, another way to phrase that is, do you give your username and password when you call a customer service representative at your bank? No. What do you give? Social Security, what else? Account number, maybe address. PIN number. Maybe a PIN number. Place of work. Yeah, they could try to verify whatever information they have. If you have access to somebody's email, what information do you have? I still don't know. Does anybody have their Social Security number in their email? No. No? Not in the email address. In your archive, in your email archive. Somewhere, have you ever sent or received an email that has your Social Security number or the last four digits of your social? Yeah, I mean, oftentimes when you get a job at a company, you need to give them your Social Security number, and that sometimes happens over email. Other things: taxes, or anything of that type. Do you have any tax returns? Your Social Security number is on your tax returns, so if you've ever stored that or sent it to yourself or done something like that in an email, it could be in there. It could be reimbursements or something. You went on a trip for some club, and then you had to send somebody a form with your Social Security number in order to get reimbursed, right? Yeah. Yeah, so then maybe in those other 95 websites, there may be somewhere else where you've stored that, maybe in your Google Drive or somewhere else, right? Similar things with, and I'm also thinking of, things you may not think about, like your driver's license number or your passport number, right? For similar reasons, you may actually have sent scans of those documents to somebody else over email.
And so that copy is now stored forever in your email. I'm also speaking from experience. I'm not trying to project. I know instances of all of these things that I have; I try not to do it, but sometimes it really sneaks up on you. And so there exist in my emails copies of all of this information. And it's not just you, there's the other person. How secure are their methods? Yeah, exactly, right? And that's just the emails that you sent and that the other side received. And I think when we get to network security we'll see that it's also possible that other people saw that information along the way. But it's nice how much stuff you can worry about at any one point. Okay, so, passwords are the most common. Any other pros and cons or strengths and flaws that we didn't bring up and talk about? I don't like how 100% of its worth relies on the password being confidential. It's pretty easy to trick people into giving you their password. Yeah, it's pretty easy to trick anybody. So how do you trick people into giving you their password? Just go up and ask them. Fake email, like a fake login page. Yeah, so a fake email that says there's a security warning on your Google account. Somebody's trying to break into your Google account. Click here to change your password, and then you click through. That takes you to a website that looks exactly like Google's login page. And then you type in your username and password, and they go, great, all right, you're all good. Scam calls, so people calling you, trying to scam you, robocalls. Yeah, the reason why these things happen is because they work, right? So in 2016, I wanna call the immigration here, but anyways, there's a number of these email phishing attacks that are high-value and actually work and compromise people. People fall victim to this kind of thing all the time. And all you need at that point, most of the time, is their password, right?
Once you've stolen my password, you're good and you can log in as well. You don't need a phone or a key or anything. Okay, so the pro is that as long as you know that password, you can go to any machine anywhere, have your username and password, and log on to that service. Is that also a con? Yeah. Yeah, it's a con because, like you said before, you can have software that's a keylogger that knows exactly what your password is. Yeah, like this machine right here, right? This Dell machine in front, I have no idea what's running on this system. If I were to log in here with my email username and password, bang, that's stolen by somebody and now they can get access to that, right? So it's a double-edged sword in some sense, and the flip side, and we've talked about this before, is forgetting your password, right? If you forget your password, then you're technically the only one that knew that password. So there must be some way to try to reset your password. If not, then you're just completely locked out of your account for all eternity, right? So I think, paraphrasing Winston Churchill, passwords are the worst form of authentication except for all those other forms that have been tried. I should tell you, the original quote was about democracy. So what are some other options? Maybe you've used or seen other options. What are some alternatives? Something using the unique characteristics of your body. Yeah, so like mobile phones, right? Smartphones have ways to do other types: fingerprints, face recognition, and scans. If you don't want to memorize that many passwords, you can use password managers, which will generate and store passwords. Yeah, that's not really changing the system though, right? How the user manages this insane system changes in some sense. What about, yes? Like login tokens? Login tokens, what does that mean?
Like, you know, a six-digit code that changes every 15 seconds? Yeah, but can you just log in with that? I feel like sometimes, right? Maybe, I don't know. Are you using a username and password and then the token? Right, so you need a username and password and then something else, so it has most of the worst properties of passwords and then makes it even more difficult to use? Yeah. Something provided for you, like a card, though that way it can be stolen. Interesting. Okay, yeah, that's actually a good one, like physical security authentication for a building, right? You have some card that you can use to authenticate. What about for your websites? Somebody mentioned Instagram. Well, okay, it's not a good example. Does anyone use login with Facebook or something, logging in with a Facebook account? Never, never use anything like that? Or login with Google, with your Google account, so what is that doing? Yeah, so it's interesting, right? It doesn't get rid of passwords entirely, but instead of you now creating 101 usernames and passwords that you need to remember, the site essentially trusts Facebook or Google or whoever to authenticate you, and then it trusts them when they say, hey, this is user X, Y, Z, and maybe you can give the site access to certain information or not, but at least then you don't have to create a new username and password for that service. So what do companies like Google and Facebook gain? They open their API to a lot of people that use the login feature. Ooh, what do you guys think? Why does Facebook create this feature? Yeah, okay. Say it louder. Yeah, so to acquire users, right? You're having thousands upon thousands of websites advertising for you, right? They have your brand there with a link to login with Facebook, yeah. Data. What was that? Data, data.
Data, what data, how do they get the data? Personal user data, because sometimes when you log in with a system like that, it'll ask whether this service can see your information, and you just say yes, which isn't good. Yeah, so Facebook is getting information from the service about what you're doing. Also, just that button, that login with Facebook button, is usually hosted by Facebook's servers. So when anyone visits that site, they make a request to Facebook's servers, which tells Facebook the IP address of the person that's accessing it and any cookies that Facebook has seen for that user in the past. So it actually allows Facebook to track your behavior beyond Facebook itself, right? When you're browsing just Facebook, they could, I don't think they do, but that's in the broad umbrella of data collection. Is it more reason to keep your Facebook account active, since the other accounts depend on it? Yeah, how do you delete your Facebook account if you're tied into thousands of websites? I was gonna say, on that topic, how do you disconnect? I mean, mostly, with certain services that you log in to with Google or Facebook, it's more that they're authenticating you, but then they're pulling the information from that and they give you the option to make your account with them using that information. Yeah, so it's interesting. It's user-friendly in some sense, right? You don't need to create a new username and password. That website can actually pull in information about you that you would otherwise have to give it to start a new account. And then they may or may not force you to create a username and password so that you can still log in even if Facebook goes away, right? So, yeah, all interesting, and it's another type of authentication, but you still have the problem that you need a password to authenticate to Facebook or Google.
Okay, so we talked a little bit about passwords. So why are passwords easy to guess? People what? People create simple passwords? Yeah. Why? Because most people aren't computer science students. We have a number of people that aren't used to constantly searching for software. Okay, yeah, so maybe another way to phrase it is that people maybe aren't aware of the risks of using an easy-to-guess password. So they use what's easy to remember. Yeah. So the tricky thing is that easy to guess and easy to remember are two different things, right? It's possible to create a password that's hard to guess but easy to remember. But for an average user, it's kind of a pain, right? Let's say for a Unix password, eight characters, you can generate a very secure password if it's eight random characters, right? Not just digits, but alphanumeric: uppercase, lowercase, and numbers, because then you're getting 62 to the eighth possibilities. You can increase the space of things to guess by randomly generating a password. But now you have this random piece of information that you have to memorize per website, right? So there's this kind of dichotomy, because you want a password that's difficult for other people to guess but easy for you to remember. And again, let's say you even created a random password that was 20 characters: are you going to create one for all the different sites, or are you just going to reuse that same one across all of them? Cool. Also, easy to snoop. So how do you snoop on some of these passwords? Look over their shoulder? Look over their shoulder, this is actually called a shoulder surfing attack. Right, so you just watch them as they type in their password. Did I tell you a story about this? So I worked for the Sacramento Sports Commission. Ooh, should I record this? That's fine.
And we put on the Olympic Trials in Sacramento, and my job was to help make badges for everyone. So you have badges with people's faces and what areas they can access, and the whole access control system is based on that. In the software that we were using, we could pre-define different access. So if you're an athlete, you get access to certain things. If you're a volunteer, you get access to certain things, which is obviously much less than an athlete, but if you're a volunteer who needs to work with the athletes or in a certain area, you need to alter what areas they can access. So in the software that we were using to create these badges, you had to put in a password if you wanted to change the default access, which is fine, that sounds great. The problem is we have volunteers creating these badges. So you have people you don't trust creating these badges for other volunteers. And this password field was not hidden. So what does that mean? When you typed in the password, it would show up in plain text, exactly what you typed in, and then you could change things and submit. So if there's a volunteer right here, you would have to tell them to turn away as you typed in the password and make sure nobody looked at it. So somebody said, well, that's crazy. There's no way we could do that. So what would be a secure alternative? Yeah, so somebody decided our password would be five asterisks. So as you type it in, you have to fake typing other letters. Because otherwise it's very clear to somebody who's shoulder surfing, or, and this actually happened to me one time, you mistype it. So I see asterisk, asterisk, AB, asterisk. So I can see that it's not hiding your password at all, it's just a way to get around this terrible password field. So snooping, all this kind of stuff, is crazy. Passwords are also, as we talked about, easy to lose or forget. And there's no control on sharing.
It's easy to social engineer, right? And these are things that are inherent to passwords themselves, right? Then there are a lot of practical vulnerabilities that they're susceptible to. So it's easy for other people to snoop as you're typing it in. Your password can also be easily visible depending on the technology. We talked a little bit in crypto about HTTPS, which secures and encrypts communication with a web server. So if you're using a website that's not using HTTPS and you are typing in your password on an unsecured wireless network, like at a coffee shop, everyone with a radio can sniff your traffic and see exactly what password you're sending to that site. So even if you're typing in a great password, they can still do that. Yeah. They should? I would say so, yeah. Not every company does; it's getting better and better, and more and more websites are. The problem is, there are other ways where it's easy to do the wrong thing. Sometimes you can misconfigure your email client. Google's good about this, but let's say it's some random company's email server: you can easily set it up to authenticate to that email server over plain text, essentially. There are tools you can download that will sniff wireless traffic. Facebook actually used to only have HTTPS on its login page. So when you logged in and typed your password, it was secure, but then every other page after that was HTTP. The problem is that in order for Facebook to identify you and link your sessions, you needed a cookie value that specified who you were, and that was sent over plain text.
There was a tool that somebody developed called Firesheep that would listen to a Wi-Fi network and show you all of the cookies that it stole and whose accounts those were, and allow you to just pop up a new Firefox window logged in as that specific user account. And Facebook fixed this in like three days; they moved all their pages to HTTPS. And it was a big problem on college campuses because most of them had unsecured wireless networks, so it was a bad combination. Passwords are also susceptible to replay attacks, as we talked about. This was when we went over, last week, this idea of whether we hash a password before we send it to the server, right? Well, as long as somebody gets our hash in that case, they can replay it to the server and get access. Similar things here: password reuse really requires proactive management on the part of the user, right? We're forcing a lot of security decisions onto the users. And we'll look into this. So one way of attacking the authentication of a service is just a dictionary attack. And we can attack essentially every single password-based authentication system with this. We just try different values from a dictionary. Somebody mentioned pet names. We could get the list of the most common pet names in the United States and try those, right? Again, it depends; you can think of different scenarios. We can just try guesses on the website. We can also, if we have the complementary information, try to brute force a hash or check guesses against a hash. In this case, essentially, rather than brute forcing, where we're just trying all characters and all inputs, we're now trying specific values, like a targeted attack. So is it possible to search all possible passwords like this? Some of those passwords aren't dictionary words. Right, so think of it in terms of sets, right?
The set of all possible passwords is very, very large. Right, and that's dependent on the specific service. We saw on Unix that passwords were eight characters. Were there any other restrictions? Oh, you could technically have a password that had null characters and crazy stuff that you can never actually type in. You'd have to figure out how to get those in. But you theoretically could do it, right? What's a more typical password space? Yeah, most passwords these days require you to include a dash or numeric values or something called a special character. Yeah, so depending on the service, right, it may require a dash or a special character or an uppercase character instead of just lowercase characters. So if you only have lowercase characters, what's the size of your set? You have 26, right? 26 possible characters for each position. Now what if you add uppercase? You just doubled it, and then add on zero through nine. You've added 10 more possibilities, then add in spaces and special characters, right? You've increased the search space, well, not necessarily significantly, but you've definitely increased it, right? So maybe it's difficult to search all possible passwords, but it's much easier to search likely passwords, right? Again, this is where we're using the fact that humans are choosing these passwords. And we can perform a dictionary attack, as we mentioned, offline: we know the function that's being used and we have the hash, the complementary information, C. We can repeatedly try different guesses. There are super cool open source tools that you can use for this, like Crack or John the Ripper, that you can use to break these hashes and brute force them. They require some tweaking. You have to tell them what space of passwords you're trying to guess. So you're going to guess, say, three-character passwords that are just lowercase and uppercase, right?
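To put rough numbers on how the alphabet changes the search space, here is a small sketch. The function name `search_space` and the choice of exactly eight characters are illustrative assumptions; they are not taken from any particular cracking tool.

```python
import string


def search_space(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


LENGTH = 8  # classic Unix passwords were at most eight characters

# Hypothetical alphabets, from most to least restrictive.
alphabets = {
    "lowercase only": string.ascii_lowercase,
    "lower + upper": string.ascii_letters,
    "lower + upper + digits": string.ascii_letters + string.digits,
    "letters, digits, punctuation": (
        string.ascii_letters + string.digits + string.punctuation
    ),
}

for name, chars in alphabets.items():
    size = search_space(len(chars), LENGTH)
    print(f"{name:30s} {len(chars):3d} symbols -> {size:,} passwords")
```

Even the largest of these spaces says nothing about which passwords people actually pick, which is why trying likely guesses first is so much cheaper than exhausting the space.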
That could be one thing. So that would be an offline attack, as opposed to an online attack, where we're actually trying guesses against the service, right? And this is what we talked about with a website. So, let's say a website is real secure and if you put in an incorrect password, let's say three times in a row, it completely locks your account. Can you perform a dictionary attack against this website? You can just keep switching accounts. Yeah, so no, in some sense, if you're trying to break into one user's account. You can only try two or three passwords, technically, and then the account is locked. But what can you do? Just keep switching accounts and hope that someone uses one of those passwords. Yeah, so especially if this website gives me information on whether a user account exists or not, I can enumerate all user accounts and then try the most popular password that matches their security requirements on every single account. That'll likely get me five or 10% of the accounts, and then I still have a chance to try the second most common password on all the accounts. I never got to play with it, but I bought one of these forensic boxes that would brute force iPhone PINs, and the way it would work is that the counter for how many times you tried, or the lock, was stored essentially in memory and wasn't persistent. So you could just try, I think, 20 passcodes and then kill the power to the phone and reboot it. So if you think of a four-digit passcode that's just zero through nine, it only takes like two days for a hardware device to crack these kinds of things. Since then the iPhone's gotten a lot better. They have a secure enclave now, a separate chip that you physically can't get into, and the counter is persistent there, so attackers had to step up their game a lot. Yeah, if you can get around the technical requirements then absolutely.
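The "keep switching accounts" idea can be sketched with a toy service. Everything here is hypothetical: the `LockoutService` class, the three-strike limit, and the account data are made up for illustration, and the toy stores plaintext passwords only to keep the example short.

```python
from collections import defaultdict

MAX_FAILURES = 3  # assumed policy: lock after three wrong guesses


class LockoutService:
    """Toy login service that locks an account after MAX_FAILURES failures."""

    def __init__(self, accounts):
        self._passwords = dict(accounts)  # plaintext only because this is a toy
        self._failures = defaultdict(int)

    def login(self, user, password):
        if self._failures[user] >= MAX_FAILURES:
            return False  # account is locked, even for the right password
        if self._passwords.get(user) == password:
            self._failures[user] = 0
            return True
        self._failures[user] += 1
        return False


def spray(service, users, common_passwords):
    """Try a few common passwords on every account without locking any."""
    cracked = {}
    # Stay one guess below the lockout threshold for each account.
    for password in common_passwords[: MAX_FAILURES - 1]:
        for user in users:
            if user not in cracked and service.login(user, password):
                cracked[user] = password
    return cracked


accounts = {"alice": "123456", "bob": "correct horse", "carol": "password"}
service = LockoutService(accounts)
print(spray(service, list(accounts), ["123456", "password", "qwerty"]))
```

No single account ever sees more than two wrong guesses, so the per-account lockout never triggers, yet the weak passwords still fall.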
So for instance, the other thing that websites do is restrict how many times you can try based on IP address. So in that case, rather than trying from one machine over and over, the attackers either use AWS or, the criminals, will use a botnet of machines that they got access to, which are real people's machines, and they'll just try guesses from each of those machines. So it never goes above some threshold for detection. Yeah, so if you guess the easy password across everyone's accounts, couldn't the company you're trying to log in to see these multiple logins and realize someone's trying to break into the network, and notify people? How many notifications are you willing to get? Right, anybody run a server? Oh, I don't think we have time, but I can maybe show you my server logs of people trying to log in over SSH to our submission server. I'm sure maybe some of them are you trying to do that, but you can see there's just constant background noise of people guessing passwords. So the answer is yes, there's a ton of work in trying to detect that, but it's about separating out the noise. Normally what happens is they're not going to tell you; they're going to alert a security person who looks into it, and then if you need to change your password they'll contact you. So password guessing, how do we prevent this? Yeah. Logging to an old example, like we'll say like and you're saying in from me you can choose that right with me. Yeah, so maybe you keep track of other information about what the user is using when they log in. Again, we could try IP address, we could restrict based on that. We have to assume that our attackers are very smart, and again, if they've stolen our hash then they don't need any of that at all, right? They can do it completely offline. So one thing we could do is just not let them access the complementary information.
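Once the complementary information leaks, guessing moves offline and none of the rate limits apply. A minimal sketch of that offline dictionary attack follows, assuming an unsalted SHA-256 hash as the stand-in for the system's function; the names `hash_password` and `dictionary_attack` and the word list are hypothetical.

```python
import hashlib


def hash_password(password: str) -> str:
    # Stand-in for the system's hash function; SHA-256 is an assumption.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, candidates):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidates:
        if hash_password(guess) == target_hash:
            return guess
    return None


# Hypothetical word list, e.g. common pet names.
pet_names = ["bella", "max", "luna", "charlie"]

stolen_hash = hash_password("luna")  # pretend this came from a leaked database
print(dictionary_attack(stolen_hash, pet_names))
```

The attacker never talks to the server here, so lockouts, IP limits, and alerts never get a chance to fire.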
Therefore you can force all guesses to be online. The problem is this is very hard to guarantee. What else? Has anybody mistyped their password logging into their machine? What happens? It tells you to try again, and then if you get it wrong again? Try again. How long does it take in between guesses? Longer and longer? Is that because the hash gets more and more difficult to compute? No, the hash function is usually very quick. I mean, if you're using SHA-256 or MD5, the hash function is very quick. This is the login function intentionally adding an exponentially longer and longer delay, so that somebody who's trying to break into your system and guess your password will be delayed, right? So this is one of the things that we talked about. Let's add a delay to the login function when the password's incorrect. That should inconvenience real users very little but makes dictionary attacks take longer; it doesn't fundamentally prevent them but makes them take longer, yeah. That's the way to do it. Someone, when I was a kid, would get locked out for like 10 minutes or longer, right? Yeah. Yes, I mean, that's definitely something. I have a terrible story of a friend who did that to somebody else to try to mess with them and didn't realize the corporate policy on the iPhone meant that the third attempt would wipe the device. So, yeah, the device just started wiping and they're like, oh God. But the company made that decision, right? That's to say, rather than let somebody brute force and get access to this device, we'd rather wipe it. For a common brute-forcer, if you can slow them down on the order of minutes, and you probably have a cap, right? Potentially, depending on how you do this. But the key problem here is, as we said, it's very hard; we can't guarantee that our hashes or our complementary information stay secure.
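The growing delay on failed logins can be sketched as follows. The constants `BASE_DELAY` and `MAX_DELAY`, the doubling schedule, and the function names are assumptions for illustration; real login programs pick their own schedule. The toy compares plaintext strings only to keep the focus on the delay.

```python
import hmac
import time

BASE_DELAY = 1.0   # seconds after the first failure (assumed)
MAX_DELAY = 600.0  # cap, so a real user is never locked out forever (assumed)


def login_delay(failed_attempts: int) -> float:
    """Delay before the next attempt: doubles per failure, up to a cap."""
    if failed_attempts <= 0:
        return 0.0
    return min(BASE_DELAY * 2 ** (failed_attempts - 1), MAX_DELAY)


def attempt_login(stored, supplied, failed_attempts, sleep=time.sleep):
    """Return (success, new_failed_attempts), pausing first after failures."""
    sleep(login_delay(failed_attempts))
    # Toy comparison of plaintext values; a real system compares hashes.
    if hmac.compare_digest(stored.encode(), supplied.encode()):
        return True, 0
    return False, failed_attempts + 1


for failures in range(1, 7):
    print(f"after {failures} failures: wait {login_delay(failures):.0f}s")
```

A legitimate user who mistypes once waits a second; someone running a dictionary against the login prompt quickly hits the cap on every guess. As the lecture notes, none of this helps once the hashes themselves are stolen.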
They may be released, and when they're released, this delay we've added to L doesn't matter at all, because the attackers can do whatever they want. So what can we do to get around that? Yeah. In the case of offline dictionary attacks, there are hashing algorithms that are intentionally designed to be slow, like bcrypt, I think it's called, so it's just hard to iterate through all the possibilities. Yeah, so that's exactly it. What if we just increase the time to compute the hash function, right? We could use a different hash function that's designed for this. We could just apply SHA-256 10,000 times or a million times, just keep feeding the output back into itself, right? That would do something similar. Now, if somebody steals our password hashes and each guess takes them 30 seconds, then it's much more difficult to brute force. But if each guess takes them 30 seconds, how long is it going to take a user to log in? At least 30 seconds, because we need to take their password and compute that hash, right? So usually, these algorithms let you tune it, so it's on the order of 300 to 500 milliseconds, somewhere around network latency, so you're not going to notice. But MD5 and SHA-256 can be implemented insanely fast in hardware. They have systems to do that. One other attack, and I'm not going to go into the details here, but rainbow tables are insanely cool, so if you're interested in this password cracking stuff, I suggest you look this up. The essential idea is, well, I know what standard passwords are. And I know what standard hash functions are used for logging in, so why not just precompute, right? So what do I need to store in that case? What was it? I said the hash value. Yeah, the password that I'm guessing and its hash. And then I go through the leaked data set, and I take each hash and compare it to the password hashes I already have, and if I find it, I know the password, right? So this is essentially what rainbow tables do.
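The precompute-then-look-up idea can be sketched with a plain dictionary. This is a simple lookup table, not a real rainbow table (rainbow tables use hash chains to trade time for space); the hash function, the candidate list, and the leaked data are all assumptions made up for illustration.

```python
import hashlib


def fast_hash(password: str) -> str:
    # Unsalted SHA-256 as a stand-in for a "standard" login hash (assumption).
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def build_table(candidates):
    """Precompute hash -> password once; reuse it against every leak."""
    return {fast_hash(p): p for p in candidates}


def crack(leaked_hashes, table):
    """Look up each leaked hash; no hashing happens at attack time."""
    return {
        user: table[digest]
        for user, digest in leaked_hashes.items()
        if digest in table
    }


common = ["123456", "password", "qwerty", "111111"]
table = build_table(common)

# Hypothetical leaked (user -> hash) data.
leaked = {
    "alice": fast_hash("qwerty"),
    "bob": fast_hash("a long random passphrase"),
}
print(crack(leaked, table))
```

The cost of hashing is paid once up front, and each later lookup is a dictionary access, which is the time-for-space trade described above.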
They precompute some portion of the space. I'm not going to go into the details of how they work, but essentially they allow you to trade off the time it takes to crack against the space required. The space requirements are very large. So for instance, you can download them, and these are actually one of the best uses of torrents. You can get a rainbow table for MD5 that covers all one-through-eight-character alphanumeric values in 127 gigabytes. And if you want to extend that to all one-through-nine-character alphanumeric values, it's 690 gigabytes, right? So, super cool attacking technique. Again, okay, we actually talked about this before, so I won't spend long getting there, but here you have these space requirements, you've precomputed this, but what if the authentication system is adding a salt to the hash, right? The salt, as we talked about, is a random value that's unique per user, and that information combined with the password gets you the hash, right? So the table doesn't help me here, because it's just all one-through-eight-character alphanumeric values that I have MD5 hashes for. It does not take any salt into account. That way each password hash is unique, so even if we have the same password, because we have different salts the hashes come out different. And essentially we can think of it as selecting a different function for every user, and for rainbow tables it means that now we'd have to generate a different rainbow table for every possible salt value, and that just gets insane. The other way we can go about solving this problem is slow hashes, so this is a hash that has the cryptographic properties that we want. We never want to give those up, right? If it's easily reversible, we don't want that. So something like bcrypt is designed to be a slow hash.
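Here is a minimal sketch of salting combined with a deliberately slow hash. The lecture names bcrypt; since bcrypt is not in Python's standard library, this example swaps in PBKDF2 via `hashlib.pbkdf2_hmac`, which is a different algorithm with the same tunable-cost idea. The iteration count, salt length, and function names are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # tunable cost; assumed value, adjust per hardware


def make_record(password: str, salt: bytes = None):
    """Return (salt, digest) for storage; a fresh random salt per user."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, candidate = make_record(password, salt)
    return hmac.compare_digest(candidate, digest)


# Two users with the same password end up with different stored values.
salt_a, digest_a = make_record("123456")
salt_b, digest_b = make_record("123456")
print(digest_a.hex() == digest_b.hex())
print(verify("123456", salt_a, digest_a), verify("1234567", salt_a, digest_a))
```

Because the salt is an input to the hash, a table precomputed without it matches nothing, and the iteration count makes every offline guess cost roughly as much as a real login.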
It's actually used on the submission server, so I could give you my password hash for the submission server, and, I mean, each hash takes about 300 milliseconds on average to compute, so you'll definitely not be able to break mine. It also means I can't break your passwords. So for those of you that have asked to get your password on the submission server, this is why I do not have it. I can't give you your password. I can change it so that you now have a different password that I create. Other functions are interesting too, so there's a whole area of research in slow hashes. scrypt is one that is designed to take up memory. A lot of these hash functions take very little memory, so they can be easily implemented in hardware, even on something like an FPGA. But scrypt is designed to take maybe 100 megabytes, I don't know what the exact numbers are, in order to compute it, and it's difficult to build that much memory that's also very fast. Okay, we talked about this, cool. So we're going to go over an example of password reuse. Again, this is what we talked about. 3.5 million gano accounts, usernames and hashes, were released in 2013. There were half a million Adult Friend Finder accounts released in 2016. I haven't updated this; there are constantly breaches. The Adobe breach was super interesting because they had user IDs, they had email addresses, they had password data that was base64 encoded, and then they had the password hint. The password hint of course needs to be shown to users, so it can't be hashed; it needs to be stored essentially in plain text. So you see that hint in the corner, it says "try: qwerty123". What do you think that user's password is? And so people looked at this and actually tried to figure out what this password data was, because it clearly wasn't a hash: the password data got longer and longer, it wasn't all a fixed size, and we'd expect a hash to all be a fixed size.
So here you can see that by using the password hints they were able to reverse engineer what the algorithm was. They were able to look at the different password lengths and figure out that essentially Adobe was not actually hashing the passwords, they were encrypting the passwords, so those were the passwords encrypted with either AES or DES or something like that. And what they found by reverse engineering this, oh sorry, they used a different data set, that's right, they used the top passwords from some other data set to map them here and find out that, yep, all of these were the password 123456, not a great password; 12345678, I think that's like twice as good; password, qwerty, 111111. Okay, and we don't have much time, all right, let's finish this up, one, two, three, oh it's cold. If your password is one of these, please change it.
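The reason the hints were so damaging can be sketched in a few lines: when passwords are encrypted deterministically, without a per-user salt, identical passwords give identical password data, so every user with the same blob pools their hints. The records below are entirely made up (placeholder blobs and hints), and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Entirely hypothetical records: (user_id, password_data, hint).
records = [
    ("u1", "BLOB_A", "numbers"),
    ("u2", "BLOB_A", "1 to 6"),
    ("u3", "BLOB_B", "keyboard top row"),
    ("u4", "BLOB_A", "123456"),
    ("u5", "BLOB_B", "qwerty"),
    ("u6", "BLOB_C", "my dog"),
]


def most_common_blobs(rows, n):
    """The most frequent ciphertexts likely hide the most common passwords."""
    return Counter(blob for _, blob, _ in rows).most_common(n)


def hints_by_blob(rows):
    """Pool every hint written by users who share the same ciphertext."""
    pooled = defaultdict(list)
    for _, blob, hint in rows:
        pooled[blob].append(hint)
    return dict(pooled)


pooled = hints_by_blob(records)
for blob, count in most_common_blobs(records, 2):
    print(blob, count, pooled[blob])
```

One careless hint among thousands of users sharing a blob is enough to reveal the password for all of them, without ever recovering the encryption key.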