In terms of selecting passwords, we mentioned a few schemes yesterday, so I think you can come up with your own schemes for selecting good passwords. But you need to consider who you're trying to protect against. Maybe you don't want your friend to guess your password, or an ex-friend of any type who wants to get access to your account. Then you want to make sure that they cannot guess the password given that they know something about you. So that's one type of protection that you need, hopefully not so common.

Another type is protecting against people who, again, know something about you. They may not be so close: other students, other people who can collect information and are targeting you. So again, they can find information about you to try and guess the password.

The other is one we'll see after we go through storing passwords: in some attacks it's possible for people to try and guess your password without knowing anything about you. They're not really targeting you; they're targeting one of many possible people and you're in that set. Therefore they may use different techniques to try and guess your password, that is, they may try all possible passwords. We'll see that we need special ways to store passwords to make such an attack difficult.

So, our picture that we introduced yesterday. Remember, we have our user and the system that we want to log into. The system stores the registered ID and password in some database. When the user wants to log in, they submit their ID and password, and they get a response.
This system could be communicating with our user across a network, like accessing a website or logging into a computer remotely. Or of course it could be the same computer: maybe the user is logging in directly to this computer here. That's the case when you log into your own laptop, for example. You sometimes need to supply a username and password on your own computer. How does that work?

So, some examples before we look at storing. You'll see a zoomed-in version soon, as it's a bit hard to see. Here I'm running a computer in VirtualBox on my laptop, and I am presented with a login screen. I think many of you have seen a similar login screen, so let's try and log in. Of course I need to know the username, the ID, and I know that for this one the username is network. So I'll type in network and press enter, and I am prompted for a password. I'll type something in and press enter, then there's some delay, and then it says "login incorrect". This is just an example of the feedback that you may get when you have an incorrect login.

I'll do it again and type the wrong password, and just notice how slow it is. It's obviously the wrong password; I know the actual password. I press enter now, and it takes about five seconds before it gives me a new attempt. That's not because my computer is slow. That's because the login software is implemented such that it introduces a delay. What happened? It's gone back to the start. I'll show it again: wrong password, I press enter now, and then it says "login incorrect". So that delay is a security mechanism. Why do we have a delay there? From when I press enter for the password until I can try again, why was there a delay?
No, the system is quite fast. My computer is fast; it will check within a few milliseconds. But you can notice there was a delay of a few seconds there. That's a security mechanism: it makes it difficult for an attacker to make many attempts quickly. One attack is to try many different passwords. You type very quickly: try the network login and a password, and if that doesn't work try another password, and keep trying. You could even write software to try it on your behalf. A common countermeasure for such an attack is for the login system to have some delay between attempts (it just reset there). For example, the five-second delay means that you can only make an attempt every five seconds. That makes a brute force attack on the passwords very difficult, because even though my computer could handle maybe thousands or even millions of attempts per second, this delay means you can only do one per five seconds. So that's a security measure.

Let's try just a couple of other things. All right, I typed the wrong password and there's a delay. What feedback does it give me?
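To put rough numbers on that delay, here is a minimal sketch in Python. The function names, the `verify` callback and the one-day window are assumptions made for illustration; this is not how the real login program is implemented.

```python
import time


def max_guesses(window_seconds, delay_seconds):
    """Upper bound on guesses in a time window when each attempt costs a fixed delay."""
    return int(window_seconds // delay_seconds)


def throttled_login(verify, user_id, password, delay_seconds=5.0, sleep=time.sleep):
    """Check one login attempt; a failed attempt waits before the caller can retry.

    `verify` is a hypothetical callback that returns True for a valid
    ID/password pair. How it stores passwords is outside this sketch.
    """
    if verify(user_id, password):
        return True
    sleep(delay_seconds)  # the deliberate delay seen in the demo
    return False


if __name__ == "__main__":
    one_day = 24 * 60 * 60
    print(max_guesses(one_day, 5))  # guesses per day with a 5-second delay
    print(one_day * 1_000_000)      # guesses per day at a million per second, no delay
```

With a five-second delay an attacker gets at most 17,280 guesses in a day, against about 86.4 billion at a million guesses per second with no delay.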
It says "login incorrect". Let's try another one. I deliberately used the wrong username, and it still says "login incorrect". It doesn't say "password incorrect", it doesn't say "username incorrect". "Login incorrect" just means something's wrong. The feedback you get when you do something wrong should not reveal whether it's the wrong username or the wrong password, because that leaks information. So the feedback is important here. Now again, that's a convenience trade-off: some systems may give specific feedback saying your username is incorrect, to make it easier for you, but that's less secure because then people can discover which usernames exist.

We'll get rid of that, because it's a bit hard to see, and I'll log in on the terminal so it's easier to see. I'm actually logged into my virtual node on my computer. Once we've logged in, we want to focus on where the password is stored. When you create an account you create an ID and a password. Where is the password stored on a Linux or Unix system? What's the specific file? passwd is one file; shadow is the other. On Unix or Linux systems, the standard place to store the information about passwords is in a text file. You don't need to know the name, but most of you who have done the lab have seen examples of it. We'll have a quick look and return to it later.

There's a file called passwd which stores one line per user. It wraps around a bit, but it stores the username plus other information. So there's my network user and some information about that user; that's the ID. This file doesn't store the password information. As we said before, there's another file, the shadow file, that stores the password information, and the normal user doesn't have permission to access that. This is again a security mechanism. Our database, this database here, should be such
that the users of this system shouldn't be able to access that database, because it stores sensitive information: other users' passwords. So there is some access control on that database such that not just any user can read it.

If I switch to the root user, the administrator of this computer, whose password I know, and zoom out a bit, we can look at the shadow file. Now, on this system all the students in this class have an account, so don't look at other people's passwords when we see this, or don't remember them. This is the shadow file, and it has an entry for each user. We'll just select some. It looks confusing, but there is some structure there. We'll see some details later, but for example the network user at the top: that's the username, and then the next field is some information about the password. We'll zoom in and see a more detailed example, but all of that is information about the password; it's not the password. We don't store the password in the clear in the database. We apply some operation to it so that it's much harder for someone to recover, even if they can see this file. So now you can see this file, but it's still hard for you to find other people's passwords. That's another protection mechanism on the password database: you should store the password in a form such that even if someone does get the database, they still can't find your password. That's what we'll focus on today: how to store that, and what this long sequence of characters means. It's actually split into three parts, one, two and three, and we'll go through the lecture today and come back to this at the end to see what those three parts are.

So there are at least two levels of security on the database. One: protect the database file or files from the normal users, so that users without permission cannot access it. Two: even if something goes wrong and a user without permission does access the database, store the data such that they still can't get the password out of it.
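The first of those two levels can be checked with a few lines of Python. This is a minimal sketch: the helper name `readable_by_others` is made up for illustration, the two paths are the conventional Linux locations, and the exact permissions vary between distributions.

```python
import os
import stat


def readable_by_others(path):
    """Return True if the 'other users' read permission bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Conventional locations on Linux; skipped if they do not exist here.
    for path in ("/etc/passwd", "/etc/shadow"):
        if os.path.exists(path):
            print(path, "readable by others:", readable_by_others(path))
```

On a typical Linux system one would expect the passwd file to be readable by everyone and the shadow file not to be. The sketch only inspects the classic permission bits, not any additional access control lists.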
This is multiple levels of security, so let's see how we do that. We'll go through some different options for storing the password and hopefully at the end arrive at how we do it in practice.

We assume that the ID and password are registered with the system in some initial step: when the user creates their account, or when the system is set up so that everyone is allocated a user ID and a password. That, plus other information, maybe the user's full name and some characteristics of the login account, is stored on the system in a file or database. In our Linux system it's just a plain text file, but in other systems it may be a more complex database. When the user wants to access the system they submit their ID and password, and the computer system compares the submitted values against the stored values. If the user ID matches and the password matches then they are authenticated; if not, they usually get an error message. So how do we store the passwords on the system? That's what we want to focus on. We'll look at three or four different options and the drawbacks and advantages of each.

First one: store the ID, the username, and the password P in the clear. In the clear means no form of encryption, no special operations: just have a file, for example, that has the username in one column and the password of that user in the second column. That's got some obvious problems. Let's say the shadow file that we saw before just had the usernames and all the users' passwords. There are potential attacks on that. There can be an insider attack, where a normal user gets to read that database. For example, if I showed you the database then you would see the passwords of other users. Say we have a computer system which has many users, for example the ICT server where you have Moodle accounts. If the password is stored in the clear like this, then anyone who has access to that database can see
everyone else's passwords, and that of course can be a problem. How do we stop it? Have some access control on the database which limits who can read it. We see that with the shadow file: the normal user cannot read the shadow file on a Linux system. That's access control. Just to remind you of that: I'm currently logged in as a user called network, and I try to read the database, which is just a text file of password information. This is the access control working, saying that you, as a normal user, don't have permission to read that database. So that's how we try to stop this insider attack of a normal user reading the database, especially if the password is stored in the clear.

Now, there's always an administrator of the computer where the database is stored, someone who has full privileges. Even though the normal user cannot read it, we cannot prevent an administrator from reading it. In our example it turns out I am also the administrator of this computer, so I know the root user's password. I'm now the administrator user, and if I want to look at that file, can I? Yes, I can. We can't prevent that: there's always going to be at least one person who can see this database. How do we stop that? We can't. There's really no way to prevent the person who runs that computer from looking at the contents of that file. They have full control over that computer and can do what they like with it. So in effect the administrators of a computer system where you've registered passwords must be trusted. We don't have technical means, in most cases, to stop them from seeing your password.

That leads to an issue, especially nowadays with websites. Think of all the websites that you have a password on, and then assume all the people who run those websites can see your password. That's a potential attack, in that they can reuse your password on other systems. So that leads to the suggestion that you shouldn't reuse
your password across different systems. You should have a different password on each system, so that, for example, the administrator of one system cannot reuse the password on another system.

These are insider attacks, and often we find it very hard to protect against insider attacks. Most often we want to protect against attacks from outsiders: people who are not normal users, who are outside our network or outside our computer system. So how can an outsider see my list of passwords? Well, if they can get unauthorized access to the computer system through other means and access the database, and our passwords are stored in the clear, then that unauthorized user can see all the passwords. We will not say how they get that unauthorized access, but if some other security flaw means that someone can log into our computer system and read our database, then that person can see all of our passwords, and that's bad.

Yesterday I showed you some statistics about passwords, and these were from leaked passwords. That is, these statistics were gathered after some unauthorized user accessed some website or computer system, effectively stole the database of passwords, and then published it on the internet for everyone to see. So when we say 300,000 leaked passwords, that means there was some database that someone somehow gained access to and revealed to everyone else. There are different ways to do that, but a good security principle is to assume that it may happen, that things may go wrong. If someone does access this database, we don't want them to be able to see the passwords, which suggests that we should not store the passwords in the clear. Even if someone can read the database, we want to make sure that they cannot read the passwords. So normally we don't store the password just as is in that password database. What can we do if we can't store the password as plain text, in the clear? What can we
do? We can encrypt it. Okay, so that's an obvious solution: if we encrypt it and someone gets access to the database, then they can only see the encrypted password, not the plain text password. That's this solution. This top line indicates what we store in our database: the ID of a user followed by their password P, not in plain text; instead we encrypt that password using some secret key and store the encrypted form. So if someone gets the database, they need the key to be able to read the password. That stops the problem of the passwords being revealed if someone accesses the database. What happens now is that when the user wants to log in, they submit their password, and the system must encrypt it using the secret key and compare the result to the stored value.

One problem with this approach: the secret key must be stored somewhere on our computer system. We cannot ask the administrator to enter the secret key every time someone wants to log in, so the key itself must be stored somewhere. If the attacker can access the computer system and read the database, then it's likely that they can also find the key, because the database is stored on the computer system and so is the key. Whether it's in a file or in memory, there's potential for the attacker to find the key, and if the attacker finds the key then they can find all the passwords and we haven't gained anything. For this approach to work we need to somehow keep the key separate from the database, and that sometimes makes implementation difficult.

So the aim: even if someone gets the database, don't let them find the passwords. One approach to that is to encrypt the passwords, but the drawback is that we need to make sure it's hard for an attacker to find the key, given that we must store the key somewhere so that
software can automatically decrypt or check the passwords. Any questions about that approach? It can work, but it has this problem that the key must be stored somewhere. Where should we store the key? In the database? All right, then if the attacker finds the database they now have the key. We have to use some method to deceive the attacker? Okay, what method can we deceive the attacker with? Reverse it, write it in the opposite direction? Okay, but our attacker is quite smart, it's one of our students here, and they know those tricks. They just try the password in the opposite direction. Even if we do have such an algorithm to reverse it, remember this must be done in software.

So the idea is, let's see if we can draw it. In this case, what's stored in our system is some user ID and the encrypted password. Instead of the actual password, we store the password encrypted with some key, and our system must also store the key somewhere, otherwise we would not be able to check the password when someone tries to log in. When someone does try to log in, they submit their ID and password, sending them to our system, and now the system must check them. How does it check that the password is correct? Well, the computer system has some software that takes the key and the supplied password and encrypts the supplied password, and the value that it gets it compares to the database. If they match, everything's okay, because if the passwords are the same, the encrypted form, the ciphertext, should be the same. Alternatively we could decrypt the stored value and compare the passwords; we'd get the same result. But importantly, this must be done in software. We must have some implementation that does this for us, and to do so it needs the key. So the software running on this computer system must know how to find the key and do this operation. If an attacker can get access to this system and to the database, then it's highly likely they can also get access
to the key and the algorithm for checking the password, so we don't gain much there. Even if we have some algorithm to hide the password, the software that checks the submitted password must be aware of that algorithm, so if the attacker can find the algorithm then they can easily find the reverse. Depending upon the key is a bit of a problem here. It can be used in some cases, if we can keep the key separate or somehow secure, maybe on a different computer system, but it adds more inconvenience for the implementation.

So there's another approach: hash the password, don't encrypt it. Instead of using a secret key, we take a hash of the password using a hash function and store the hash value; we don't store the password. Then when someone logs in and submits their password, we compare the hash values. We'll draw that. This was using the encrypted password; the next approach uses the hash. What we store is the ID of the user, which is public, and the hash of the password. There's no key involved, so there's no need to store a key. Now when the user logs in, again they submit their ID and password, and the system does the check. How does it check? It takes the hash of the submitted password and compares it to the stored hash value. Remember our properties of hash functions: if the submitted password and the registered password are the same, then the hashes of those two values will be the same. So if the hash values match, we accept the user, because it implies the passwords match. If the hash values are different, it implies the passwords are different, that is, the submitted one does not match the one that was registered.

This is the more common approach for storing passwords, or at least practice is based upon this. We store a hash of the password, and then on login we compare the hash of the submitted password, the blue one, with the stored hash value. We
don't actually store the password. What if our attacker finds our database? What does the attacker know? If they can access our system and know the database, then they know the list of IDs (say we have many users, so there's not just one ID, there are many) and they know the list of hash values. Do they know the passwords? Where are the passwords? They're not there. The passwords are not stored in our database; the hashes of the passwords are. And one of our properties of hash functions is that they should be one-way: if you know the hash value, it should be hard to find the original input, the original password in this case. So if an attacker does get access to the database, they do learn the hash values, but it should be hard for them to go back and find what the passwords were that generated these hash values.

That's because of some of the properties of hash functions that we introduced. The one-way property (it's on the slide, but we'll write it) means it's hard to find P given H(P). The attacker knows the hash of P, but the one-way property says it would be very difficult for them to find what the original password was. It's hard to go back. The other property that we rely on is collision-free. It's stated in different ways, but: the hashes of different passwords will be different. Not worded very well, but we rely on this property. Say the stored hash value was hashed from one password, and we submit a different password: we type in the wrong password, or someone tries to guess the password. The collision-free property says that if we have two different inputs to our hash function we'll get two different outputs; in practice we won't get collisions. If our hash function has those properties, then this password scheme works.

Any questions on how to store passwords so far? Let's see what we say on the slide. So, in this approach, we store the hash
of the password. We need a good hash function, one where these properties hold, and there are some functions for which they do. The hash of the submitted password is compared to the stored hash value. Just a reminder: our hash function takes a variable-size input, so the password can be any length, eight characters, twelve characters, a hundred characters, and when we take a hash of that password the hash value that we store will be a small fixed-length value. Small could be 128 bits or 512 bits, for example. In terms of security, we assume that the hash function will not produce collisions, that is, two different inputs will not hash to the same output, and that it's hard to go in the opposite direction: it's easy to hash the password and get the hash value, but it's hard to take the hash value and get the password. So if the attacker gains the database (I should have said gains access to the database), then it's practically impossible to take the stored hash value and go backwards to get the original password. That's our aim in terms of securing our list of passwords.

Questions before we do some analysis of this? We'll continue and look at some attacks and see how much effort they would take for the attacker. There are some practical attacks on this. Any questions on the concepts before we move on? Has everyone finished the quiz before today's lecture? We'll keep having a few more quizzes each week.

Let's put some numbers to a specific example. I'm going to go through an example which is on the website here, but which you also have printed in your handouts, and I'm going to take some parts from it. If you flick forward a few pages, I think you'll see it printed out: "Passwords, Hashes and Rainbow Tables". Let me just check that it's in there. Keep going, at the end of these lecture notes. Anyone found it? Yes, it's there somewhere, someone's found it. Good, 63. Okay, so I'm
going to go through that, or parts of it, using the examples from there, and do a little bit of analysis. First we'll go back to the original approach, just the username and the password. As an example, this one stores the password in the clear. If we look at the database, we can think of it as a table with two columns. This is a computer system where the IDs are usernames; we have a set of usernames, and for each user we store the actual password. That was the first scheme that we introduced: when a user registers they select a password and the system stores it there. But we've mentioned some obvious problems. Anyone who accesses that database immediately learns everyone else's password, so that's not a good solution.

The next approach is to store a hash of the password, and that's in the printout too. Store the username, and when a user registers and selects a password, the system takes a hash of that password and stores the hash value. From memory, I think I used MD5 to hash these passwords, and I got these values. MD5 produces a 128-bit hash value; convert it to hexadecimal and you get these 32 characters. That's what would be stored in our database. So now if the attacker does get access to this database, they know the hash values, and they need to go back and try to find the passwords. Forget what you saw at the top; assume you didn't see it, you're the attacker, and you only have this. How do you find the passwords? What can you do? Okay, very hard, but you really want to find them. What approach would you take?

It turns out that with hash functions you can do a brute force approach. You try many possible values, and once you find a password that matches the hash value you've found the correct password. So a brute force approach can be applied, but we'll put some numbers to it and see how much effort it involves. These are 128 bits; it's just represented in
hexadecimal. It's using MD5, that's the hash function. A brute force attack on the hash function, assuming no known flaws in that hash function, depends upon the number of bits. In general, a brute force attack on a hash function takes 2 to the power of n attempts, where n is the number of bits in the hash output. So in our example we would need 2 to the power of 128 attempts to find a matching input. What I mean by that is that in a brute force attack the attacker takes some value, hashes it, and compares the hash value to the one we're looking for. Say we're looking for Steve's password. Take a value; does it match 751 and so on? No. Then take another value, hash it and compare, and keep going until you find a value whose hash matches the stored hash value. I'll call this a dumb brute force attack, because as we'll see there's a better one. A dumb attack would require in the order of 2 to the power of n attempts, where n is the number of bits in the hash value: 2 to the power of 128 attempts.

What's an attempt in this case? An attempt is hashing some value, and in terms of time, calculating the hash is the most expensive part. Anyone have an idea how long it would take to perform just one hash? How do we find out? You have your computer and you want to try all these 2 to the power of 128 attempts. How long would it take? Hashing algorithms are different from encryption algorithms, so they take a different amount of time, and it depends upon the hardware that's doing it. Many people have done analysis, and here's a website which gives some data; I think I may have shown it to you last week. This is for some different graphics cards, which turn out to be quite good at calculating hashes. It lists some of the speeds, how many hashes per second. For example, with MD5 this graphics card, this GPU, can do 92 million hashes per second. 92 million, and that's an old graphics card. If we try to find the largest number, this one's up to
five billion hashes per second, and there are some varying numbers. If you scroll through I think you can find some that go up to 10 billion. So some hardware can perform in the order of maybe 100 million up to 10 billion hashes per second, and other hardware could go even faster. Let's use some of those numbers to approximate how long it would take. For now, let's assume my computer can calculate 1 billion hashes per second. I need to do 2 to the power of 128 different hashes in this brute force attack. How long does it take me? Someone get a calculator and try. In this specific case, 2 to the power of 128 divided by 10 to the power of 9 is about 3.4 times 10 to the power of 29 seconds, and if you do the conversion that is about 10 to the power of 22 years. Years, not seconds. 2 to the power of 128 is such a large number that even if we do a billion per second it's going to take far more than billions of years. So this brute force attack is not successful. If you don't believe these calculations, go home tonight and over the weekend check some of them, just to get a feel for the size of some of those numbers.

But there's a better attack. In this brute force attack, the attacker chooses any possible value as an input to the hash function, say some random characters. They hash it and compare the hash value to the one that's stored. If it matches, good; if not, move to the next one, choose another random possible password. But it turns out that passwords are usually short: not just any size, but in the order of several to tens of characters. So a smarter attack is to try only passwords of the lengths that we expect the user to have chosen. Don't try strings which are a thousand characters long, because we know that no one's going to choose a thousand-character password. So maybe we'd start by trying all the one-character passwords, and then if that doesn't match try all the two-character passwords, assuming
our system even supports such small passwords. So a better attack is not to try any possible password but to try those which are of a particular length. Let's try that and see how much effort it would take. We assume or know that password lengths are normally limited. There's usually some upper limit, whether it's a hard limit in the implementation or just what people choose: most people do not choose passwords longer than say 12 or 15 characters. We can use that to speed up our brute force attack. In this case, let's assume that we know the password is no longer than eight characters. Maybe the system forces it to be no longer than eight characters; I think some websites or systems do limit the length of your password. Let's say this one limits it to eight characters. What the attacker needs to do, in the worst case, is try all possible passwords, take a hash of each one and compare it to the stored hash value. If it matches, then we have found the correct password.

We'll illustrate that and then calculate how many attempts are needed. Let's say the attacker chooses a password P1. They calculate the hash of P1 and compare that hash value to the stored value. In our example, if they're trying to find my password, they'll take the hash of some password P1 and compare it with this stored value, 751 and so on. If the hash values match then they've found my password; if not, they move on to the next potential password. What is P1? Well, if we know the passwords can be as short as one character, we would try a one-character password, maybe A, and if that doesn't work we'll try B and then C and so on. We keep trying different characters, and if the one-character passwords don't work then try the two-character passwords. If they don't work, try the three
character passwords ABC ABD and so on and keep going until we find a hash value that matches the stored value and once we do we've found the password so when I say P1 it's one of the potential sets of passwords if it doesn't match try P2 and compare and keep going sorry that's P2 there compare keep going until we get to say Pm and let's say at this point this password that we try we get a match so that's the approach for the attack in this case try all possible passwords given that we know that the passwords are a bit up to some maximum length eight characters for example so what we want to know is well how many attempts on average or in the worst case do we need to make to get this password how many possible passwords do we have to try in the worst case how many passwords are there let's say we limit to eight characters well we need to know something about the structure of passwords what characters can be in a password uppercase lowercase digits so there's uppercase lowercase is 52 if you count numbers if they're allowed 62 let's find a picture that shows that here's a picture of a keyboard so we often talk about the printable characters many systems will allow you to choose a password from from a large set of characters and that the largest set is usually the set that you can type in on your keyboard so if you look at a standard keyboard how many characters can we type that are printable well we have our English so if we look at an English keyboard first we have our 26 uppercase and lowercase characters so there's 52 we have the numbers 10 and then we have all the punctuation characters and the operators like plus and minus and so on and if you count them up there's another 32 so commonly we think that the printable printable ASCII characters there are 94 that we could use other characters in the ASCII set are usually not printable like delete and so on maybe we could add the space character because some password systems also support space but in this case I 
haven't included space so we've only got 94 with the space up in 95 so there's an indicator of well how many different characters could a user choose from and let's assume a user chooses a random password for now so each character we can choose from 94 possible characters so the number of possible passwords depends upon the length let's calculate that one character passwords how many possible one character passwords do we have you're only allowed to have one character in your password how many can you how many possible values are there 94 what if you could have a two character password how many possible passwords 94 squared that is the first character can be one of the any 94 and the next character can be also one of those any 94 that gives us 94 times 94 combinations and we can follow along with three characters it becomes 94 to the power of three let's say our limit is eight characters we know the password is limited to eight characters it cannot be more so therefore if the password must be eight characters or less if we consider all possible combinations the total is 94 to the power of eight plus 94 to the power of seven plus 94 to the power of five I forgot six and so on down to one what is it if you add all them up let's get it calculated time 94 to the power of eight that's the number okay so all why did I do 91 there just to see if you're watching okay it should be 94 at the end but it will only differ by three it won't make much difference so that's all possible passwords plus three that we could choose from so an attack in this case must try all of these the previous attack we had to try up to two to the power of 128 attempts what's the difference two to the power of 128 was the original brute force attack the original brute force attack had to make this number of attempts but if we know that the password is limited to just those printable printable ASCII characters and limited to just eight in maximum then now we only have to make this number of attempts 
much much fewer so now it may be possible to try all possible passwords it turns out that if you add up the eight character passwords seven six five four three two and one it's not much different than just considering the eight character passwords the eight character passwords only is this number 609 so on if you're adding all the others it's only a few more 616 it's still six by 10 to the power of whatever it is 18 or so so let's approximate and how alright let's how long would it take we do out this one how many passwords can we try per second 10 to the power of 9 we're saying our computer is fast enough to do a billion passwords per second so we've got the total number we need to try divided by the rate at which we can try them and that gives us the number of seconds convert to minutes convert to hours convert to days 71 to try that okay so say on my computer if I want to try all the passwords which are up to eight characters in length any possible combination of those 94 printable ASCII characters it would take me about 71 days to try and find that password okay how can I speed it up get a faster computer or maybe I go into the computer lab and use 10 computers at the same time and maybe reduce it down to seven days so this was if my computer could do one billion per second if you can't see this this is 60 here for 60 seconds per minute if my computer was faster maybe I could do 10 billion per second and instead of 71 days it would take me seven days still a little bit too long for me to do it but for someone who really wants the password it's not a big deal to wait for seven days to get their password in fact not only do you get one password but you get the passwords for all the users because they all have hashes in that set so everyone's password can be found within say if we had 10 times faster than this one within seven days let's say we can now do if we could do hashes at a speed of 10 to the power of 10 per second my calculation was 10 to the power of 9 
but it was 10 to the power of 10 then it turns out it's approximately seven days to do the brute force attack trying this set of passwords this is the worst case okay well if we try all possible values how can we make it harder for the attacker longer password but okay let's say the user must choose a nine character password they cannot choose one up to eight if they must choose a nine character password there's one more character so it's about 95 94 times longer with one more character with nine characters it will be 94 to the power of nine or 94 times more than eight characters so it take 94 times longer so what 700 days or 600 days so two years so yes adding one more character makes this secure effectively adding two of course is even better so that's one reason why we should use long passwords but not many people do or at least many people will use password less than eight characters or nine or less now it's even better for the attacker once the attacker does this once they can reuse this information in subsequent attacks because the hashes of all these passwords let's say we save them on disk all those hash values and the next time we find hash values of someone's password we just look up and find the correct hash value how many passwords are there how many do we need to save on disk as the attacker is approximately 94 to the power of 8 passwords a few more because we count the 7 and 6 and so on but it's approximately 94 to the power of 8 passwords if we limit it to 8 characters let's just focus on 8 character passwords how many how long is a password how many bytes how much do we need to store each password anyone last 10 minutes stay stay stay with us focus just on an 8 character password how many bytes do we need to store it on disk one character is how many bytes one byte okay normally with ASCII we store one character as one byte on disk so one password let's focus just on the 8 character passwords to keep it simple 8 bytes so what the attacker would do 
is for all our 94 to the power of 8 passwords calculate the hash values and store them on disk how long is a hash with MD5 it's 128 bits or 16 bytes this is if we're using MD5 that's a characteristic of the hash function it produces a hash value always of 128 bits or 16 bytes in length so what the attacker can do is after these seven days they've gone through they've lived there leave their 10 computers running to calculate hashes of all these passwords as they calculate them they save the password and the hash value on disk in one large file or database how big is that file so we think of some table that stores one column is the password one column is the hash value how big is the file or the table well we have 94 to the power of 8 entries and if every entry we need 8 bytes of password plus 16 bytes hash value and if you calculate that I've done it before it's about 146,000 terabytes so that's not so practical for an attacker if they want to store all those values for future use then they need 146,000 terabytes so at the cost of a hard drive which is a few thousand bar for one or two terabytes that will be too expensive and of course too inconvenient to have 140,000 hard drives to store them but there's a way to improve it why do we want to store them because once we do this once to spend our seven days then we can easily use this table as a lookup table now I find a hash value go back to the top and I have the hash value not of Steve but of someone else of of John this one actually 0 6 C 2 1 I find that hash value instead of going through and spending another seven days calculating these values again I just use my table and quickly look through for the hash value in one column once I found that hash value I can find the password the benefit here is that for the attacker it's much much faster to do a lookup on a table or on a database than it is to calculate hash values searching through a database especially if it's ordered or structured is very very fast 
compared to calculating hash values so instead of doing say 10 to the 10 hashes per second we may be able to do 10 to the power of 13 lookups per second and be much much faster to find the new password questions on the calculations here before we lead to the last concept and an important concept everyone can calculate the size needed to store all the passwords and hashes a common exam or quiz question given a password of a particular length and a particular character set how much do we how many passwords are there 94 to the power of 8 how much space do we need to store all the hash values well depends upon the password length and the hash length and an entry for each possible password it turns out that attackers have tried this but instead of storing it in the raw form 146,000 terabytes you can find some data structures which is effectively compresses it and you can store all of this information in a much smaller form and it's called a rainbow table it's just a data structure that allows you to store this information but instead of in 146,000 terabytes in maybe and I'll show you an example but approximately half a terabyte I've run out of space but I'll show an example where you can use some special data structure called a rainbow table to store this information and really cut down on the on the sides and I'll show you an example to finish today if I can find the website somewhere yeah go away maybe my computer's crashed okay maybe okay we found it you can find these rainbow tables which is instead of having 146,000 terabytes of data here's an example this table actually this one this table stores ASCII passwords 95 they include 95 characters we calculated for 94 they include a space password lengths from one up to eight characters they store all the passwords and all corresponding hash values in a large table but really in a compressed form the total number of values is this we calculated in a number similar to this before this six or six million billion but the 
table size in the hard disk is just half a terabyte if we store uncompressed we need about 146,000 terabytes but there are data structures that compress it down to half a terabyte and that's of course manageable so what attackers can do someone calculates all these values maybe it takes them a month with their computer to calculate it but then they save the values on a hard disk and they need about half a terabyte to save it and then the next person who wants to break some password they don't have to calculate they just buy the hard disk or download the table and do a lookup which is much much faster so this website actually people are selling this website is selling hard disks with these tables on it you can pay a few hundred dollars and they'll post you a hard disk with the table of all passwords and all hash values and then the corresponding soft software is very very fast to find the password given a hash value the data structure is called a rainbow table it's just the way to store the data with such tables it makes it very easy for someone to do an attack on a password if we store it in the hashed form what we'll see next week is that the best approach coming back to our sides instead of just storing the hash of the password the recommended approach this one has a problem in that we can have such tables which make it easy to find the password the recommended approach is to introduce a second value inside the hash called assault and we saw a hash of the password combined with this other turns out to be random value and that can be used such that the attacker cannot generate a large table and then do a lookup very quickly effectively makes rainbow tables unusable for attacks on passwords but we'll stop there today we're trying to go through the different approaches for storing passwords don't store it in the plain plain text or in the clear generally we don't encrypt it because it involves having to store a manager key a secret key on the system which makes 
things more inconvenient for the implementation what we do is we store a hash of the password but there are some attacks if we store just the hash of the password it is possible for someone to do a brute force attack and then save that that information and sell it to others so that they can easily do a brute force attack so we'll arrive next week and go through this approach finally don't just store a hash of the password store the hash of the password concatenated with a random number where we call that random number assault but we'll cover that salt and and that final approach next week what you should get from today is the approach of the general approach of storing passwords and the calculating the size the number of attempts and so on we'll cover that one on the salt next week
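If you want to check today's numbers at home, a short script is enough. The following is a minimal sketch in Python, not something shown in the lecture. The function names are made up for illustration, and it assumes the values used above: 94 printable characters, a limit of eight characters, 16-byte MD5 hashes, and rates of 1 billion and 10 billion hashes per second. The last function runs the brute force attack itself, and the demonstration at the bottom uses it only on lowercase passwords of up to three characters so that it finishes immediately.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

CHARSET_SIZE = 94        # printable ASCII characters, space not included
MAX_LEN = 8              # assumed upper limit on password length
MD5_BYTES = 16           # MD5 always produces 128 bits
SECONDS_PER_DAY = 24 * 60 * 60


def count_passwords(charset_size, max_len):
    """Number of passwords of length 1..max_len over the character set."""
    return sum(charset_size ** k for k in range(1, max_len + 1))


def days_to_try(attempts, hashes_per_second):
    """Worst-case time, in days, to hash every candidate once."""
    return attempts / hashes_per_second / SECONDS_PER_DAY


def table_terabytes(charset_size, length, hash_bytes):
    """Size of an uncompressed (password, hash) table for one password length."""
    entries = charset_size ** length
    return entries * (length + hash_bytes) / 1e12


def brute_force_md5(target_hex, alphabet, max_len):
    """Try every string of length 1..max_len; return the one whose MD5 matches."""
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if hashlib.md5(candidate.encode("ascii")).hexdigest() == target_hex:
                return candidate
    return None  # no password in the search space has this hash


if __name__ == "__main__":
    total = count_passwords(CHARSET_SIZE, MAX_LEN)
    print(f"8-character passwords only: {CHARSET_SIZE ** MAX_LEN:.3e}")
    print(f"1 to 8 characters:          {total:.3e}")
    print(f"days at 1e9 hashes/s:       {days_to_try(total, 1e9):.0f}")
    print(f"days at 1e10 hashes/s:      {days_to_try(total, 1e10):.0f}")
    print(f"days for 2**128 at 1e9/s:   {days_to_try(2 ** 128, 1e9):.2e}")
    print(f"raw table size in TB:       {table_terabytes(CHARSET_SIZE, MAX_LEN, MD5_BYTES):,.0f}")

    # Toy demonstration on a deliberately tiny search space.
    stored = hashlib.md5(b"cab").hexdigest()
    print("recovered password:", brute_force_md5(stored, ascii_lowercase, 3))
```

Changing `MAX_LEN` to 9 shows the effect of one extra character: the worst-case time grows by a factor of about 94.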