One way we can measure the strength of passwords is by looking at the number of attempts it would take to guess the password. Our four-digit password has 10,000 possible values, so a maximum of 10,000 attempts. Our 64-bit random binary key has two to the power of 64 possible values, so up to two to the power of 64 attempts. So we can compare them and say the one that needs more attempts is stronger. We use entropy for the same thing: it just converts the number of attempts to a smaller value by taking the logarithm. Log base two of ten gives the entropy of one digit, so 10,000 attempts is equivalent to an entropy of, in our case, about 13.28. Two to the power of 64 attempts is equivalent to an entropy of log base two of two to the power of 64, which is 64. So entropy is just another way to compare the security of passwords. It's just on a different scale than the number of attempts.

So here's a task: three options for different password schemes, find which one is the strongest. The first two are easier to calculate; the third one will require a few steps. You're designing a login system and you're going to require the user to choose a password according to one of these three options. You want to choose the strongest of the three. Option one: the user chooses a 42-bit binary value, and that's their password. Option two: they choose 13 random digits, no more than 13 and no less than 13 to keep it simple, and they're random. Option three: they choose a seven-character password. Those characters are made up of upper or lower case English letters or digits, so they can choose any of those values, but there's a special rule that one of the characters must be from the punctuation characters, like slash, question mark, comma and all those. If you look at your keyboard, you'll see there are,
I think, about 32 possible punctuation characters. Compare the three and tell me which one's strongest and which one's weakest. Anyone with an answer? How are you going to compare them? Calculate the entropy for each: the higher the entropy, the stronger the scheme. Calculating the entropy of options one and two should be easy; you should do that in half a minute.

What's the entropy of option one? 42. With a 42-bit value we have two to the power of 42 possible values. The entropy is log base two of that, and log base two of two to the power of 42 is 42. So with a binary value the entropy is easiest: it's the number of bits. And that's where entropy comes from, it's the number of bits needed if we have a binary value. So the entropy of option one is 42.

What's the entropy of options two and three? Option two is 13 digits. We were on track before when we said one digit has an entropy of 3.32; it's on the slide, log base two of 10 is 3.32. Therefore if you take 13 digits the entropy is just 13 times 3.32, which is 43.16. That is larger than 42, so option two is stronger than option one.

What about option three? Try to calculate that in the next few minutes. You choose seven characters: one is a punctuation character, the other six are from the set of English letters or digits, and you choose randomly. Maybe to make it simpler, let's say that the punctuation character is the last character.
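The calculation for options one and two can be checked with a few lines of Python. This is only a sketch: the helper name `entropy_bits` is made up for illustration, and it assumes every symbol is chosen uniformly at random, as the task states. Note that with an unrounded logarithm option two comes out at roughly 43.19 rather than 43.16, because 3.32 is itself a rounded value.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of `length` symbols, each chosen uniformly
    at random from `alphabet_size` possible values."""
    return length * math.log2(alphabet_size)

# Option one: 42 random bits (alphabet of 2 symbols).
option_one = entropy_bits(2, 42)

# Option two: 13 random digits (alphabet of 10 symbols).
option_two = entropy_bits(10, 13)

print(f"Option one: {option_one:.2f} bits")
print(f"Option two: {option_two:.2f} bits")
```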
Okay, so you choose six letters or numbers and then the last one is some punctuation character. You don't have to remember them all, commas and so on; there are 32 possible values there. What's your calculation? The punctuation is everything except lowercase, uppercase or digits. We've got more to choose from. What did you get? That looks close, that sounds correct. See if you can calculate the entropy for option three. 40.7 sounds correct.

So for option one the entropy is 42. For option two, one digit has an entropy of 3.32 and there are 13 digits, which gives us approximately 43.16. And then option three: what's the entropy? Some people have got it, but let the others try to calculate. Let's say you choose six letters or numbers at the start and then one punctuation character at the end. You can analyze them separately: first, what's the entropy of a six-character password, and then, what's the entropy of a one-character password, the last punctuation character? Let's focus first on the first six characters in the password. For each character, we're choosing from how many possible values?
62. We have 26 lowercase letters a to z and 26 uppercase to choose from, that's 52, plus 10 digits. So we can choose from 62 possible characters. If we choose just one value out of those 62, the entropy is log base two of 62 (I'll just write log), which is approximately 5.95, close to six. Since we have six characters in our password with that entropy, we multiply by six. But then at the end we add one more character, some punctuation character, and there are 32 possible values, so the entropy of that one character is log of 32, which is five. Six times 5.95 plus five is about 40.7. That's log base two, yes; I've just written log here. So we break it into two parts: the first six characters, where we choose from 62 possible values for each character, and the last one character, where we choose from 32 possible punctuation characters. It becomes six times about six, which is about 36, plus five: almost 41.

Which one is strongest? Hands up for one. Hands up for two. Hands up for three. Okay, it's the highest one: the larger the entropy, the stronger the password will be, so here it's option two at about 43. So that's what we can use entropy for: to compare the strength of different password schemes.

But the problem in practice is that this assumes the user chooses a random password. The first one is a random 42-bit key, the second is 13 random digits, and this one is six random letters or numbers and then one random punctuation character. Think of all your passwords. How many of them are truly random? Most people do not choose random numbers or random values. They choose something that has some meaning. Any questions on how to calculate entropy at the moment?
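Here is a short Python sketch of the option three calculation, splitting the password into its two parts as described above and then ranking all three options. The variable names are illustrative only, and the simplifying assumption from the lecture (punctuation always in the last position, everything chosen uniformly at random) is kept.

```python
import math

LETTERS_AND_DIGITS = 26 + 26 + 10   # lowercase + uppercase + digits = 62
PUNCTUATION = 32                    # approximate count used in the lecture

# Each part contributes (number of characters) * log2(alphabet size).
first_six = 6 * math.log2(LETTERS_AND_DIGITS)
last_one = 1 * math.log2(PUNCTUATION)
option_three = first_six + last_one

entropies = {
    "option one": 42 * math.log2(2),    # 42 random bits
    "option two": 13 * math.log2(10),   # 13 random digits
    "option three": option_three,
}

# Sort from strongest (highest entropy) to weakest.
ranking = sorted(entropies, key=entropies.get, reverse=True)
for name in ranking:
    print(f"{name}: {entropies[name]:.2f} bits")
```

Under these assumptions the order is option two, then option one, then option three.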
We'll return in a minute and give some different examples. Because most human-created passwords are not random, it's hard to actually estimate what the entropy of a particular password will be. NIST is an organization in the US that creates standards. They've done some research and some studies and come up with some approximations of the entropy of different passwords that users may choose, and some of the results are shown in this table. Let's explain it and then compare some values in the right-hand two columns.

What we know is the case where the user randomly chooses a password, like what we just calculated. The first column is the length of the password, the number of characters. The last column is if the user chooses from the printable ASCII characters. What are the printable ASCII characters? What did we have before, in option three? We had 26 lowercase letters, 26 uppercase and 10 digits, that's 62, plus the 32 punctuation characters, slash, question mark and so on: that's 94 printable characters. So if the user can choose randomly from those 94 printable characters and they choose a password with 10 characters, then the entropy, if we follow the row through, is 65.9. That's how we read it: if we're using a 94-character alphabet and we choose a password length of 10 characters, randomly chosen, the entropy would be 65.9. If we chose just from the ten digits, so a 10-character alphabet, and the password was ten digits long, then the entropy would be 33.3, which is about ten times the 3.32 we calculated before. So that's what the last two columns are for.

But the user normally doesn't choose a random password. They have some structure in the password. So NIST used some basic techniques to try to estimate it.
Okay, say you choose a password from letters. Even though you've got 94 characters to choose from, normally a user will use lowercase characters. Sometimes they'll use uppercase, but the assumption is that most times they'll use a lowercase character. And let's say the first letter you choose in your password is the letter T. Then the next letter will normally be limited within the set of 94 characters. That is, you would normally not choose a password which is T followed by Z, or T followed by Q. You may choose T followed by E, or T followed by O. So the relationship between the letters will normally depend upon the frequencies of characters in a particular language.

So they've done some analysis. It's not a hundred percent accurate, but just to give an idea: they said that if a user could choose from 94 characters, under some assumptions, and there were no checks on what the user chose, then they calculate the entropy of a 10-character password to be 21. That's much lower than the 65 we get if we choose randomly, because a user will usually choose according to some structure; they will not choose randomly. So in effect you are not choosing from 94 possible characters for each letter in your password. If you want to see the exact assumptions they've made, the document describes them in detail, and in fact it's based upon a study done by Shannon. Everybody remembers Shannon: Shannon capacity in data communications, Shannon's confusion and diffusion in DES and block ciphers. Shannon also did work related to entropy, which is applied here to the entropy of passwords.

"No checks" means that the user chooses under those assumptions. The next column, "dictionary rule", means that if the user chooses a password, the system then compares it and makes sure that that word is not in a dictionary. So you have a dictionary of English words, and if the password is in the dictionary then it is not accepted. They have to choose a different password.
So that limits the choice in this case, and in fact it adds to the strength. They've done some calculations and found that if you don't choose a dictionary word, the entropy goes up in this case to 26 if it's 10 characters. And this column here is if you exclude dictionary words and you also have some additional rules on how you must create the password. For example, you must have one punctuation character, you must have at least one uppercase, you must have at least one lowercase: rules like that on how to compose the password. I'm sure you have seen them when you create passwords for different websites. The website will require your password to have some particular characters, which makes the password stronger. They've done some calculations and found the entropy goes up to 32. Now, it's not entirely representative of all password schemes, but it gives an idea. When the user chooses, it's nowhere near as strong as randomly chosen, because the user chooses based upon some structure in most cases. The user can choose a random password, but most people do not.

So, for example, if we want to get an entropy of at least 64, equivalent to a 64-bit key: if we randomly choose from 94 characters, we need 10 characters. If we randomly choose from digits, we need a 20-character password; our password must be 20 digits. So from the printable characters I need a 10-character password, or from the numbers I need a 20-digit password, for about the same strength. If I have no checks, then in fact the table doesn't even go up to 64: we need more than 40 letters in our password to get the same strength. Who has a 40-letter password? Anyone?
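The "how long must a random password be to match a 64-bit key" figures above can be reproduced with a small Python sketch. The function name `min_length` is hypothetical, and the calculation only applies to the randomly chosen columns; it says nothing about the NIST estimates for user-chosen passwords.

```python
import math

def min_length(alphabet_size: int, target_bits: float) -> int:
    """Smallest number of uniformly random symbols whose total
    entropy is at least `target_bits`."""
    bits_per_symbol = math.log2(alphabet_size)
    return math.ceil(target_bits / bits_per_symbol)

printable = min_length(94, 64)   # random printable ASCII characters
digits = min_length(10, 64)      # random decimal digits

print(f"94-character alphabet: {printable} characters")
print(f"10-digit alphabet: {digits} digits")
```

This gives 10 characters and 20 digits, matching the values read from the table.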
So no one has a password of the same strength as a 64-bit key. That's the point here. Most people have passwords in this range in length, less than 10 characters in most cases. Some may be longer, but think about your passwords and the length of them: usually in this range, and the entropy is therefore usually in the order of 10, 20, maybe up to 30 depending upon your password scheme. So your password may be about the same strength as a 10-bit key or a 20-bit key, meaning it takes at most about a million attempts to break, which is not many. And some systems limit the length of the password. Some, especially financial organizations, will limit it to say six characters. So different systems have different schemes. We can use the entropy to compare them, but there's no one way to say which scheme is best, because we need to consider not just the strength but also the usability. And of course this is under some simple assumptions about what a human user would choose. In fact different people may choose passwords differently: some may choose random, some may choose a simple word, some may combine different characters in different ways.

So, how do you choose a good password? Any suggestions? Don't tell me your password, but give me a suggestion for how to choose a good password. Don't tell me "choose 64 random binary bits", because I'd never remember that. Can anyone suggest how to choose a good password? What do you do? Mix between letters and digits. Randomly? Okay. It's not so good if it's not random: if it follows some structure, then that's equivalent to these user-chosen columns. Any more precise suggestions? There are different ways. So, for example, Thai: on your keyboard you have Thai characters and English characters.
So you hit the Thai keys and the password comes out as English letters. Okay, that works until someone knows that scheme, and then they just need to map the English letters back to the Thai letters on the keys. But yes, that's one way, in that if someone is looking just at the set of English letters that you've created, it would look random or almost random. Using a different language doesn't really help, because once the attacker knows the language they just need to attack based upon that language. If they don't know the language, then they try multiple languages; there are not so many. But using translations, or something similar to this scheme where the keyboard mixes up the letters, is one approach. Any other schemes?

Keywords from hobbies, or from things you know about or are interested in. Keywords, like one word? Would that word be in a dictionary? Dictionary words are some of the least secure passwords, in that what an attacker will usually do first, instead of randomly trying passwords, is take a dictionary. A dictionary is something that includes a large set of known words, and they'll just try those dictionary words, and most people do choose a word from a dictionary. Now how big is a dictionary? If we choose, say, the Oxford English Dictionary, there's maybe 200,000 words. Not many: they just have to try 200,000 words, and a computer can try that many very fast, as we'll see later in an offline guess. Of course there are different combinations and different specialist words, so that can expand, but if your password is a word from a dictionary, generally that's one of the easier passwords to break.

Phrases? Assuming there's no limit on the length. What limits the length? Usually your ability to remember.
Okay, but if it's a phrase, a line from a song or something that you know and remember easily, then that is slightly better, because now it's a combination of words from a dictionary. Or even better, choose four words from the dictionary randomly and just use them. After using it several times you'll usually remember those four words. You just need to remember the order of them, and it makes the password long enough and usually gives a large enough entropy to be secure against most attacks.

There are many different ways, okay. We're not going to go through all of them; you should investigate and think about them. Now that you know how to compare different schemes, think about which ones are more secure, and especially which ones are not secure, and don't use the insecure approaches. You don't have to read online; you can even just think about your schemes on your own and do your own analysis to see what's secure.

We started on online password guessing, where the attacker goes to the computer system and tries the passwords while it's in use. What's offline password guessing? Consider the system.
It stores some information about the passwords. We'll see shortly an example of how, but the system must store something about the password. Offline password guessing is when the attacker can get access to that information. For example, if there's a file that stores the set of passwords, the attacker can get that file and then try to discover your password. Usually there are fewer restrictions on the time that the attacker has available in this case. Put another way, let's say you go up to my office and take a copy of the hard drive. Then you have all the time in the world. You can go home, you can use many different computers to try to find my password, because you don't have to worry about me coming back to my office. That would be an offline attack, because you're not trying to break the password while using the system. So we have fewer restrictions on time, and the guesses are not recorded: I cannot log how many guesses you're making, because you're doing it on your own computer.

So in that case, what do we do? We must make sure that the passwords which are stored on the system are secure. Even if you can copy my hard disk, it should be practically impossible to find my password. We use cryptographic techniques to do that, and the most recommended approach is in fact that you do not store the password: you store a hash of the password on the system. And also, although it doesn't protect against copying of files, give limited access, in terms of permissions, to the files that store the passwords, so that not everyone who accesses the computer system can read that file, especially if it's a shared computer system.

So let's look at how Linux does it in terms of storing a password and see how storing a hash helps. We'll explain briefly here and then show it on my computer. How do we store passwords in Linux or Unix-like operating systems, as an example? There are two files stored, on my laptop for example.
We'll see them shortly. There are two main files. In the directory /etc there's a file called passwd, the password file. In fact, it doesn't store the passwords nowadays. In the old times it did store some hash of the passwords; now it's used to store the user information, for example your username. It's just a text file, and each line contains information about one user. I'll show you an example shortly. So there's a username and some other information like the name of the user. And then there's a second file called the shadow file, and that stores again the username as well as a hash of that user's password.

In terms of permissions, normally the passwd file is world readable. That is, anyone who has an account on the computer system can read that file, so anyone can see the list of usernames. But the shadow file is normally readable just by the administrator, the root user of the system. For example, the IT server: all of you have accounts on the IT server. All of you can read the passwd file and see the set of users on that computer system. But none of you can read the shadow file; only I can. The shadow file contains the hashes of the passwords of all the users.

Let's have a look at an example. On my laptop I currently just have myself as a user. Let's add a user just briefly. Add the user, and let me remember how to do it. Normally when we add a user they have a home directory on the computer. Just for this example I don't want to create a home directory for this user, so I use the no-create-home option, and then the username. So choose a username. Anyone volunteer? We had some volunteers last week; anyone else? S-a-n... okay. So add a user. I'm the admin, so I need to give my own password. I've typed it wrong. So that adds this user to my computer system. Now this is where we register the password. Normally the user would choose their password; here I chose it for them. I typed it in there.
It doesn't show what I'm typing, so that someone can't see it. The password I chose was just the word "password". It then asks for some full-name information, which I'll leave. Okay, so I just registered a new user on my computer system. Where is that information stored? Let's look.

First there's the passwd file in the /etc directory. grep just searches through that text file looking for this word, sandy, and it shows this line: there's one line inside this file which contains it. It's just a text file, and each field is separated by a colon. So here's the username. The next field, x, means the password is not stored in this file; it's in another file, the shadow file. This is a user ID: in Unix, the user has a username and a number as an ID, and in fact a group ID as well. I didn't enter a full name or a room number; that would be stored here, between the commas, if I had entered it. Then there's their home directory, even though it wasn't created, and the shell: when they log in at the terminal, what program runs, bash in this case. But importantly this file stores the set of users. In fact I'll show the entire file. Okay, there are many users created automatically. You see me here: my username, user ID 600, my name, no other information, my home directory. The rest are really just for specific services on my computer, on Linux. Most of them are for servers. They're not individual human users; they are special cases, not important. So the passwd file stores the user information.

Now let's look at the shadow file. I tried to search in the shadow file for the word sandy using the program grep, and it says permission denied. Normally you cannot read that file, because it's considered more sensitive than the others. So there are permissions set up so that a normal user cannot read the file. You need to be root or admin.
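To make the colon-separated layout of a passwd line concrete, here is a minimal Python sketch that splits one line into the fields described above. The sample line is hypothetical (the numeric IDs in particular are invented, not read from the screen), and `PasswdEntry` and `parse_passwd_line` are names made up for this illustration.

```python
from typing import NamedTuple

class PasswdEntry(NamedTuple):
    username: str
    password_placeholder: str   # "x" means the hash lives in the shadow file
    uid: int
    gid: int
    user_info: str              # full name, room number, ... separated by commas
    home: str
    shell: str

def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd-style line into its seven colon-separated fields."""
    name, placeholder, uid, gid, info, home, shell = line.strip().split(":")
    return PasswdEntry(name, placeholder, int(uid), int(gid), info, home, shell)

# Hypothetical line for the user added in the demonstration.
sample = "sandy:x:1001:1001:,,,:/home/sandy:/bin/bash"
entry = parse_passwd_line(sample)

print(entry.username, entry.uid, entry.home, entry.shell)
```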
So I will use sudo to do that. And here's the password information. Let's see the structure. Here's the username, and then the next field starts here at this $6 and goes through to here. This field is a hash of the password, and also the algorithm used to hash the password. The remaining fields are about how long the password is valid for. You can put time limits on the passwords so that the user has to change their password every day or every month. So that's what these fields are for: to store information such as "this password will expire in one hour". We don't care too much about these values. Focus on the part from $6 through to this "tp/". Let me try to zoom in a bit. Unfortunately it's quite long, so it wraps around, but this field, after the username and going through to here, is a hash of the password and the algorithm used, and another value we'll see shortly.

The dollar signs separate the subfields. So in fact inside here there's a field with the value 6, another field with a value from "owz" through to "pg", and then from here, "01u", all the way through to "ltp/": that is a hash of the password. The second field is a salt, and we'll come back to explain that. The first field is the algorithm used to perform the hash: algorithm number six. What algorithm is that? We have to look up a man page to see. I quickly scroll down and find the algorithm in one of the help pages: it says algorithm number six is SHA-512. So we're using the hash algorithm SHA, and the output value is 512 bits long: SHA-512 produces a 512-bit hash value. So the format is shown here, and it's not encrypted even though it says "encrypted" here. It's the algorithm, the salt and the actual hash value. How do we use a hash value?
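The subfield structure just described (algorithm number, salt, hash, separated by dollar signs) can be pulled apart with a few lines of Python. This is a sketch under assumptions: the salt and hash strings below are placeholders, not the values on the screen, the function name is invented, and it only handles the simple three-part form without any extra parameters. The identifier table follows the common crypt convention in which 1 is MD5, 5 is SHA-256 and 6 is SHA-512.

```python
# Common crypt-style algorithm identifiers.
ALGORITHMS = {"1": "MD5", "5": "SHA-256", "6": "SHA-512"}

def split_hash_field(field: str) -> tuple[str, str, str]:
    """Split a '$id$salt$hash' field into (algorithm name, salt, hash)."""
    # The field starts with '$', so the first piece of the split is empty.
    _, algorithm_id, salt, digest = field.split("$")
    return ALGORITHMS.get(algorithm_id, "unknown"), salt, digest

# Placeholder values standing in for the real salt and hash.
field = "$6$examplesalt$examplehashvalue"
algorithm, salt, digest = split_hash_field(field)

print(algorithm, salt, digest)
```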
Well, this is where the system stores information about the user's password. So now consider when the user tries to log in. Here's my laptop, and the user comes along and wants to log in. Inside here is our file which stores the hash. It stores the username, sandy. It stores the algorithm, number six, which is SHA-512, this salt, which we'll come back to, and the hash value, this long value. Let's call it H1, or even better H_sandy. That's the stored value, where H_sandy is the hash, using SHA-512, of the password they chose. I actually chose it for them: the word "password".

Now when they log in, the user submits their username to the system. They type their username, press enter, and then they type their password. So they submit their username and their password, and it's "password". The system doesn't store the actual password; it just stores the hash of the password. So what happens when the user submits their username and password? The login system takes a hash of the submitted value and compares it against the hash value stored. Our properties of hash functions tell us what a match means in practice. Take the password that was created at the start, whose hash is stored in the file on the screen. Suppose the stored hash value is the same as the hash of the submitted password.
That means the passwords are the same. That's our practical property of hash functions: two messages which are the same will produce the same hash value. If the password submitted is wrong, say it's "abc", then the hash of the submitted password will not match the hash value stored. Of course the usernames must match as well. So in fact we don't store the password, we store a hash of the password, and it still allows us to authenticate the user, because the hash of the submitted password must match the stored hash. If they match, it implies that the two passwords are the same.

Why do we not store the password? Why do we store the hash? As several people have said: so that the attacker cannot see the password. Suppose we stored the actual password, and someone could get access to this file, either through malicious means or as someone who has permission to read the file but shouldn't get the password. Then they have automatically discovered another user's password. Consider for example the IT server. On the IT server you all have accounts and you all set your own passwords. I'm the admin. If we did not use a hash, then I could see all of your passwords, and most likely some of you use the same password on the IT server as on other systems. So people could discover your passwords. But with the hash value stored instead, how do I get your password?
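The login check described above can be sketched in Python as follows. This is a deliberately simplified model, not how Linux actually does it: it uses a single unsalted SHA-512 hash, whereas the real scheme also involves the salt (explained later) and a more elaborate computation. The `stored` dictionary stands in for the shadow file, and the function names are made up for illustration.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Plain SHA-512 of the password, as a hex string (no salt: a simplification)."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()

# Stand-in for the shadow file: username -> stored hash. The password itself is not kept.
stored = {"sandy": hash_password("password")}

def login(username: str, submitted_password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    expected = stored.get(username)
    if expected is None:
        return False   # unknown username
    # compare_digest compares the two strings in constant time.
    return hmac.compare_digest(expected, hash_password(submitted_password))

print(login("sandy", "password"))   # correct password
print(login("sandy", "abc"))        # wrong password
```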
Another property of the hash function is that we cannot go backwards: the one-way property. Given the hash value, it's practically impossible for me to work out what the original password was. So that's why the hash function is useful here in password storage. We can still authenticate the user, because the user submits a password, we take a hash of that password and compare it against the stored hash value. If the two passwords were the same then the hash values will be the same. If the two passwords are different then the hash values will be different. And someone who gets access to this file cannot easily find the password of the user. They just find the hash of that password. That's common in most operating systems, so Windows does it as well. They may just use different files or a different way to store it, and possibly a different hash algorithm, but it's the same concept. Any questions on how the password is stored?

And this is not just for operating systems. Most login systems can, do, or should use this approach. When you register for a website, Hotmail, Gmail or any website where you need a username and password, if that website has been built securely, what happens is that when you register, your username and the hash of your password are stored in some database. So the password is not stored in the database; a hash of the password is stored. And when you log into that website you supply your username and password, and the system calculates a hash of your password and compares it against the database. Same approach. Of course some websites don't do that, and if you look over the news from the last year or two, there are many cases where a website doesn't do that and someone maliciously gains access to the database and can find the passwords of many users. That's what happens if we don't store a hash but store the actual password. There have been attacks against websites where the attacker finds the passwords of thousands of users and releases them on the internet, and therefore they can log into
different accounts.

It's more complex than that. We said we take a hash of the password, and that provides a level of security. In practice we normally introduce another value as well, the salt. But before we try to explain the salt, let's do some simple calculations of how an attacker could try to perform an offline attack on such a password database. So think of this shadow file as a database of usernames and passwords. Similarly, if you're using MySQL to store it for a website, then you'd have some database of usernames and passwords, but not the actual password: a hash of the password.

Now assume an attacker has this file, or has the database. What can they do to find the password? Here you have the hash value of one user. How do you find the password? Come on, I know some of you have strong minds and are not malicious, but sometimes think maliciously. What would you do? I've given you the hash value; you can see it on the screen. Find their password. Tell me how you do it. Forget about the salt for a moment, but given this hash value, a 512-bit value (it's not stored in binary here, it's encoded in some ASCII or base-64 encoding, but it's a 512-bit value), find the user's password. What do you do? Random... random hash what? So take random passwords, calculate the hash, and compare against the given hash. And once you get a match, you've found the password.

Let's try that. As the attacker you know the hash value and you want to find the password. So basically you can do a brute force attack. Choose some password P1. Calculate the hash of P1, and let's say you get lowercase h1. Does h1 match H_sandy? If yes, then we've found the password: it's P1. If no, try P2. Take the hash, get h2. Does h2 match H_sandy? If so we've found the password; if not, keep trying. So that's a brute force attack. What do you try as passwords?
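The guessing loop just described (hash P1, compare with the stored hash, move on to P2, and so on) looks like this as a Python sketch. As in the discussion, the salt is ignored and a plain SHA-512 is assumed; the candidate list and function names are made up for illustration.

```python
import hashlib
from typing import Iterable, Optional, Tuple

def sha512_hex(password: str) -> str:
    """Plain SHA-512 of a candidate password (salt ignored, as in the discussion)."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()

def offline_guess(target_hash: str, candidates: Iterable[str]) -> Tuple[Optional[str], int]:
    """Try each candidate in turn; return (password or None, number of attempts made)."""
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if sha512_hex(candidate) == target_hash:
            return candidate, attempts
    return None, attempts

# The attacker only knows the stored hash, here called h_sandy.
h_sandy = sha512_hex("password")

# A tiny, made-up list of guesses; a real attacker would use a large word list.
guesses = ["123456", "letmein", "password", "qwerty"]
found, attempts = offline_guess(h_sandy, guesses)

print(found, attempts)
```

The order of the candidates matters: the sooner a likely password appears in the list, the fewer hashes the attacker has to compute.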
What value are you going to try first as P1? A dictionary word. Okay: most likely the user chose a password from a dictionary, so don't try random passwords. We want to get to that password as fast as possible. If we try random passwords, then it depends upon the length, and we can calculate how many attempts we'd need to find it. But most likely the password is not a random string; it's from a dictionary or is some modification of a dictionary word. So try dictionary words. Take a dictionary, 200,000 different words in English, try some variations on those words, and try them first, and you're much more likely to find the password. So not just dictionary words, but different combinations of words and different strings: birth dates, information about people and so on. That's a normal approach.

How many attempts will it take, or how long, to find the password? Let's introduce some numbers and give some examples of what effort it would take. Let's assume that we can calculate hashes at a speed of 1 times 10 to the power of 8 hashes per second. Calculating a hash takes some time, because the function is quite complex, like encrypting something. So let's say your computer does 100 million hashes per second, which is quite fast for normal computers. Maybe using a recent GPU you could get to that speed, because a GPU can do things in parallel quite well. This is taken from a typical computer that I looked up, which could do about 100 million hashes per second using SHA-512.

So then the question is: how many attempts do we have to make until we find the password, and how long will it take? That depends upon the length of the password, because how many possible passwords are there? Let's say first that the password was chosen from an English dictionary. How many words in an English dictionary? About 200,000. How long to find the password?
Less than a second. Okay. We can do 100 million hashes per second and we only need to try 200,000, so we find it, assuming the password was taken directly from some dictionary, because in an English dictionary there are about 200,000 unique words (it varies under different conditions). So if it was a word from a dictionary, it's easy to find the password. That's why choosing a word from a dictionary for your password is not a good idea. Let's say instead we have a six-character password. The user was a bit smarter: they didn't choose from a dictionary, they chose a six-character password from the printable characters on your keyboard, so 94 possible characters. How many possible passwords are there? Each character is chosen as one from 94 possible values, and there are six characters, so 94 to the power of 6, and I've calculated before that it's about 6.9 by 10 to the power of 11. So they randomly chose a password six characters long, and each character was chosen from one of the 94 printable characters on your keyboard, which gives us about 7 by 10 to the power of 11 possible passwords. Now what I do as an attacker, using my computer, is try 100 million hashes per second, 100 million passwords per second, and compare. How long does it take to try them all? 6.9 by 10 to the power of 11 divided by 10 to the power of 8, which is 6.9 by 10 to the power of 3, which is 6,900 seconds. What is 6,900 seconds in hours? 3,600 seconds is one hour, so it's about two hours. So it takes about two hours to find this password. Not long. If it takes me two hours to find the password, then that's considered easy for the attacker. Using one computer, two hours to find a password is not very secure. That's why we need to make sure our passwords are strong, but we know that if we make the password longer it becomes more inconvenient for the user. Let's try: what if we had eight characters?
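The dictionary and six-character estimates above are plain arithmetic, so they can be rechecked with a short script. The constants are the lecture's assumed figures (10 to the power of 8 hashes per second, 94 printable characters, 200,000 dictionary words); the function names are made up for this sketch.

```python
HASH_RATE = 10**8        # hashes per second, the lecture's assumed speed
PRINTABLE_CHARS = 94     # printable keyboard characters, as in the lecture

def keyspace(length: int, alphabet: int = PRINTABLE_CHARS) -> int:
    """Number of possible passwords of the given length."""
    return alphabet ** length

def worst_case_seconds(candidates: int, rate: int = HASH_RATE) -> float:
    """Time to hash every candidate once at the given rate."""
    return candidates / rate

dictionary_seconds = worst_case_seconds(200_000)
six_char_seconds = worst_case_seconds(keyspace(6))

print(f"dictionary: {dictionary_seconds:.3f} s")
print(f"6 characters: {keyspace(6):.2e} passwords, "
      f"{six_char_seconds / 3600:.1f} hours")
```

These are worst-case times, that is, the time to try every candidate; for a randomly chosen password the attacker would expect a match after trying about half of them on average. Calling `keyspace` with a different length repeats the estimate for longer passwords.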
So the user chose not six characters, but eight. Then it's 94 to the power of 8, which, and I calculated this before, is about 6 by 10 to the power of 15. That's how many possible passwords there are if we have an eight-character password. So now it's 6 by 10 to the power of 15 divided by 10 to the power of 8, which is 6 by 10 to the power of 7 seconds. It takes the attacker 705 days to find the password. So that's more secure: by adding two more random characters, you've gone from a two-hour attack to a two-year attack. An eight-character password is not too long, although this is random, and not many people remember or choose random passwords; of course they usually choose a word from a particular language or a dictionary, or something related. So under the assumption of eight random characters: 705 days. Not so good for the attacker now. But what if the attacker was a student here at the university, and they used the lab computers to do an attack? This is using one computer that does 100 million hashes per second, so they try 100 million passwords per second. What if they put the software on the lab computers and leave it running over multiple days? Let's say we have 100 different lab computers, all trying to break the password. Because we can try the passwords in parallel, that is, some of the passwords we try on one computer, another set on another computer and so on, we can test the passwords in parallel across 100 different computers. Say there are 100 different computers in the labs: that cuts the time by a factor of 100, from 705 days down to seven days, just by expanding the computing resources, which in this case is not too hard. Maybe it's not achievable in here, but in some organizations it may be quite easy for a user to get access to 100 computers and run, for seven days, this software that calculates the hashes of passwords. Seven days, and they find the user's password. A new user comes along. We create a new user account. Volunteers?
Doesn't matter. We create a new user, and let's look at their hash value. So I created a new user, John, and the hash value is here. It's a completely different hash value if we compare it to Sandy's. So, two different hash values. Now the attacker wants to find John's password. Do they spend another seven days trying to find it? They don't have to. Instead, when we searched for the first password, we store these values in some database, so that now, when we have John's hash value, this new hash value, instead of recalculating the hashes of all the different passwords, we've stored the password and the hash value in some database and simply look up the hash value. And generally a lookup is much faster than a hash. Calculating the hash of some value takes some time, but looking up some entry in a database is usually orders of magnitude faster, because calculating a hash, like any cryptographic operation, involves many steps, whereas a lookup is just a comparison between a hash value and a known hash value. So what the attacker does the first time is try all possible passwords, our 6 by 10 to the power of 15 passwords, and hash each of them, and it takes seven days. They did it across their 100 computers in the lab, and they get all the hash values. They store them in a database, a large table which has password and hash value: p1, h1; p2, h2; and so on. And they have 6 by 10 to the power of 15 entries, all possible passwords. It took them seven days to do it, but they've calculated all possible hash values from those passwords. Now John's password: how do they find it? They take the hash value and simply look in this database for the corresponding hash value. It should be there, because John's password, assuming it's the same length, eight characters, will be there.
So they perform a lookup, and once they find the hash value, they've found John's password. And in fact they can do it for all users now, just by looking up in this table. That's much faster than calculating the hashes again, because performing a lookup, that is, comparing the hash with a value in the table, is very fast compared to calculating a hash. It depends upon the speed of the computer. We said we could calculate hashes at 10 to the power of 8 hashes per second. Let's assume we can do lookups much faster, and I've just made this number up to get some nice results: let's make it 10 to the power of 12 per second. What that means is I have some database on my computer as the attacker, and what I do is just compare two values in one row, and let's say I can do 10 to the power of 12 comparisons per second with my computer. It should be much faster than calculating a hash, because a hash is relatively slow. So how long does it take me to find John's password? I've got 6 by 10 to the power of 15 hash values and I look up at a rate of 10 to the power of 12 per second, so it's 6 by 10 to the power of 3 seconds: 6,000 seconds, which is also about two hours; in fact it's one hour 40 minutes. So now, once the attacker has calculated this table of passwords and hash values, for any user they just find that user's hash value, look it up in this table, and very quickly find the corresponding password, because performing a lookup in a table can be quite fast. So in our example it takes less than two hours to find John's password. Any new user: two hours or slightly less. This is common in what malicious users do: they calculate the hash values of many different possible passwords, store them in a table, and then sell the table to someone who's trying to find someone's password.
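The precompute-once, look-up-many-times idea can be tried on a toy scale. The sketch below is an assumption-laden illustration, not the attacker's actual tooling: it uses unsalted SHA-512 from `hashlib`, a key space of only three lowercase letters (26 to the power of 3, or 17,576 passwords) instead of 94 to the power of 8, and hypothetical stolen hashes for the two users. A Python dictionary also finds an entry directly rather than comparing row by row, so it is faster than the lookup model used in the lecture's arithmetic.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def sha512_hex(password: str) -> str:
    """Return the unsalted SHA-512 hash of a password as a hex string."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()

def build_table(alphabet: str, length: int) -> dict:
    """Hash every password of the given length once; map hash -> password."""
    table = {}
    for chars in product(alphabet, repeat=length):
        password = "".join(chars)
        table[sha512_hex(password)] = password
    return table

# Slow step, done once: hash the whole (toy) key space.
table = build_table(ascii_lowercase, 3)

# Hypothetical stolen hash values for two users.
stolen = {"sandy": sha512_hex("cat"), "john": sha512_hex("dog")}

# Fast step, repeated per user: a single lookup, no key space search.
for user, hash_value in stolen.items():
    print(user, table.get(hash_value))
```

A hash that is not in the table, for example one from a longer password, simply returns `None`, which is why the table has to cover the whole key space the attacker cares about.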
Okay, because once someone has calculated the table (it may have taken them seven days, it may have taken them one year), it can be reused. And because lookups are very fast, you can very quickly find the corresponding password. How big is this table? All right, let's calculate. There are 6 by 10 to the power of 15 entries. Each entry has a password and a hash value. The hash value is, what did we say, 512 bits. And the password, how big is the password? Eight bytes; it's eight characters in this case, so 64 bits. Have I calculated this before? I think I have the answer, although I calculated it for a different value: it's about 640,000 terabytes. There's our problem. To store those 6 by 10 to the power of 15 possible values, assuming no compression, just storing them in a raw form, we need 640,000 terabytes. We cannot do that. Even though we may be able to find 100 lab computers, I cannot find 640,000 hard drives to store it on. But it turns out that there are efficient data structures for storing such information. You can store the hashes and the passwords in a very efficient manner that effectively compresses them down. The data structures are referred to as rainbow tables. We're not going to cover how they work, but instead of storing that 640,000 terabytes, using some different data structures you can effectively compress it to be much smaller. There are different approaches. I'll show you examples shortly, because I got the numbers from examples. What is it?
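The raw table size discussed above can be rechecked from the stated entry sizes. This sketch assumes one byte per password character, no per-entry overhead, and decimal terabytes (10 to the power of 12 bytes). The 640,000 terabyte figure in the lecture was, as stated, calculated for a different value; with exactly the sizes given here the total comes out somewhat lower, but in the same range of hundreds of thousands of terabytes.

```python
ENTRIES = 94 ** 8          # all eight-character passwords, about 6 by 10^15
HASH_BYTES = 512 // 8      # a 512-bit hash value is 64 bytes
PASSWORD_BYTES = 8         # eight characters, assuming one byte each
TERABYTE = 10 ** 12        # assuming decimal terabytes

raw_bytes = ENTRIES * (HASH_BYTES + PASSWORD_BYTES)
raw_terabytes = raw_bytes / TERABYTE

print(f"{ENTRIES:.2e} entries of {HASH_BYTES + PASSWORD_BYTES} bytes")
print(f"about {raw_terabytes:,.0f} TB uncompressed")
```

Either way the conclusion is unchanged: storing every pair in raw form is far beyond a few hard drives.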
About three terabytes of space with rainbow tables. Instead of requiring 640,000 terabytes, you can actually reduce the size required to store all of this information to around several terabytes. I do have three terabytes of hard drives; that's very cheap. So this is what attackers do: they spend a lot of time calculating the hash values of many passwords, and then use some efficient data structures, rainbow tables being very common, to store that information. Then when you want to find a password given a hash value, you perform a lookup, and lookups are very fast compared to calculating a hash. So if you have the table, a lookup, depending upon the size, can be on the order of hours: less for a smaller table, more for a larger table, and it depends on the speed of the computer and the storage. Using these efficient data structures, it can be manageable. Three terabytes, no problem; 640,000 terabytes, not possible. Let's finish with a final example. People who have calculated these hash values store them, and some people sell them. So this one, Cryptohaze, actually has software for calculating the hashes, and you can purchase the tables that they've already calculated for 500 dollars, using different algorithms. NTLM is an old one used in Windows; MD5 is an old one used in some Unix systems, and now they use SHA-512. So you can buy them: they'll send you hard disks including the hash tables for 500 dollars, and then you don't have to spend the seven days, or the 700 days, to calculate the hash tables.
They're already calculated for you, and you can quickly do a lookup from that hash table. So they've already done the time-consuming part, and they use rainbow tables to store that information in some efficient form. And here's another one. They have different sets of rainbow tables, again for different hash algorithms, NTLM, MD5, SHA-1, and different character sets. We did it for eight characters, but some passwords will be two, three, four, five, six characters, so they have different combinations. They talk about the key space: how many keys, or how many passwords, are in each table. And somewhere they have a price. Why? Yes, I'd say it's legal. It's just calculating hashes of passwords. Using it to access a system may be illegal, but I doubt that calculating the values is illegal. So depending upon the size, it's up to 1,250 US dollars to get that table. Of course, it's legal to calculate the hash values; there's nothing wrong with calculating hash values of passwords. But as with everything we teach, using this information to do malicious things can be illegal. Finding someone's password and then using it to access their bank account, their Hotmail, their Moodle login would be illegal. NTLM is a hash algorithm used by Windows, or at least old Windows; Windows XP, I think, is the latest version. Now, different algorithms: there are MD5 tables and SHA-1 tables. With SHA-512, because it's much longer, this approach is not so useful, because we've got more to store, and so it's not so common. So how do we stop someone from doing this? Given that someone can do this and store it in a reasonable space, and therefore make it easy to find the password, we'll see that we introduce a new value, a salt. When you hash the password, you don't just hash the password; you hash the password plus the salt. The salt effectively increases the length and increases the size of the table. We'll see, for example, and we'll do it next week, that a 12-bit salt increases this from three terabytes by a factor of four thousand.
So 12,000 terabytes, which is again not manageable. So next week we'll go through how we use a salt to avoid such attacks.
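As a small preview of the idea, and only as a sketch under assumptions, hashing "the password plus the salt" can be illustrated as below. Concatenating the salt in front of the password and applying a single SHA-512 is a simplification chosen for this example; the exact way a real system combines the two is not covered here. The fixed salt values are hypothetical and used only to keep the example repeatable.

```python
import hashlib

def salted_sha512_hex(password: str, salt: bytes) -> str:
    """Hash the salt together with the password (simple concatenation)."""
    return hashlib.sha512(salt + password.encode("utf-8")).hexdigest()

# Same password, two different (hypothetical) salts: the stored hashes differ,
# so one precomputed table of unsalted hashes no longer matches either of them.
hash_a = salted_sha512_hex("secret", b"\x01\x02")
hash_b = salted_sha512_hex("secret", b"\x03\x04")
print(hash_a == hash_b)  # False

# A 12-bit salt has 2^12 possible values, so a table covering every salt
# needs that many times the storage of the three-terabyte unsalted table.
SALT_BITS = 12
factor = 2 ** SALT_BITS
print(factor, "salt values,", 3 * factor, "terabytes")
```

The factor of 2 to the power of 12 is 4,096, which matches the "factor of four thousand" and roughly 12,000 terabytes quoted above.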