So, we covered cryptography over several weeks in the previous topic, which gave us the underlying techniques we'll use to implement security mechanisms, but now we're going to start looking at more practical security mechanisms. And the first one is very important. With a computer system, when human users want to access that system and we want to provide security for it, we often need to determine who the person accessing the system is and whether they are allowed to access it. This is generally referred to as authenticating the user. So, when a user wants to log into a website or into a computer system, that computer system needs to check: is this person allowed to access the system? That's user authentication. And you are authenticated on a daily basis in many different computer systems, usually using passwords. So we'll focus mainly on passwords, and we'll mention some other ways to authenticate users as well. So, the idea is that there's some computer system and some user wants to use it; how does that system determine that they are a valid user? A definition of what we mean by user authentication is: the process of verifying a claim that some entity or resource has a certain attribute value. The entity may be a human, but it may be another piece of software, or even a file. That's a very formal definition. So, if we're authenticating a human, our computer system needs to verify that the human has some attribute value, that is, has some name and maybe a password, to confirm that they are the right person accessing the system. But it can be more general than verifying humans: we may have to verify whether this piece of software can access this file, or whether this piece of software can log in automatically to this system. So, it's more general than just people.
But we will focus our examples mainly on human users, so the idea is that we want to be able to check that a human is who they say they are when they're accessing a computer system. And there are two basic steps, identification and verification. First, the user needs to identify themselves to the computer system, usually by presenting some ID. When you sit an exam, we need to know that it's you taking the exam, and not your friend taking the exam for you to get you a better score. So, how do we ID you? Well, you have your student ID, but where did you get that student ID? You had to go and register at some stage, and maybe to get the student ID you needed to provide some other form of identification, maybe a birth certificate and so on. So this step of getting a user ID or a student ID is the identification step, where the user gets some identifier, and that identifier is usually unique, right? You have a student ID, and it should be unique amongst all the students. You have a login or an account name for the ICT server that should be unique amongst all the users. You have a username for your hotmail account that is unique amongst all the users of that hotmail system. So that's your user ID, obtained in the first step, the identification step. Once we've done that, then later you want to access the system. You want to log in to your hotmail account. You want to log into the ICT server. Then there's the verification step. In that case, to log in or to access the system, you present some authentication information that acts as evidence that you are who you say you are. So you present your username to the hotmail system, but how does the hotmail computer system know that it's actually you? Maybe someone else has your username. Well, they require some other information, some authentication information, typically a password.
But there are other types of authentication information: a PIN, a personal identification number, or biometric information, your eyes, irises, fingerprints and so on. And we'll talk about some examples of other authentication information. And that information should be secret, such that no one else can find the value. Because if someone else knew the value or could obtain it, then if they have your user ID and they also know this password, or can obtain or generate some unique information about you, they could log in as you. So the general idea, and you know it because you use it on a regular basis, is that you have some username or user ID, which is unique but normally not secret, since everyone can know your user ID, and some authentication information such as a password, which should be secret and is normally chosen by yourself. So to access the system you present both pieces of information, the user ID and the password. The system confirms that the authentication information that you present, such as the password, matches what you chose in the identification step. So we'll see these general steps as we apply them to passwords. With a fingerprint, we could combine the user ID with the verification. But in many cases you still have some user ID. So you still may have your name, but you use the fingerprint, actually not the fingerprint itself but characteristics of the image of your fingerprint, as the unique authentication information. So yes, in some cases the ID and the authentication information may be combined. We'll see some other examples of biometrics and how they're used in the slides towards the end of this topic. User authentication is very important because most computer systems rely on it as a first line of defense.
That is, you have a computer system, and if someone wants to access and use that computer system you first need to check: are they allowed to access this system? And we use user authentication to do that. And once they have been authenticated and are allowed to access the system, many other security mechanisms depend upon the fact that they've been authenticated already. So once you can log into the ICT server you can do many things, and the control of what files you can access depends upon your user ID. If someone else can log in as you then the security of that system may fail. So it's an important part of securing any computer system. In general, we can authenticate someone based on something the individual, the user, knows, possesses, is, or does. Those are the general four ways. So something the user knows is used as authentication information: a secret password, a PIN, so some secret values, or some answers to questions. Only the user should know those values; they store them in their head, in their memory. And if the user, when they want to access the system, can provide that value, the password, then the system will assume that that's the correct user. So what normally happens, we'll see with passwords, is that when you first access the system you register a username and password. The system stores that information. When you want to log in you must provide your username and password, and the system will check the one that you provided against the one that you registered initially. If the passwords match then it assumes you're the correct user. If they don't match it assumes you're the wrong user. So this is based upon what the user knows, some knowledge that they have. Other systems may use what you actually physically possess: a key, or a key card like a swipe card, or some token, a USB token for example.
You can get USB sticks which provide some means to uniquely identify you as the user. Whoever has that USB stick is identified as you. Smart cards, physical keys, the key to my office door: by having that in my possession I can get access to my office. So by having a particular token you can get access to the computer system. Then there is what you are, so biometrics, something about you as the human. And we differentiate between static biometrics and dynamic biometrics, things that are fixed and things that may change. So your fingerprint is usually considered fixed; it stays the same for your lifetime. Your iris or retina in your eye and your facial characteristics are things that are generally unique to you, the user. If we can capture that information we can say that it is unique to a particular user. So then, to access a system, first you register your fingerprint when you identify yourself, and then when you want to access the system you supply your fingerprint and the system compares your supplied fingerprint with the registered one. If they match, the system assumes it's you and you can get access. If they don't match then it assumes it's not the right person. So that's using information about you, the person, which is static, based on who you are. And then some biometrics vary, so we refer to them as dynamic biometrics. So that's what you do: for example, the way that you speak, your voice pattern, your handwriting pattern, your typing rhythm, the speed and the pressure at which you hit keys, will vary amongst different users. So if we can recognise patterns from there and show that they're unique amongst a set of users, we can use that to identify users. Because if everyone in this room has a different voice pattern, what we do again is that we all register our voice: we say some words into a microphone, the computer records that, and then when we want to access the system we say those words again and the computer compares the ones you just provided with the registered audio.
If they match then you get access; if not then you don't. We're going to focus mainly on passwords, which can be extended to PINs and other information that we know, and then we'll have a few brief slides about the other approaches, because passwords are the main one that we use. All of them are used in practice but passwords are the most widely used approach. It's not necessarily the best, but as a trade-off between convenience, cost of implementation and security it's become widely used. So we're dealing with authenticating humans, and this is a quote from one of the textbooks, or one of the auxiliary books, and it's not just about authentication but about security and humans generally. Humans are large, expensive to maintain, difficult to manage and they pollute the environment. In terms of computer systems it's astonishing that these devices, humans, continue to be made and deployed, but unfortunately humans are so widespread that we must design our computer systems and our protocols for security around their limitations. So humans often cause the biggest problem in securing a computer system, because humans cannot remember long sequences of bits and cannot generate random values very well, and that means when we use humans to generate passwords they often choose insecure values. So we'll see that we must design the security of our system around the fact that humans are not perfect; they cannot act like a computer, but of course we must deal with them. Life would be better if we didn't have to authenticate humans. So let's focus on passwords. We'll go through the general approach for using passwords for authentication, we'll talk about storing passwords and a little bit about selecting passwords, but not in much detail, and then we'll finish with a couple of other authentication techniques. So how do we use passwords in the most common way in computer systems? Many multi-user computer systems use a combination of an ID and a password for user authentication.
So multi-user computer systems in the old days were servers: not everyone had their own personal computer, so they used a shared computer, and many users logged into that system to use it. That model is still in use today, and you use it many times with websites. A web server is a multi-user computer system: many users access that web server and they get specialized content based upon the particular user, so we have some username and password to identify the user. So that's a common way of authenticating users, an ID and a password. So how does it work? In the initial step a user performs some registration with the computer system, and in that case there's a username or user ID and a password, and both of them are stored for that particular user. Now the username and password may be selected by the user or may be generated by the system. So when you want to create a new email account with Google you get a chance to choose your username. You cannot choose just any, because some are already used, so you get a chance to choose a unique username, say Steve at gmail.com, and I get a chance to choose a password. So in that initial registration step I choose my username, I choose my password, and the system stores those values. So that's the initial step. Then later I want to access my account, my gmail account. What I do is submit my username and password to the system, and the system compares the submitted values against the stored values which were obtained from the registration. If they match it assumes the user is authentic, they're the right user; if they don't match the user is not authenticated. Any questions about that process? Does that make sense? So there are two steps, and you do them many times with websites, but not just with websites, with many computer systems: you first register a username and password, and then each time you want to access the system you must provide your username and password.
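As a rough sketch of those two steps, assuming a hypothetical in-memory dictionary `registered_users` in place of a real database: `register` stores the chosen values and `login` compares the submitted values against the stored ones. The names are made up for illustration, and the password is stored exactly as typed only to show the flow, not as a recommended way to store it.

```python
import hmac

# Hypothetical in-memory store standing in for the system's user database.
registered_users = {}


def register(username, password):
    """Registration step: store the chosen username and password."""
    if username in registered_users:
        return False  # usernames must be unique, so this one is taken
    # Stored as-is purely to illustrate the flow described in the lecture.
    registered_users[username] = password
    return True


def login(username, password):
    """Login step: compare the submitted values against the stored ones."""
    stored = registered_users.get(username)
    if stored is None:
        return False  # no such registered user
    # compare_digest compares the two byte strings in constant time.
    return hmac.compare_digest(stored.encode(), password.encode())
```

Calling `register("steve", "mysecret")` once and then `login("steve", "mysecret")` would succeed, while a different password or an unregistered username would be rejected.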
So there are two pieces of information here, the username or ID and the password. What about the ID? Where does it come from and what's it used for? It determines whether the user is authorized to gain access to the system. So you must have a registered username. For example, with the Moodle login, you log in to Moodle by providing your username, the letter u and your student ID. Some student comes along who hasn't registered yet, and they try to log in with u followed by their ID. The system can automatically detect that this user is not allowed to access the system, because they haven't yet got a registered username. They don't have authority to access the system. Sometimes the identity is used to determine what privileges a particular user has. So based upon the username, particular permissions can be given to that user. When you log in to Moodle, you're logged in as a student user. So you can access the courses you registered in, you can take quizzes and so on. When I log in to Moodle using my username, my username is associated with a teacher, which means I can view the quiz answers and give you marks and so on. So the username there is used to determine what privileges that user has. So this is the ID or username. The next topic after this is about access control. There we'll also see that the username is often used to determine what permissions you have to access particular resources. Are you allowed to access particular files on the computer system? So we'll see that access control is usually based upon the username or the identity to determine who can access a file. The other piece of information we have is the password. And there are many questions or issues there, and we'll discuss some of them, like what is a good password, how do you choose a password, and how to store the password.
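To make the link between the ID and privileges concrete, here is a small sketch. The usernames, role names and action names are all made up for illustration, and this is not how Moodle itself is implemented; it only shows a lookup from username to role to permitted actions.

```python
# Hypothetical mapping from registered usernames to roles.
ROLES = {
    "steve": "teacher",
    "u5012345": "student",
}

# Hypothetical actions each role is permitted to perform.
PERMISSIONS = {
    "student": {"view_course", "take_quiz"},
    "teacher": {"view_course", "view_quiz_answers", "give_marks"},
}


def is_allowed(username, action):
    """Return True if the registered user's role permits the action."""
    role = ROLES.get(username)
    if role is None:
        return False  # unregistered IDs have no authority at all
    return action in PERMISSIONS[role]
```

With these tables a student ID can take a quiz but cannot view quiz answers, a teacher ID can, and an ID that was never registered is refused everything.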
So when you register, the system stores the username and password, and we'll spend some time talking about the alternatives for storing the password on the system. We'll say a little bit about how to choose a good password, but I'll leave that a little bit for you to study. With a computer system, especially say a web-based system, the computer system is in one location and the user that wants to access it is in a different location. So the Facebook web server is in the US, but when you want to log in to Facebook your computer is of course here in Thailand. So to log in you submit your username and password. You send them across the internet from your browser to the Facebook server, and it checks whether the username and password are correct compared to the stored values. So the question that arises is: how do you submit the password? If you're sending the username and password across the internet from your computer to the server and it's not encrypted in some way, then someone can intercept and capture your username and password and therefore access the system as you, because the password should be secret. So how we transfer the password securely from the user's computer to the computer system is an issue. There are also things like how to respond if there's an error; that is, if you don't provide the correct password there are different ways to respond. So there are many issues about passwords and password storage and usage, and we'll cover some of them. What's a good password, anyone? What's your password? First, don't tell me your password; your password should be secret. We'll cover some principles of using passwords across this topic and other topics, but has anyone got an idea of a good password? Use your name? No? That would be a bad one. Use a random password? Okay, so I'll get my computer and, alright, let's do it quickly. You used OpenSSL to generate random numbers in your homework.
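A random value of the kind used in this demonstration can be produced along the following lines. This sketch uses Python's standard `secrets` module rather than the OpenSSL tool from the homework, and the function name and the choice of 16 random bytes are assumptions made for illustration.

```python
import secrets


def random_hex_password(num_bytes=16):
    """Return a random password written as hexadecimal characters.

    Each random byte becomes two hex characters, so the default of
    16 bytes gives a 32-character string.
    """
    return secrets.token_hex(num_bytes)


password = random_hex_password()
print(password, len(password))
```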
Here's a password. Alright, here's my password, a random password. I just remember that, and whenever I want to log in to Moodle I'll just type in this password. Okay, it's good in that, as we'll see, it's hard for someone to guess. What's bad about it? I cannot remember it. Okay, so when I want to log in to Moodle I cannot remember this, so what I do is I get a post-it note and I stick it on my laptop or on my office computer so I remember it. What's another problem with it? Let's say I've got a good memory and I can remember it, and I log in to Moodle every day to check your progress. What's another possible problem? It wastes time; it's long to type. Okay, it's what, 32 characters? Typing it in takes a bit of time. When I want to log into a website or a computer system I don't want to spend even five seconds typing my password all the time; I would like to be quick to log in. It's also easy to make mistakes: the longer it is, the more chance that I hit the wrong key, press enter, and then have to wait to log in again. So it can be inconvenient as well as hard to remember. Okay, so there's this trade-off, we'll see, between hard to guess, so secure, and convenience and performance. We'll come back to how to choose a good password a little bit later; I think some of you will have ideas for doing it. Let's look at what the problem with passwords is: what can a malicious user do to try and defeat a password-based user authentication system? Here are some vulnerabilities, and this is a selection over these two or three slides. First, an offline dictionary attack. We have already said that the system stores the username and password when you register. If an attacker can somehow get access to that stored data, stored in a database or in a file, then either they can immediately find your password or, and this will make more sense as we go through this topic, we could sort of encrypt, it's not exactly encrypt, but somehow store the
password so that it's hard for the attacker to actually see the value. Even then, the attacker can try and guess by getting this list of encrypted passwords and trying all possible values from some dictionary to try and guess some passwords. We'll spend some time explaining how this attack works and what we can do to try to prevent it, so we'll come back to offline dictionary attacks; for now, just note that it's possible for the attacker. The other is a bit more obvious, and I think you'll understand it: a specific account attack. Some attacker tries to guess passwords for some specific account. Okay, you know my username for the Moodle website, so what you do is try many possible passwords: you try and guess my password by entering my username and your password guess, and if the system says no then you try again, and again and again. So for a specific account, you try many different passwords. What's a countermeasure to stop that, or to slow down an attacker? Because instead of manually having to type it in, an attacker could set up some software that automatically submits a username and password to the website, so it's happening a thousand times per second, each time with a different password attempt. Then the time it takes for the attacker to get my password and get access just depends upon how many attempts they can make per second and the chance of finding my password. If my password was this random number, guessing it would most likely take the attacker too long, but if my password is a six-character word, the time for an attacker to guess that is not so long. So one way to stop this is to have the system lock the account after too many failed attempts. That is, you try and log in to my Moodle account with the wrong password and it says wrong password; you try again; you try the third time and it says three times is too many, you can no longer try. The system locks that account and no one can log in with Steve's username.
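The lockout idea can be sketched as follows, assuming a threshold of three failed attempts as in the example above. `LoginGuard`, `MAX_FAILED` and the returned strings are hypothetical names chosen for this illustration, and the passwords are held in a plain dictionary only to keep the sketch short.

```python
MAX_FAILED = 3  # assumed threshold, taken from the three-attempt example


class LoginGuard:
    """Counts failed logins per username and locks the account at the limit."""

    def __init__(self, passwords):
        self.passwords = dict(passwords)  # username -> registered password
        self.failed = {}                  # username -> consecutive failures
        self.locked = set()               # usernames that are locked out

    def login(self, username, password):
        if username in self.locked:
            return "locked"
        if self.passwords.get(username) == password:
            self.failed[username] = 0     # a success resets the counter
            return "ok"
        self.failed[username] = self.failed.get(username, 0) + 1
        if self.failed[username] >= MAX_FAILED:
            self.locked.add(username)     # no further attempts accepted
        return "wrong password"
```

After three wrong guesses against one username, every later attempt on that username is refused, including an attempt with the correct password.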
In that way, the attacker cannot try many passwords to try and guess. Okay, so locking the account is a countermeasure here. No, not really; the issue is this. You want to get access to my account on Moodle. Let's say you know my username is Steve; it's not hard to determine my username. What you do as the attacker is go to the website, type in my username and guess a password, a random password. Let's say you have a large list of possible passwords, just random characters; you submit one and the system says no, incorrect, so you submit another, and you just keep trying. Eventually you'll get it if you try all possible values. So that's the attack. This countermeasure, and there may be others, is to say that if you submit three, five, ten times, after some number of submissions that is too many, the system will say you can no longer try; you can no longer log in with this username. Okay, that's right, yeah. So this countermeasure of locking the account after too many attempts has a drawback. Look, not for Moodle, but let's say the same approach is applied to your hotmail account: you get three attempts at your password, and once you've had three wrong attempts on your hotmail account you can no longer log in, and you lose access to your email until you make some other contact with Microsoft to get access. So what the attacker can do then is just submit three random passwords to your hotmail account, and that automatically locks your hotmail account and you can no longer use it. So you, the correct user, are denied access to that service. So an attacker can take advantage of this locking of accounts after too many failed attempts by performing a denial of service attack, that is, causing the account to be deliberately locked so that the normal user can no longer use it until they go through some other, maybe offline, mechanism. So yes, this countermeasure effectively prevents too many
attempts, but has the drawback that it can deny service to normal users. Okay, another attack, and some of them are related: try popular passwords. You know the usernames of all the students for Moodle, you know their IDs; you can find your friend's ID, and you want to log in as your friend and change their quiz answers. Then, instead of trying many possible passwords, try some popular passwords. Okay, 'password' is a popular password, 'iloveyou' is a popular password, '123', '1234', whatever combination, however many characters are needed. Many people use these popular but insecure passwords. So if, for example, you have a large set of hotmail email accounts, a thousand different hotmail addresses, then what you could do is try some of these popular passwords for each one, and you probably don't have to try many until you get access to at least one account. And then you have access to someone's account. So how do we stop such attacks? Control the password selection. When the user initially registers and selects the password, make the system control it so that they cannot select a popular password. There are different ways of doing that, such as having a set of words or passwords that they cannot use. So when you register for your hotmail account, if you choose the password '1234' the system may say no, you can't use that password, choose another one. And you see that many web-based systems now give an indicator of strength when you choose a password, and they may have limits: you must have a password which is at least six characters, or your password must contain at least one uppercase character or one special or punctuation character. So that is a way to control password selection so that it's harder for an attacker to guess your password quickly. If there are computers that make multiple attempts, not just on the one account but even across multiple accounts, that is, with multiple different user IDs but coming from one
computer, the system can try and block that computer. So let's say I use my computer and I have some software that automatically submits usernames and guessed passwords to the hotmail server, and I'm doing it for a thousand different hotmail addresses. Then the hotmail server can check and see that there's this one computer submitting many different passwords for many different user accounts; maybe that's an attack, so block that computer. What about password selection? Has anyone seen that before? When you register, you go to a website and you must choose a password. Has anyone seen any strategies for how the server guides your password choice? Uppercase and lowercase, sometimes. Anyone, anything else? Use some numbers. So you must use some numbers: if you choose just letters, the system will say no, you must have at least one number in your password. Okay, that makes it harder for someone to guess. What's the problem? Let's say I make a policy which is that you must have at least 10 characters, it must have two punctuation characters, so non-letters and non-numbers, and it must have at least two uppercase characters. Then you must choose a password that meets that requirement. Again we get the problem that it's difficult to remember. The user can no longer choose something which is easy for them to remember, and that leads to other problems like writing passwords down, and therefore leakage of passwords through other means. So there are always trade-offs with these countermeasures and they have drawbacks: it adds security but it may make things more inconvenient. Okay, the next attack is related or similar to the one before: instead of trying many different passwords, try and guess the password a little bit more intelligently. You want to get my Moodle password or account and you know my username, so you try and guess my password. Well, maybe try something based on my name, my birthdate or other things about me that you know, because maybe I've chosen a password based upon my birthday. Okay, so
if you know the particular user and you know information about them, then it may be easier to guess their password. So that's guessing against a particular single user. Again, have mechanisms to control which passwords the user can select, and inform or train users about selecting passwords. Make people aware: don't use your birthdate in your password, don't use your middle name as your password, or your name reversed. Okay, so awareness and training of people can help with that in the long term. Next: I log into my laptop, I go to the bathroom, and a student comes in and they have access to my account; they change my password, so they've got access there. Okay, they've hijacked my computer. So what do we do? Have some form of auto logout. It's the same on shared computers, in the library or in the lab: you log in, you leave the browser open, and the next student comes along and they have access to your account. So have some automatic logout mechanism: if you don't do anything for two minutes the system automatically logs you out. It adds security, but it makes things more inconvenient, though: if it automatically logs me out after some delay, I have to come back and log in again. Okay, next: if users make mistakes, try and exploit them as an attacker. If they write their passwords down and stick them on a post-it note, then just go and read the post-it note. If they share with friends, use some social engineering techniques to trick them into revealing passwords. You call them up on the phone and you say, I'm with the SIT computer center, there's a problem with your account, please provide your username and password and I'll fix your account for you. You're pretending to be someone else; they think you're a trustworthy person, they give you the password, and now you have it. And you see many email spam messages that try and trick people into providing passwords. So train users not to reveal their password, and use passwords together with some other form of authentication. With bank systems sometimes you'll have a one-time
password or an SMS message that requires you to have not just a password but some short-term value that you must enter. So combine the normal password with some other form of authentication. Next, many people reuse passwords across different systems. You have tens, maybe even hundreds, of accounts on different computer systems and web systems, and remembering a different password for each one is hard, so often people reuse the password. So if an attacker can break into one system and find your password for that, then they've automatically got your password for other systems. I am the admin for the Moodle server, so in theory I can access your passwords for the Moodle server. So if I was malicious, and if you use the same password on the Moodle login as you use for your email, for your bank account and for other things, I could quickly get access to your other accounts. So reusing passwords across different systems is a problem. How does the system prevent that? It's very hard, because the systems cannot communicate with each other, but again, inform the user and try to get them not to reuse passwords. If the systems are within the same organization then they can check and make sure you don't use similar passwords. Next, monitoring as the password is sent across the network. Again, you log into a website, and to log in you submit your username and password and the system must verify them, but that login involves sending the username and password across the internet. Someone intercepts or monitors your network communications and they've discovered your password. So encrypt the communications that send passwords, as a countermeasure there. So there are many vulnerabilities and many problems, but still passwords are used on a regular basis. Let's spend, say, the rest of this lecture looking at how the system stores the password. When you register, you select, let's say, a username and password, and that's stored on the system, so that when you later access the system you supply your username and password and it is compared to the
stored value. How do we store it? So the information that we need to store on the system is at least the username or ID and the password, or maybe some information based on the password; we'll see what that means shortly. Maybe we'll store some other information. For example, you've got a great idea for a website, you create a website which many different users access, and you provide a login system. So you need to design and implement the website, and you use some MySQL database to store the list of users and some information about those users. So each user's username, maybe their full name, other identifying information, their preferences and their password are stored in the database. And then when that user logs in, you write some PHP code that takes the username and password that they sent when they logged in and compares them against the database; if it matches you give them access, and if not, no access. So that database stores the username and password, but storing the password in clear text is very bad. So we'll look at how we can store a password. Here's one approach, the basic approach, and it's what I just described. The first line at the top is the information we would store: think of a database with two columns, one for the ID and one for the password, ID and P for password. So we have a database where each user has an ID stored and the actual password stored. That's the simplest approach, but there are several attacks that make this approach not very good. First, if someone can read that database, maybe the person developing the database, then they can quickly see other people's passwords. Okay, so you have in your database a list of usernames or IDs and their corresponding passwords, and anyone who can read this database can see everyone else's password. So there's a first problem: we don't want to allow just anyone to read this database. So we should have some form of what we call access control on the password database, so that not anyone can read the database. An example again
is with the ICT server. You all have accounts on the ICT server: one computer with many users. Shortly I'll show you that there's a file that stores your username, the letter u followed by your student ID. If we stored your password in that file, what that would mean is that it's just a file on the ICT server. When you log into the ICT server, if there's no protection of that file, you could just open the file and see everyone else's password. That would be bad. So we need to make sure this file or database that stores the passwords is protected in some way, in that we control who can access that particular file or database. For example, only the admin user can read the database; normal users cannot.
Okay, let's say we have that: only the admin user or the developer of the database can read it, and no one else can read it. Still, that admin user can read the database, and if you don't trust that admin user, then they can access all the passwords. How do we stop that? Effectively, there's no way to stop that. You must trust the admin user if you're going to supply the username and password. Even with the other approaches we'll see, if you submit a username and password to a computer system, the admin user is the person who runs that computer system, so you must trust that admin user, because there's always a way for them to find your password if you submit it to the computer system. So there's really no way to stop an admin user from accessing passwords. We can make it a little bit harder for them, and we'll see that the next approaches do, but in the end you must trust the person who runs that computer system. So when you choose a password for your hotmail account, you place some trust in Microsoft, or the admin users at Microsoft who run hotmail, because in theory they can see and find your password. So we can't stop insiders; we must trust admin users. We can stop other users by using some access control. The main
thing, though: what if an outsider tries to get access to this database? For example, you create your website, and you have a database of usernames and passwords stored on your web server, but some malicious user, using some other means, gets access to your web server and can download that database. There's some other flaw in the security of your system, so that the malicious user can access and read your database. In that case that malicious user has now learned the passwords of all of your users, which is bad. So to stop such an attack, do not store the passwords in the clear. That is, do not store them as is; store some manipulation of the password. We'll go through the next approach.
Some of the examples that we're going to go through and use you have in a printout in your lecture notes, about passwords, hashes and rainbow tables. It's also on the website. It gives some examples and gives the details about some of the concepts we're about to talk about. So you have a copy, and you can see it here as well. I have a few examples from that.
The example I've taken from that is a very simple case. You create a website and you're going to allow many users to register, so a user will select their username and select their password. If we store the username and password in the clear, then you would store this information in, say, the database for your website: a list of usernames and, for each username, their actual password. Okay, so John has chosen mysecret as the password, Sandy something else, some random characters, and I would have a long list of users and their passwords. Now, one problem with this is that an insider, if they can read that database, can immediately see everyone else's password. But in practice the main problem is an outsider, someone who's not normally allowed to access the database but who somehow, through other means, gets access. If they can get access to
this database, they again immediately see everyone's password, and that's bad. They can immediately log in as everyone else. And if these users reuse their passwords across many different systems, say John uses his password not just on your website but on other websites and other systems, then the attacker can now try that password on other systems. So we want to stop this attack: if the database can be discovered by the attacker, we want to make it hard for them to find the corresponding passwords. How do we do that?
First approach: encrypt the password. Do not store the password in the database; take the password, encrypt it, say using AES, some symmetric cipher, with some key, and store the ciphertext of the password in the database. Then when the user logs in, they submit their password, the system takes the supplied password and encrypts it with the key, and if the ciphertext matches the value in the database, then they are authenticated. Now if the attacker can get access to the database, which has the list of IDs and the encrypted passwords, they cannot immediately see the passwords of the users. So this prevents such an attack, but it has a drawback, and a very significant one. The key is needed by the system, because when someone submits their password, the system must encrypt it with the same key that was used to encrypt it when they registered. So the system must store the key somewhere. Where? Well, if the system stores the key in a file so that everything's automatic, then if the attacker can get access to that file, they immediately get the key and can decrypt all the passwords easily. So in fact we still have this problem: if the attacker gets access to the database, and we store the key on the system, and the attacker gets access to that key, then they can quickly get access to the passwords. And it's reasonable to assume that if the attacker can read the database, then it's likely that they'll also be able to read the file that stores the key. Well, encrypt the
key? But with what key do you encrypt it? Again, that needs to be stored somewhere. So it doesn't provide much protection, especially in practical cases where you need to store the key somewhere, even in memory.
A more practical approach, and we'll see this one and an extension of it, is to store a hash of the password. We store the ID for the user, and we use a hash function: take the password, calculate the hash of the password, and store just the hash value, not the password. Remember from the previous topic hash functions and their practical properties. Usually they take any size input, so the password can be any length, and they produce a fixed, usually small, output; MD5 produces a 128-bit output, for example. In practice, with a good hash algorithm, there'll be no collisions: you hash two different passwords, you'll get two different hash values; if you hash two passwords which are the same, you'll get the same hash value. And it's a one-way function. That is, if you have the password and you calculate the hash, you get the hash value; that's easy. But if you know just the hash value, it's practically impossible to work backwards to get the original password. Those are the properties that we assume for the hash function.
So we store a hash of the password. The result: if the attacker can read the database, they see a list of hash values. They want to find the passwords. Well, the property of the one-way function means that even if they know the hash values, it's practically impossible to work backwards to get the original input value, the password. It makes it hard for the attacker to take the hash value and get the original password. So this was the case where we had the password stored. Now let's consider the case where we take a hash of the passwords. What's the hash of mysecret? Let me explain what I'm doing: we're
going to use MD5 as an example hash function, just for this demonstration. I have a password, mysecret. Instead of storing that in the database, I calculate the MD5 hash of that password. This just takes the word (the -n means don't add a newline character at the end, just a characteristic of the command line) and sends that word into md5sum, which calculates the MD5 hash: 06c2 through to 49 in hexadecimal. It's a 128-bit value. So that's the hash value. Instead of storing mysecret, we store this hash value in our database. So we store these values instead of the passwords: the hash of John's password is stored, the hash of Sandy's password. I've calculated them, and this would be what's stored in the database, not the top values.
Now how do we use this to log in? This is the data stored in the database upon registration. User John wants to log in, so what he does, say from his web browser, is submit his username and password to the system. He sends across some network a message containing his username, John, and his password, which was mysecret. Those two values are submitted to the system, and the system stores this database. So now the system needs to perform the authentication; it needs to verify: is this John? The way it verifies that is to check the password. The system calculates the hash of the supplied password, the hash of mysecret, and in the database it looks up the username John. It compares the stored hash value with the hash of mysecret. The property of the hash function is that if this stored hash value was created by hashing mysecret, then when we hash the same word again, we'll get the same hash value. So this hash value will match the stored hash value, John is authenticated, and John accesses the system. So that's the normal way that login works using the hash values. We don't need to store the password; we store the
hash of the password.
Now what if the attacker can somehow get access to this database? What the attacker wants to do is discover the passwords of these users. They know the hash value; they need to work backwards to get the original password. But we said the property of our hash function is that it's a one-way function: knowing the hash value, knowing 06c2 through to 49, it's practically impossible to find the original input. That's the property that we desire of our hash functions. So even if the attacker knows these hash values, they cannot get the original passwords. That protects our system if somehow the attacker finds this database.
Any questions so far? We're going through different approaches for storing passwords. Storing the password in the clear is bad, because if the attacker gets the password database, they see all passwords. Encrypting the password with symmetric key cryptography hides the passwords, but we still need to store the key somewhere, so if the attacker finds the key, they can still get the passwords. So we store a hash of the password. If the attacker finds the database and we have a good hash function, then they can't work backwards to get the original password. But we can still authenticate the user, because they submit their password, the system calculates a hash of the password, and it should match the stored value. If we submit the wrong password, we'll get a different hash value. So if the hash value stored is 06c2 and user John submits the password mysecret1, that is, the wrong password, then when we apply the hash to that wrong password we'll get a different hash value, and the system will be able to verify that this is not the right password.
Okay, there are still some problems: there are still some attacks which the attacker may be able to do. So let's look at what effort they need to perform an attack. We'll not go into the details, and some of them are given in this handout, but we'll start: I'll
give you some of the numbers. What the attacker can do is try a brute force attack on the hash: given the hash value, try to work out the original password. The amount of effort that takes depends upon the length of the hash value. Generally, a brute force attack on a hash takes 2 to the power of n operations, where n is the length of the hash, an n-bit hash. For example, MD5 is a 128-bit hash, so a brute force attack which takes a hash value and tries to work backwards to get the password would take 2 to the power of 128 operations. How long that takes depends on how fast your computer can try different values. I did some calculations before: if we could work at a speed of one billion attempts per second, it would take roughly 10 to the power of 22 years. It'll take forever. Okay, 2 to the power of 128 is that many operations; even a thousand times faster than this will still take millions and millions of years. So the normal brute force attack on the hash function is not going to be successful.
But what the attacker can do is use the knowledge that the input to the hash function is usually a short password. Instead of doing a brute force on the hash, they just try to hash many passwords. Let's say our password length is eight characters, as with the passwords I used in the example tables, and the characters that we could choose from when we chose a password were, let's say, lowercase letters, uppercase letters, numbers 0 to 9, and other punctuation characters on the keyboard. Normally with a password you use a keyboard to choose it, so you're limited to the set of characters that keyboard offers. That is what, 26 letters, another 26 in uppercase if we use shift, and then 10 digits, so we've got 62. And if you count a lot of the
punctuation characters, it depends on which ones you count, but let's say on top of that 62 there's another 32 characters. That's typical on a keyboard: we have about 94 typeable characters, uppercase, lowercase, numbers, comma, slash, greater than, less than and many other characters. About 94 possible characters in total. Have a look on your keyboard and see how many characters you can type in English.
So how many possible passwords are there, if we must have eight characters and for each character we can choose one from 94 possible values? The first character can be any of these 94, the second character can be any of these 94, and so on. We have 94 choices for the first character, 94 for the second, and up to 94 for the eighth, which becomes 94 to the power of 8 possible passwords. So if we allow any possible combination of those 94 characters, eight in a row, there are 94 to the power of 8 possible passwords.
What an attacker can do now is try all those passwords, say password P1, P2, down to P 94 to the power of 8, and for each password calculate the hash. So this is the attacker now. There are 94 to the power of 8 possible passwords that the users may have chosen, so the attacker, think of it as a brute force attempt, tries all possible passwords. Take the first password, let's say eight lowercase a's in a row. Calculate the hash of that and compare this hash value to the value in the table. Say we're trying to get John's password: we take the first possible password, calculate the hash, and compare it to 06c2. If it matches, then we've found John's password. If it doesn't match, try the next password. So we try a password, calculate its hash, and compare the hash value to the stored value. If they match, it means the password that John chose must have been P1. If it doesn't match, try P2, and keep going until
you find a match. The worst case from the attacker's perspective is that we'll have to try all passwords: calculate the hash of all 94 to the power of 8 passwords, and we're guaranteed to find the password of John, and even of the other users in that case. How long does this take? Well, we need to calculate the hash of 94 to the power of 8 passwords. The time depends upon how fast you can calculate hash values, and that again depends upon the hash algorithm and the hardware you have. I'll provide some links later, but a number I've looked up is in the order of 10 to the power of 10 per second: particular reasonably low-cost hardware can calculate 10 to the power of 9, maybe approaching 10 to the power of 10, hashes per second. So how long does it take to try 94 to the power of 8 passwords? It's 94 to the power of 8 divided by 10 to the power of 10 seconds. I don't have the exact answer here in seconds, but it's about seven days. Okay, if you're continuously calculating, trying all these passwords, it would take you about seven days. If you're really determined to find the password, that's not too long. Or maybe you have more money: instead of using just one computer, you do it in parallel across seven computers and it takes you one day, or you do it across 150 or so computers and it takes you about one hour. So it's possible in that case, by trying the passwords, as long as we have access to the computing resources.
But if instead of eight characters we allow nine characters, it's now 94 to the power of 9, so times by 94. That is, we have 94 times the number of passwords, and it would take 94 times longer: 94 times 7 days, so about two years, just by increasing the password length by one character. So we can start to defeat such attacks by requiring the password to be long. What we'll look at next week is the fact that, okay, even with long passwords, the idea an attacker would take is that you take all
the possible passwords, or a large selection of them, and calculate the hashes. Then the next time you want to do an attack, you don't have to recalculate the hashes. Instead you store these hash values in some large table or database and simply do a lookup, and we'll see that performing a lookup is much faster than calculating the hash. That gets us to the concept of rainbow tables, which you may have heard of, and they can reduce the time for such an attack to the order of minutes or hours, even with large numbers of passwords. So we'll look at that in the next lecture: continue with this brute force attack, then rainbow tables, and look at a new way to store passwords. We'll continue that next week.
So what you should do as homework: those that are still working on homework one, submit by 5 p.m. today. But try to read through this document that I've given you about passwords, hashes and rainbow tables. I'm using the numbers from there in the lecture, so if you read through in advance, you'll start to see some of the calculations and see where the numbers come from. We'll continue this next Tuesday.
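To make the steps from this lecture concrete, here is a minimal sketch in Python of storing the ID and hash of the password, checking a login against the stored hash, and trying candidate passwords one by one. This is only an illustration under some assumptions: the lecture itself uses md5sum on the command line and talks about PHP and MySQL, not Python, and the names password_db, login, brute_force and days_to_search are made up for this example. MD5 is used only because it is the demonstration hash in the lecture. The last function repeats the time estimate for trying all passwords, assuming the rate of 10 to the power of 10 hashes per second quoted above.

```python
import hashlib
import itertools
import string


def md5_hex(password: str) -> str:
    """Return the MD5 hash of a password as 32 hexadecimal characters."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Registration: store the ID and H(P), never the password itself.
# A dict stands in for the database table (hypothetical example data).
password_db = {"john": md5_hex("mysecret")}


def login(user_id: str, password: str) -> bool:
    """Verification: hash the supplied password and compare with the stored hash."""
    stored = password_db.get(user_id)
    return stored is not None and stored == md5_hex(password)


def brute_force(target_hash: str, alphabet: str, length: int):
    """Hash every password of the given length until one matches the target."""
    for candidate in itertools.product(alphabet, repeat=length):
        guess = "".join(candidate)
        if md5_hex(guess) == target_hash:
            return guess
    return None  # no password of this length over this alphabet matches


def days_to_search(alphabet_size: int, length: int, hashes_per_second: float) -> float:
    """Worst-case time, in days, to hash every possible password."""
    return alphabet_size ** length / hashes_per_second / 86400


if __name__ == "__main__":
    print(login("john", "mysecret"))    # correct password: hashes match
    print(login("john", "mysecret1"))   # wrong password: different hash
    # A 3-letter lowercase password stands in for the 8-character case
    # so that the search finishes quickly (26**3 candidates only).
    print(brute_force(md5_hex("cab"), string.ascii_lowercase, 3))
    print(round(days_to_search(94, 8, 1e10), 1))        # about 7 days
    print(round(days_to_search(94, 9, 1e10) / 365, 1))  # about 2 years
```

The brute_force function is the P1, P2, and so on loop described above, just over a much smaller set of passwords. Changing the length argument from 3 to 8 and the alphabet to all 94 typeable characters gives the full attack, which is why it needs days of computing rather than a fraction of a second.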