So welcome to lecture 18, where we start going into the security topic in the web area. And we start off with discussing generally what kind of attack surfaces there are in web applications, how you can attack a web application, some very basic guidelines on security, and then we dive into the topic of authentication and authorization. So the whole idea is that you should not be able to access everything in an application that you want to, but only what you have sufficient rights for. In the next lecture, in lecture 19, we then go into vulnerabilities and attacks, because that's of course an important thing: hackers trying to get into an application because it's not secure. So we'll look into that then. So for now, we do this part here. And essentially, there are a couple of learning outcomes related to that. So of course, this is generally connected to the development of web applications. So if, for example, you should be able to spot errors in an application or improve it, then of course this also relates to security, like introducing authorization, for instance. Then there are a couple of learning outcomes specific to web security threats. So that's mainly for the coming lecture, but it's a good start if we talk about authentication here. As reading material, there are some basic explanations on attack surfaces, which I won't go into much, but you can read up on these kinds of things if you want. Then there is a developer guide for authentication methods and a blog post that has some really basic explanations of the different ways to do authentication. So it's an interesting read if you practically want to do that in a project. I recommend you start there. Now, attack surfaces. This is essentially the view on a web application. So imagine we are up here in x1z.com and we're running an application. It could be a backend. It could also be front-end files that we then send to the client that is requesting them. And we have something stored in a database.
And the cloud here stands essentially for the internet. And as you remember from the second lecture, there can be lots of computers in between that route my requests and route the response back. The interesting question is: what can you attack here? The basic answer is everything, but essentially you can start with the network connection. For example, imagine I am sitting at Reykjavik University. I've shown you already in the second lecture that you can listen to network traffic. So that's one way of attacking the actual connection: eavesdropping, or actually doing something with my requests. The same can of course happen on the other side. So someone might get access to the communication between the server and the database server, for instance. Or to the connection between the server that hosts the backend and the front-end stuff and the internet, so someone is, for example, listening to that connection. And then the same goes for all the different connections in the internet. So every kind of internet connection, be it a cable or Wi-Fi or a fiber cable, can theoretically be accessed somehow, and someone could place an attack there. Similar to the networks, of course, you can go after the machines. There could be a virus or something like that on my machine, and someone is able to read everything or is able to actually use my computer to send requests automatically. And again, the same applies to all the machines on the internet. The same applies to all the server-side things, the server, the database server and so on. And especially here in the middle this is somewhat tricky, because you don't have control over which computers your request goes through. So potentially your request could go anywhere, and that can be an issue.
If we're talking private information, sensitive information, then this becomes especially interesting, because you could be routing your requests through a country where the government is somehow interested in accessing things, and they, of course, have much more means to get control of individual computers or the network in between them. So that's also something you have to consider. This is the network and hardware view, essentially: what kind of networks are there, what kind of machines and so on. The other interesting thing is the application view, so we look at what kind of applications are running. So on my machine I have Firefox, sometimes I use Chrome, and whenever I request the front-end I'm getting HTML and JavaScript code. Of course all of these could have potential issues, potential vulnerabilities. There could be something wrong in the Firefox or the Chrome implementation. There is a lot that can go wrong in JavaScript, in the things you can do with JavaScript. Potentially there's also some way to mess with the HTML that somehow has security implications, and we will see much more of that in lecture 19. But theoretically all these different technologies and applications can be attack surfaces, can be means to carry out an attack. The same goes for the server: if we, for example, use Node.js, this could be a problem. It doesn't matter what we use; if we use Java Spring, if we use Django, it could have the same issues. And finally our database as well. One thing I have not listed here is operating systems and other applications, but they of course can be a potential threat as well. So if we here have an outdated version of Linux running, for example, that could be an issue. If we have an outdated database server or an outdated macOS operating system, all of these things could be problematic. So essentially the message here is: whatever you have can be an attack surface. This is the application view, so now we have the hardware, we have the applications.
There is one last one that I want to show, and that's probably the most important one. And that's the users. So we have someone sitting behind my laptop, usually that's me. We have someone working with the servers here, developers, and there might be other staff that is somehow related to them. And we are very susceptible to attacks as well. We can be manipulated. And this is something that's usually called social engineering. That's for example when someone calls you and pretends to be from tech support and tries to fix something, but actually what they're trying to do is get your password or get you to send a request or something like that. There is an old comic on that if you're interested. But essentially the left side is always the technical view, where we assume that if we encrypt things well enough, then everything is fine. The reality is that if we get the people to tell us what we want to know, then that's enough. And usually, as this comic illustrates, getting to the user is much, much easier than breaking all the technology. That being said, we are of course looking mainly at the technology, because people we cannot fix as easily. So we are here looking at how we make our web applications more secure. We are looking only at a few attack surfaces. In particular, we look at the application view. So what can we do in programming to avoid issues? Other courses cover other aspects of security. So obviously if you do a networking course, you talk about network security, and if you do an operating systems course, you might look at security there. So we just look at it from the aspect of web programming. What we have done so far in this course is that, apart from deploying to Heroku, we have always used HTTP communication, which as I've shown you in the beginning is completely unencrypted. So everything we send is plain text.
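To make the point about unencrypted HTTP concrete, here is a minimal sketch in Python that builds the raw bytes of a login request the way they would travel over the network. The host `example.com`, the path `/login`, the credentials and the helper name `build_http_post` are all made up for illustration, and nothing is actually sent.

```python
from urllib.parse import urlencode


def build_http_post(host, path, form):
    """Build the raw bytes of a simple HTTP/1.1 form POST as sent on the wire.

    Hypothetical helper for illustration only; it sends nothing.
    """
    # The form fields are URL-encoded, not encrypted.
    body = urlencode(form).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Made-up host and credentials, purely for demonstration.
raw = build_http_post(
    "example.com", "/login", {"user": "alice", "password": "hunter2"}
)

# Anyone who can read these bytes on the network can read the password.
print(raw.decode("ascii"))
```

Printing the bytes shows the password in readable form inside the request body. That is what an eavesdropper on any of the connections in between would see with plain HTTP.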
And the other thing that we have done, which makes it completely irrelevant whether it's encrypted or not: all the backends we have written, all the APIs, were always open. So you could just access whatever you wanted. Everyone could do that. I could just delete everything without having any special rights. And that's of course not perfect. So this is what we'll try to fix in this lecture and in the next lecture. There are some general security principles we should follow before we get started. And the first one is: do not rely on something that is called security by obscurity, which is essentially trying to make your application secure by just not making it obvious how to use it. So if you, for example, don't tell the URL of your API to anyone, you might assume it's safe. That's true until someone finds it. If you implement your security in a way that no one knows how it works, you might also assume it's safe. Again, that's usually only true for a while. And this is exactly the same principle as, well, if you hide the money under your pillow and no one knows it, then your money is safe. That's true until someone actually tries to find something in your house or in your apartment. And that's exactly the same in applications. Once someone gets interested and starts digging, they will find issues. So that's exactly why you shouldn't do that. Instead, there is a popular law named after Linus Torvalds, the inventor of Linux, that says, well, if you let enough people look at your code, if you have it very open, then you'll get rid of the issues. Basically, try to make it secure by showing it to many people. If it's secure then, it will also be secure if no one knows about it. So this is very important. Don't hide your stuff and think it's safe, because it's not. The next thing is: always use HTTPS if you can. I've shown you this. Requests and responses in HTTP are always clear text. So this is a screenshot I took when I accessed my personal website.
And you can just see that if you look at the POST request, you can actually see my password. No problem. So don't be fooled into thinking that HTTP is in any way safe. If I do the same in HTTPS, this is what you get. That's much harder to deal with. So always use HTTPS. Next, do not assume that that's enough. Just using HTTPS does not make your application safe. There can still be other issues. And there can even be issues in HTTPS itself, which we saw a couple of years ago with the Heartbleed vulnerability. That was a big problem in OpenSSL, and OpenSSL is a library that is very widely used to encrypt HTTPS connections. So that was a big problem. So never assume that this is enough. Always try to make your application as safe as reasonable, I would say. And the final step, and that's what we focus on in this lecture from now on, is: use authentication and authorization. So make sure that your web application is only usable by the people that you want to use it. So for example, most websites have some kind of way of registering, and then only if you have a username and a password can you actually use things. And you can of course assign rights if you do that. So you can say that some people should be able to read, some others should be able to do everything and so on. And the key terms we use there are authentication and authorization, and there's a difference. The difference is that authentication means that you check the identity of somebody. So if you go to your bank and you want to order a credit card or you want to do anything else with your bank account, they will ask you to show your ID. So they check that you are the person you say you are. That's authentication, checking who you are. Authorization is just checking that you are allowed to do whatever you want to do. So that's for example what happens if I want to go into the department of computer science here at the university.
I put my key card to the reader and it just checks whether I have access rights to that part of the building. So it might not care who exactly I am, it just checks that I have the rights. And that's authorization: having the right credentials, having the right permissions. And of course this is often combined. So in the university system, I don't know, but it might be that the system actually checks who I am and then checks what kind of rights I have. So this is roughly what we are talking about, and now in the second part of this recording I'll dive into the different mechanisms we can use for authentication. I'll just do a couple of them. There are of course a lot more than we will do here. We will do HTTP basic authentication. That's the easiest one and the least reliable. Then we look into session IDs, which are essentially a special case of tokens, so we use them to discuss tokens. Then we look at OAuth 2.0, which is probably one of the best options you have at the moment. And finally we look at something that is called signed requests. But for that we'll go into the second part of this video.
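To make the difference between authentication and authorization concrete before the second part, here is a minimal sketch in Python. It is not one of the mechanisms from this lecture. It is an in-memory illustration under made-up assumptions: the names `register`, `authenticate`, `authorize`, `USERS` and `ROLE_PERMISSIONS`, the roles and the example user are all hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical role model: which actions each role may perform.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "admin": {"read", "write", "delete"},
}

# Hypothetical in-memory user store; a real application would use a database.
USERS = {}


def hash_password(password, salt):
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(username, password, role):
    salt = os.urandom(16)
    USERS[username] = {
        "salt": salt,
        "hash": hash_password(password, salt),
        "role": role,
    }


def authenticate(username, password):
    """Authentication: check that the caller is who they claim to be."""
    record = USERS.get(username)
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    # Constant-time comparison of the stored and the candidate hash.
    if hmac.compare_digest(candidate, record["hash"]):
        return username
    return None


def authorize(username, action):
    """Authorization: check that a known user may perform the action."""
    record = USERS.get(username)
    return record is not None and action in ROLE_PERMISSIONS[record["role"]]


register("alice", "s3cret", "reader")
user = authenticate("alice", "s3cret")
print(user is not None and authorize(user, "read"))
print(user is not None and authorize(user, "delete"))
```

The two checks answer different questions: `authenticate` is the bank asking for your ID, `authorize` is the key card reader checking access rights. In this sketch, as in many systems, they are combined by authenticating first and then looking up the rights of that identity.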