All right, folks. We're jumping right in. So the three core parts of the web are three core technologies: HTTP, HTML, and URIs. What's the relationship between them? "URIs are what we use to fetch things over HTTP, right? They direct us to where on the server we're going to take the information from." Close. I think you're kind of there. Does anyone want to help clarify? "URIs identify what we can ask for. HTTP is the protocol we use to ask for it. HTML is the format of the document that comes back, and the HTML itself has URIs in it that we can use." So we've got the whole loop. And I'd even say it's not just that the URI tells us what to get; it literally tells us what HTTP request to make. Right? So there's a mapping between a URL and an HTTP request. Awesome. Cool.
So now we've got the loop, and we need to know how this turns into a web application. We talked about links. Links aren't the only way we interact with a website or a web application. What's the other way that you're familiar with? You type into something, you hit enter, you click a button, and something happens. If you haven't dug deeper until now, that's what you're used to: you click on links, or sometimes you type things and hit enter or click on buttons. And how that happens is all forms. Again, a form is just a type of HTML tag. It's nothing special; it just has a special meaning to the browser. You can do things like text fields, buttons, checkboxes, range controls, color pickers, all kinds of cool stuff.
The way I think of it is like a link. Every time you click a link, what kind of HTTP message is going to be created? GET. 100% of the time you get a GET request. If I gave you a URL, could you tell me the second parameter on the start line of the HTTP request? The first one's going to be GET.
What's the next one? Yeah, the path, just the path. Is it the whole URL? No, it's relative to the host. So it's going to be the path and the query. We didn't talk about it yet, but the fragment will actually not be sent. So you can think of there being a very simple mapping between URIs and HTTP requests. A form is a way to make a richer request, as we'll see.
So the idea is, just like the href attribute on an anchor tag tells us what the URI is, the action attribute on a form element tells you what URI to use once you submit that form. It's also optional. What would it be if it's left out? The same page, yeah. You have to have some kind of standard, so there are weird defaults for these things. Cool.
The method attribute tells your browser what HTTP method to use in the request. HTTP has other methods like PUT that we talked about, but in an HTML form this is either GET or POST. So this controls the HTTP method that's being used. Again, this is optional, and the default is GET.
So these are the things we need to make an HTTP request. We need to know who to talk to; that's the action attribute. We need to know what type of request; that's the method. And then we need to know what data to send. There are actually a lot of different elements that can be children of the form. We're going to focus just on input. Input is an HTML tag that will be translated either into URL query parameters or into part of the HTTP request body. The difference is based on the method. If we're making a GET request, all of the input parameters will be translated into query parameters. If we're doing a POST, the parameters will be sent as part of the body. And the data will be encoded in one of two encoding types, application/x-www-form-urlencoded or multipart/form-data.
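The GET-versus-POST mapping described above can be sketched roughly in Python. This is only an illustration using the standard library, not how any particular browser implements form submission; the function name `form_to_request` and the example field values are made up here.

```python
from urllib.parse import urlencode, urlsplit


def form_to_request(action, fields, method="GET"):
    """Return (start_line, body) for submitting a form to `action` with `fields`."""
    parts = urlsplit(action)
    path = parts.path or "/"
    encoded = urlencode(fields)  # name=value pairs joined by '&'
    method = method.upper()
    if method == "GET":
        # GET: the inputs become the query part of the request target
        # (any query already present in the action URL is replaced).
        target = f"{path}?{encoded}" if encoded else path
        body = ""
    else:
        # POST: the target is the action's own path (and query, if any);
        # the inputs travel in the request body instead.
        target = f"{path}?{parts.query}" if parts.query else path
        body = encoded
    # The fragment (parts.fragment) is never sent to the server.
    return f"{method} {target} HTTP/1.1", body


print(form_to_request("http://example.com/grades/submit", {"student": "bar"}))
print(form_to_request("http://example.com/grades/submit", {"student": "bar"}, "post"))
```

The same fields end up in the start line for GET and in the body for POST; nothing else about the data changes.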
The difference between those encoding types isn't really important for us, so we can skip this a little bit. Cool. So the data is always name-value pairs. This goes back to when we talked about URI syntax. In the query part of the URI, what is the syntax there? "Name-value?" Yeah. "Key-value pairs?" Yeah, key-value pairs. So we know it starts with a question mark, based on the URI syntax, and then we have the query part that we didn't really talk about, and it's all key-value pairs.
There are a lot of attributes you can use to control the look of these forms. Here the input has a type attribute that says text, which tells the browser that any arbitrary text can go in here. You can have a type of password, which tells the browser that the user is typing in their password, in which case what does the browser usually do? "Mask." Yeah, it masks the characters you type in here.
Actually, did I tell you the story about masking passwords? No? Okay, so why is this such a good idea? In the summer of 2004 I worked at the Sacramento Sports Commission as an intern, and they were running the Olympic trials in Sacramento. That's when all the athletes for the summer games come to Sacramento and compete to figure out who goes to the Olympics. As part of this, we had credentials, like badges. We had to make all the badges for all of the volunteers beforehand, we had to make badges for all the athletes during the event, and we had to make any other kind of badge that needed to be made. There were predefined security zones that would show up on your badge: zone one, two, or three. But we sometimes had to override those because some people needed higher access or whatever. The problem is that the people using these terminals were volunteers. So when somebody needed a different type of access, we would have to go in and type in the password.
Now, the people who designed this software did not mask the password. So if the password is "password" and I type in "password", they would see "password". So what do we do? You're a designer. You have this problem. You can't change some random software you bought from, I think it was a Canadian company or something. What do you do? You make the password five asterisks, or six asterisks, so as you type it in, it shows up as star, star, star, star, star, and the volunteer thinks you're typing in a masked password. But all of these volunteers were from Intel, so they're very smart. You couldn't just go boom, boom, boom, boom, boom on one key. And what if you mistyped it? That would be bad, because they would see plaintext characters in there. So we had to pretend to type while randomly hitting keys. Anyway, it was a horrible design. But that's why masking is a good idea: the shoulder-surfing aspect of people seeing what you type.
So anyway, back to the web. The web has this built in, which is a nice thing, but it depends on what you put in the type attribute of the input element. The name attribute gives the name, the key, of the name-value pair. So this is going to be sent to the server with the name foo. And then the value attribute: what's the value used for? Yeah, the default value. Whatever you want the default to be will show up in the text box. So when you go to a page and you see something already written in a field, it's because the value attribute is set. And when you submit the form, it's as if you had typed that value yourself. So that's a way to set defaults. Cool, so you see a nice, standard text box with the text "bar" in it.
And when you finally click submit, the browser goes through a procedure, and if you read the HTML5 spec you can see exactly what it does and in what order.
It goes through every input element that's a child of the form you're submitting, takes the name and the value, and sends them to the server according to the encoding. So if it's this application/x-www-form-urlencoded, this is another thing you have to watch out for, because it's very weird. It's the exact same thing as URL encoding, but instead of %20, it encodes the space character as a plus, which is why plus is a reserved character that needs to be encoded. It's really weird when you're looking at this and you're like, why is this a special thing? That's why. And it will send it as foo=bar, so that's name=value. That is really how values are sent from the browser to the web application through the query. Multiple name-value pairs are separated by ampersands. So this is why, when we looked at URIs, ampersand is a reserved character that must be encoded. And that's it.
So suppose we had a form with an action of http://example.com/grades/submit, an input of type text where the name is student and the value is bar, two other text inputs named class and grade, and then finally the button, which comes from an input of type submit; that tells the browser this is a submit button. Whatever the text on it is will usually be the name. Actually, I don't think it will be the name; I think it will be the value. So I think this one will show whatever the browser's default is for submit. So it renders with these three boxes. I'm obviously not doing any fancy styling; you can completely control the styling here. So why is "bar" filled in for the first box? Yeah, because of the value attribute of the very first input element. And I can fill this in so that Adam Doupé in CSE 591 gets an A+. Did I tell you guys this tip?
If you're making slides, never put the class number in there, in case you want to reuse them somewhere else. Same with the semester.
So when we submit this, do we know what URI it's going to submit to? What's the method going to be? "GET." "POST." You just said the two options. Unless somebody wants to do something different. "If you don't specify any method, it's going to use GET." Exactly. There's no method attribute on the form, which means it's going to make a GET request, which means that all of our inputs will be added to the query part of the URI. Right? So a good check would be: if I gave you this form and said it's filled out this way, could you write out the HTTP request that's going to be generated? What's one thing you have to make sure you include in that HTTP request? Yeah, you have to add the question mark, and after the question mark, what goes there?
Actually, let's just do this one live rather than in your heads. Let's see if I can hide the actual answer. No, I'm not fast enough. This is super ugly. So we have our form, with an action. Let's do this. There we go. And we can even copy this. Cool. Okay, so instead of that I have: student, that's this person's name. What class is this, 545? And the grade is an A+. This student is a good student.
Okay. So, without peeking at this one: I click on the submit button for this form. What's the HTTP request that gets made? First, who do we contact? It's not part of the HTTP request itself, but what server are we going to contact? Yeah, example.com. What's the server at example.com? It's just an example. The slide has a typo, because if you try to resolve what's written there as a DNS entry it's going to say, I have no idea what you're talking about. So assume that says .com at the end. Right.
Okay, so we're talking to example.com. What's that request going to be? I'll wait until somebody says something. So I'm making a TCP connection to this server on what port? Port 80. After that TCP connection is established, I'm going to send my HTTP request. What's it going to look like? GET, then slash grades, slash submit, question mark, and then student. All right, maybe I need a cooler name. Say Harry, for Harry Potter. So student equals what? Harry. And then what? An ampersand, and I'm going to continue on the same line. Then what? class, and class is equal to 545. And then an ampersand. And then grade is equal to what? "A, percent..." Nice, percent. You almost fell into my trap; the plus has to be encoded. And then an ampersand and what? Ah, yes, the submit button. We didn't give it a name, but yes, there'll be a submit, and I'd guess it's equal to the empty string, so I won't write anything for it here. What's after that? HTTP, slash 1.1. Is there anything else that's important? Yeah, a backslash r, backslash n: CRLF, and now we're on a new line. Is there anything else we need? The Host header. Why do we need the host? Host is a required part of an HTTP/1.1 request. If you don't have a Host header, it's not a valid request. And what's the host? example.com. Awesome.
So here's what actually got sent. Oh, the plus was %2B. That's good. And submit=Submit; I don't know why, but that also does get sent. I think what it actually is depends on the browser. Cool.
Then we have a different form with the same action, but now the method is POST. It has the same inputs, and we can fill it out with the exact same values. So how is the request going to change? "The query string."
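The GET request worked out above can be reproduced with a few lines of Python, as a sketch rather than a description of what any browser does internally. The helper name `build_get_request` is made up, and the submit button's own name-value pair is left out because, as noted, whether and how it is sent varies.

```python
from urllib.parse import urlencode


def build_get_request(host, path, fields):
    """Return the raw HTTP/1.1 GET request for a form submitted with `fields`."""
    # urlencode applies form encoding: pairs are joined by '&',
    # a space becomes '+', and a literal '+' becomes '%2B'.
    query = urlencode(fields)
    return (
        f"GET {path}?{query} HTTP/1.1\r\n"
        f"Host: {host}\r\n"  # Host is mandatory in HTTP/1.1
        "\r\n"               # blank line ends the headers; there is no body
    )


request = build_get_request(
    "example.com",
    "/grades/submit",
    [("student", "Harry"), ("class", "545"), ("grade", "A+")],
)
print(request)
```

Note that the grade comes out as `A%2B`: an unencoded `+` in the query would be read by the server as a space.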
Yeah, so now the values are no longer sent in the query string. So what does the start line of our HTTP request look like? The first part is going to be what? POST. The second part? /grades/submit. And then we have our HTTP/1.1. We'll then have our Host header. And then the body of the request is going to be the key-value pairs: student equals Harry and so on. It's the same thing, key-value pairs separated by ampersands, with the same percent-encoding. And because of that we have to have other headers, like Content-Length. I think it might still work even if you didn't include a Content-Length header, but we're telling the server, hey, I'm sending you 68 bytes in the body, and they're encoded using application/x-www-form-urlencoded, so you know how to decode them into the key-value pairs.
So that's really it in terms of HTTP. When you boil it down, even the insanely complex things you're used to, like Facebook's crazy JavaScript and everything that's going on, at the end of the day it's all links and forms. That's really what we have here.
So what's the difference between a website and a web application? I'm not saying you're wrong, just asking for clarification. So you're saying there's no difference? That it's just a semantic distinction I'm trying to make you invent? Colloquially they are used pretty much interchangeably, and a lot of people just say website. I think of a website as basic HTML documents, the traditional idea of the web. Does anybody remember the original Yahoo? What was Yahoo originally?
Why did it get big? Before it was a search engine, it didn't have search. It didn't have the crawling and indexing thing. It was just a directory: here's cool stuff on the web, and if you want cool car stuff, here's a link, and behind it there's a list of all the other websites that talk about cars. So I think of that as very much a website. It's a static thing. Every time you go to it, it's going to be the same, and anybody who asks for that document gets the same thing.
A web application really tries to take the idea of a desktop application. A desktop application is not the same for everyone; that wouldn't make sense. An application is something you interact with, something you can give data to. So I'm trying to make a hard distinction when I talk about websites versus web applications. Yes? "So you would say websites are more static and web applications are more dynamic?" Yes, 100%. A website is fundamentally static. Maybe it's dynamic in the sense that it has pretty visualizations or something, but fundamentally, if everybody's getting the same content all the time and it never changes, it's static. "So a basic website that has JavaScript that interacts with CSS would be a web application?" I'd say if you have a website with JavaScript and CSS where the JavaScript is just about changing the presentation, I'd still call that a website. But if it has login and logout functionality, then that's no longer a simple website; that's now an application that you're interacting with. You just happen to be interacting with it over HTTP, and instead of a GUI screen where they're drawing pixels, they send back HTML, CSS, and JavaScript. We'll go more into the distinction later, but this was actually an early distinction. It wasn't something that came after the fact.
We talked very briefly about the difference between GET and POST. If you look at the original HTTP/1.0 spec, it says that GET should not have the significance of taking an action other than retrieval. Which means that GET should not change the state of the server. It should always be safe to make a GET request. So it's safe and it's idempotent. What does idempotent mean? Usually this is brought up with operations and transactions. "It's from the Latin." Yeah, but what does it mean in our context? That was good, you defined the Latin. I feel like this is the point where we'd all quietly Google it, but don't look it up. We should know what it means; it's an important computing term. Yeah, so it means the state of the system after making one or n of those requests is the same. I'd say even zero, for a safe request. You see this in transactional systems like databases too. In terms of the web, it means that if you make a GET request once, the state of the server should be exactly the same as if you had made it a hundred times or zero times. That's really all it means.
"Sorry, about the state of the system: when you do queries and stuff, you are changing things on the server." Why? Says who? You're assuming what the server is doing. Why would you assume that? Maybe it is. I mean, I'm being deliberately obtuse, but this is part of the point: you can't make any assumptions, because you have no idea what that server is doing. So yes, that's why having a form where you intend to submit a grade with a GET method would be a bad idea according to the spec. Or similarly, suppose you had a web application with a logout link at yourwebapplication/logout. When you click the link, that's a GET request, so that should be safe to do at any time.
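To make the zero, one, or a hundred requests definition concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the `handle` and `state_after` functions, the list of grades standing in for server state, and the sample entries are all invented for illustration and do not correspond to any real server.

```python
def handle(grades, method, body=None):
    """Toy handler: `grades` is the server-side state, a list of submitted grades."""
    if method == "GET":
        # Retrieval only: hand back a copy and leave the stored list untouched.
        return list(grades)
    if method == "POST":
        # Submitting changes state, so repeating the request is not harmless.
        grades.append(body)
        return "stored"
    raise ValueError(f"unsupported method: {method}")


def state_after(method, times, body=None):
    """Start from the same initial state, send the request `times` times, return the state."""
    grades = ["Harry: A+"]
    for _ in range(times):
        handle(grades, method, body)
    return grades


# Zero, one, or a hundred GETs all leave the server in the same state.
print(state_after("GET", 0) == state_after("GET", 1) == state_after("GET", 100))
# One POST and two POSTs leave the server in different states.
print(state_after("POST", 1, "Ron: B") == state_after("POST", 2, "Ron: B"))
```

A handler that logged the user out or stored a grade in the `GET` branch would fail the first comparison, which is exactly the design mistake being warned about.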
And actually Google got into trouble over this back in the day. They created this web accelerator, web speed-up thing, which would look at what page you were on, guess which links you were likely to click, pre-fetch them, and have them waiting for you. But people would get randomly logged out of websites, because those sites didn't follow this principle.
You had your hand up? "Yeah, there used to be this thing, hit counters, and a site shows everybody a different count. Does that fall into the same category?" In what sense? In the sense of, is it a web application? Yeah, I mean, that's where you start blurring the lines. Technically yes, it's a web application; it's giving you different content. But you probably don't know exactly how that thing is implemented, and it's such an uninteresting part of the functionality that I could just as easily say no, that's not one. It's one of these things where it is a semantic difference, but it helps when thinking about the difference between a web application and a GUI application.
So POST is super interesting. They say POST should be used for annotation of existing resources; posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles; providing a block of data, such as the result of submitting a form, which we just looked at, to a data-handling process; and extending a database through an append operation. So they were already thinking about this. The basic concept of having GETs and POSTs is really built around this idea of web applications, because now you're providing input to the website. Cool.
So fundamentally, when we think about a web application, the idea is that there is some code running on the server side that is parsing your request, figuring out the query parameters or whatever, and dynamically generating an HTML response just for you. And as we said, this differs very much from a website, where we just send static files to anyone.
This is why I recently converted my personal website. Rather than having a blog with a PHP engine that can easily get overloaded, I switched to something Jekyll-based, I think it's called Octopress, where you write everything in Markdown, compile it to static HTML files, and then just upload them. And there's no way you should be able to take down a super simple Apache or nginx that's just serving static HTML content.
So okay, the question though is about when you're using a real application. Does anybody have an application open on their screen that's not a browser? "Eclipse?" Yeah, so think about Eclipse. I appreciate that; I thought it would be an email client or something, but this works. Okay, so for Eclipse, or if you've only used some other IDE, how do you use it? What's your interaction with it? Drop-down menus, to do what?
To do different things: to open different files, to open a project, to compile. There are menus for that, there are drop-downs, there are keyboard shortcuts. There's the undo functionality, so you can go backwards and undo the horrible thing you just did, or maybe realize you did it and that you should be using a source control repository.
So fundamentally, how does Eclipse deal with that, with going backwards or forwards? Or let's say you shut it down and open it back up and it has remembered your workspace, so it puts everything back. It usually has a dot directory where it saves the state. So it's all about state, right? And when you think about what state means, maybe it's a folder where it's storing its configuration files, or it's the memory contents of that application. The application knows that you opened a window or opened a file for editing because it created a new tab, populated that window with all of the text content of the file, and stored the mapping, so when you save it, it knows where to save it. All of that means the memory of that application is changing.
So how would you implement this on the web with what we've learned about HTTP so far? Let's say you want to implement a text-editing app. People use Google Docs, something like that. Yeah, let's say you want to do some kind of Google Docs something or other. How do you do that on the web?
"Sessions? Cookies?" No, we don't have cookies. No sessions, no cookies. What are cookies? Right now cookies are just a delicious dessert. With only what we've talked about so far, let's say we want to upload our file to our awesome, definitely-not-Google-Docs text editor website. We have some form where we can upload a file, we upload that file to the server, and the server sends back some page that says, okay, great, you've uploaded it. How do we then request that file, or the rendering of that document? Let's take a look.
For every single request that the server gets, what does it know about that connection? Who it's from. What specifically? The TCP information. So it knows the source IP and source port, and we know the source port is essentially random for every TCP connection. So it knows the source IP. The user agent may get sent too, but that's optional. It knows the host you're trying to access. It knows the URI you're trying to access. What does it know about any other request you've made in the past? Nothing, really. HTTP at a fundamental level is stateless. You can think of an HTTP server as somebody with no long-term memory. Every request that comes in, the server is like, hey, what do you want? Oh great, you want this document, here it is. Hey, what do you want? And you're like, I was just talking to you! And it says, hey, what do you want?
So because there's no state, and because we just make HTTP requests and get HTTP responses back, to the server every single request stands alone. Which means that keeping state, when you think about it in this sense, means I have to link the request you're making now with all of your previous requests. Think about something like Facebook, something with a username and password. When you type in your username and password, you're essentially linking the request you're making now with the request you
made years ago to create your account and set up your username and password. Now Facebook can say, yeah, these are all from the same person, or these are all part of the same session, the same state. So now I know exactly who you are, and as the application I can find your state correctly.
So we could try things. Why can't we just use the IP address? It seems reasonable, right? "Different people share one." Yeah, we're all NATed. Think about how horrible it would be if all of us were sharing the same state in the same session, so every time any of us went to a website it thought we were one person. That would be terrible. Or all of us would have to have unique IPs, which is not feasible in IPv4. What about the user agent? Why not use the user agent? I kind of made a claim earlier that it's fairly unique. "People use different devices." Yeah, and that's also an issue with IP addresses: we want to be able to stay logged in to a website whether we're accessing it from home or from work, from whatever device. So that's the issue with IP addresses, and it's the same with the user agent.
Now think about the server's perspective. This goes back to when we talked about network attacks. From the server's perspective, can it trust the source IP? How would you spoof the source IP on a TCP connection? After the three-way handshake, the server knows the TCP handshake has been completed. That means the IP address it's talking to can not only send SYN packets, which we know can have a completely spoofed source IP address; the SYN-ACK packet goes to that IP address with a random sequence number that has to be acknowledged in the resulting ACK. And of course there are local games we know we can play to try to intercept traffic. But if this IP address is talking to us, we can trust it in some sense: we know we got a request from that IP address because
we did the three-way handshake, and assuming that wasn't broken, we know that was good. What about the user agent? Does TCP have a user agent? No, it's part of the HTTP request. So from the server's perspective, can you trust the IP address? Yes, for the most part. Can the server trust the user agent that's sent as part of an HTTP request? Does anyone want to argue for one position? "If you're on a mobile browser and you ask for the desktop site, it changes the user agent it sends to the server, so I was thinking it's possible to fake it." Exactly. Nothing in the request can be trusted. Anything can be spoofed. You can download an extension right now to change your user agent. You can make requests with curl or an online tool and specify whatever user agent you want. There's no possible way for the website to verify that your user agent is actually correct.
So this is why we don't want to hang sessions on either IP addresses or user agents. A combination of the two might seem better, but then I could just start changing my user agent to other people's and start logging in as them. So we need some other way to maintain state, and we're going to build up this idea of a session. You want to say that all of these requests are from the same session, and essentially that you first made a register-user request, then you made a login request, and then you added a comment. If those three requests happened in a different order, you'd expect a different state of the application, because you can't make a comment unless you're logged in, and you can't log in unless you've registered. This means we can do things like authentication, which is something we would like in an application. I guess there are some applications, like Wikipedia, that don't strictly need authentication. Has anybody edited Wikipedia before?
Yeah, you can do it 100% anonymously. Does Wikipedia have authentication? Yeah. Why? They're an open encyclopedia. Why would they need usernames? Yeah, because people are terrible, and they need some accountability. Yes, an anonymous person can edit a Wikipedia article, but not every person has equal access to edit every single Wikipedia article. It's actually fascinating: there's this hierarchy of editorship, and people work their way up as Wikipedia editors. To do that you need something like authentication: who is this person, and how do I link them to the account they created in the past? And this then gets us a rich application that mimics a desktop GUI application.
There are three ways we could do this. One is to embed our session information in every URL: the server generates some random identifier, a session identifier, for you, and when it generates the HTML for the page, it adds the session identifier to every link and every form on that page, so whatever you click carries it along. Two, I can use hidden fields in forms; I can put the same thing into the forms so I know who it is. And the third is that we can develop a completely new technology to do this, and that's where cookies come from. We're not going to talk much about the other two. They're fairly straightforward, and you do sometimes see them.
What would be one problem with embedding it in URLs? "If the communication isn't encrypted, people can sniff it." Yes. What else? We talked about URLs for a while. What do people do with URLs? "Bookmark them." Bookmark them. What else? "Send them to people." Share them, tweet them. So think about what happens if I have a link that is tied to my session and I send it to one of you. What happens when you click that link? Yeah,
you're going to become me. That would be very bad. Think about a grading system that worked like that. You would really want to get your hands on those links, right? But this actually does happen. When does this happen? Has anybody ever seen it? It seems like a stupid idea. "If you find something on a shopping website and you want to share it with somebody, you can send a link that has your shopping cart and everything." Yes, I'd say that's more a functionality of the website, being able to share a specific portion of your state. You're creating a custom link that shows that, but if they click on, like, the account page, it's not your account that shows up there, right? "API keys in open-source code." That's a separate issue, though I think it's kind of related.
More specifically, has anybody gotten an email with a link, and when you click that link it automatically logs you in to that website even though you previously were not logged in? This happens a lot, actually. That link includes some session information to automatically log you in, which means if you share that link or forward that email to somebody, they can log in as you. So it has the same problems as embedding session information in URLs, but the web application owner has hopefully made the informed decision that the marketing potential of you being able to click and go straight into the website is worth this security trade-off. I would say oftentimes they probably don't think about that. Anyway.
Okay, so what are cookies? The cookie was originally invented, I want to say, by Netscape, because they were trying to create an e-commerce application and they ran into this exact problem: how do you have a shopping cart if
every single request to the server is like a brand new request, especially something like a shopping cart that has to grow? It gets kind of crazy. So the idea with cookies is this: cookies are a way for the server to request that the user agent store some data and then, on every subsequent request, send that data back to the server, the idea being that the server can then link all those requests together. Yes? Is it a specific request, not every request? Every request to that server, not to every server, just the domain that set the cookie. That may be a complicated issue, but yes, not everything, because it would defeat the purpose if you just sent cookies to everyone. So you go to google.com, they tell you to set a cookie, and your browser, if it wants to, will store that cookie, and then every time you access google.com it will send that cookie to them. When you access Facebook, it doesn't send anything. So it's actually fairly simple, and this simple idea 100% allows us to do everything we want to do to make actual web applications with sessions. Great. And then the important component is that either side can essentially terminate the session at any point. This is what happens when you clear your cookies and you're magically logged out of every single website. That's because nobody knows who you are, right? If you make a new request without sending a cookie, the website has no idea who you are. Cool. Yes? Okay, good. Cookies were first standardized in 1997. If you're interested in the super weird intricacies of cookies and this whole thing of web domains, there's an RFC (RFC 6265) that was published in April 2011. It's actually funny: first there's a standardization attempt, then there's an attempt to standardize cookies 2.0, and then they had to write an RFC describing how cookies are actually used in the modern web, and it's a really good reference. It never became a clean standard because
things just started happening and browsers started implementing them in weird ways. It's really complicated, but the best reference is this one, as far as I know. So again, this is where we have a nice thing where we didn't reinvent the wheel and have a completely different format for cookies, which would probably drive everyone insane. Cookies are just name-value pairs separated by equals signs. What does that sound like? Yeah, query parameters, right? Awesome. So the way this happens: the server sends a Set-Cookie header in the HTTP response. The client first makes the HTTP request, and the server says, hey, could you store this data, this name-value pair, and then send it back to me? So in a Set-Cookie you could say user=foo, and then every time the user agent talks to that domain again, it sends a Cookie header with that value, so it sends Cookie: user=foo. Seems pretty simple, right? And this way I know on the server side that this is user foo every time. Yes? How are individual sites' cookies organized and restricted? By the domain. Would you have an individual file? That depends on the browser. Every browser does it differently. I think most modern browsers have something like a SQLite database that they use to store all the cookies, but as we'll eventually see, there are a lot of other options that can be set that control who can use the cookie and when: whether subdomains can use it, whether JavaScript can read it, whether it's sent over HTTP or just HTTPS. Yeah, there are a lot of security mechanisms because, well, we'll talk about it in a second. You can send multiple Set-Cookie headers, right? So you can do things like setting a session variable or setting a language. The server can really request whatever it wants, and the client can decide to either do it or not do it. Then there are attributes on the cookies, and this is where it gets a little bit complicated. Let's briefly summarize them. So if you think about a domain, like, what's a good domain? Yeah, but you
can't really create your own content there. I was actually going to say GeoCities, but I think that's such an old reference that nobody would actually understand it. Tumblr, something like that. Something like Tumblr, right? That's a thing where you can post your own blog or your own content. So when you log in the server will set a cookie, but everybody's blog is maybe different, and you can maybe have different cookies for different blogs. So the Path attribute says: only send this cookie to this path of the web application, don't send it everywhere else. This way you can restrict where that cookie gets sent. Domain: with subdomains, you can specify that you want a google.com cookie that's valid at all of *.google.com, so it works on google.com, maps.google.com, everything. Expires or Max-Age is a way for the server to tell the user agent, hey, only keep this cookie alive for a day or two days. So what happens if the client sends that cookie back after the expiry period? What should happen? The server should reject it and tell the client to get a new cookie, or tell it that it's not authorized. But fundamentally we're trying to think about what things to trust. The server cannot trust that the user agent respects this Expires attribute. It's more like a friendly mechanism. Fundamentally user agents can do whatever they want, so user agents are very untrusted. HttpOnly means that JavaScript can't touch the cookie, which will be important for something we'll talk about later, and Secure means the cookie should only be sent over an HTTPS connection, not a plain HTTP connection. So why might we want something like that? Information leaks out when you go off a secure connection. So what happens if I steal your cookie value? What was that? I can take the same exact cookie and send it back to the server. Yeah, so once I have your cookie, remember, from the server's perspective that's
the only thing it's using to link all of your requests. So I can use that cookie and then log in to the website. There's a tool, if you want to look up some cool stuff, called Firesheep. This is a thing that everyone knew was a problem: you can steal people's cookies. Firesheep was a tool that somebody wrote that would sniff the traffic on the local network, on open Wi-Fi, using all the stuff we talked about, and then it would find all the Facebook cookies and allow you, with one click, to log in as that person on Facebook. And this actually got Facebook to upgrade and change all of their cookies to Secure within, I don't know, less than a week, I think. On college campuses people were using it a lot, so you have to be very careful. All right, so as an example I'm going to curl Google one time so you can see, and the really interesting thing in here is that Google is using their own kind of custom cookie format. They have PREF= and then there's an ID= and then a colon and an FF=0. A cookie is pretty opaque in the sense that it could be gibberish, it could even be something encrypted that only the server can decrypt, who knows. But the important things here are what we talked about: it has an Expires attribute, it has a Path and a Domain. And you can see there are two cookies; there's an HttpOnly cookie that gets set here. This is far past its expiration, so don't bother writing it down. Okay, so the server can ask the client to delete a cookie, but it should not rely on that. So, proxies. We talked about proxies on the web. Should proxies cache cookies? No, that would be a terrible idea, right? Because that way anyone else who went through that proxy would see your content. So fundamentally cookies should never be cached or saved by anyone along the way, and that is what using HTTPS helps with: nobody else can see our content because our communication with the server is encrypted. Okay, so we talked about this: the user
agent is completely free to delete cookies, to alter cookies, to change cookies at any time. This used to be the popular way of fixing any problem on the web. If you were having problems with a website, kind of like the turn-it-off-and-on-again thing, but for the web it's: have you cleared your cookies? And you're like, great, now I'm logged out of everything, this is terrible. Okay, so now that we can do this, how should we create a session with the server? How should we do this in a secure way? Should I use the way that I just talked about, where I set a cookie that says user=admin or user=, I don't know your handles, but whatever your name is on the submission site? Anyone who looks into their cookies is going to see that. Is that a good idea? What's good about it? It's pretty easy for me to look up. user=foo: yeah, I know who foo is, I'll let you access all of foo's stuff. What's the downside? Spoofing. Yeah, you can change that value. I'll probably keep repeating this a lot more in the future, but browsers, user agents, cannot be trusted from the server's perspective. There's nothing that stops me from changing my cookie from foo to admin, or from foo to whatever your username is. Spoofing in some sense is just literally altering the data, because it's there. So what are some other ways we could implement secure cookies and secure sessions? Generate a hash of the data that the user agent can send. And then do what? I mean, something that is specific to the user agent in that session is part of the hash. Okay. And then you can sort of try to guarantee it's that specific user. So, like, send the username and then the hash of the username and the IP address? Yeah, you could possibly do that. The problem though is: what happens when you use a website and you switch from the ASU network to your home network? Do you get immediately logged out of every single website? No, right? Our devices are so mobile now that tying a session
to just an IP address is usually not a good idea, so they usually don't even do that. So including that in there doesn't work, and if you just included the hash of the username, well, anyone can compute the hash of that username, so you need some other data. What was that? Why not pick the MAC address? Ooh, okay. So does the server know the client's MAC address? This is good, it's like an accidental study session. Isn't the MAC address even lower level? If you can't count on having the same IP because you're going to have different devices, you definitely can't count on having the same MAC. So think about it from the server's perspective, right? The client sends the actual request, which is going to be inside of a TCP segment, which is encapsulated inside of an IP packet, and that has the other layers around it. So let's think about it a couple of ways. When the server gets that packet, is there a link-layer frame with the Ethernet information? Is it there? There is one. What's the source MAC address of that frame? Let's think about this first: what's the destination MAC address? The server's. What's the source MAC address? The last hop, the gateway. Yes, because at every single hop the link layer only specifies the next hop, so it's hop, hop, hop, hop, and then when the server gets it, it has no idea what the client's MAC address is. The only time it would know is if the client is on the local network. So it's possible to do that, but it's tricky and it probably would only work in very, very limited scenarios. So what were we talking about? Sessions, cookies. There are a lot of different ways to do it. I believe if you look at my website, the cookies are encrypted. They may not be encrypted, but they're probably using an HMAC. The idea is that the server has some secret key, and it generates a secure hash using that key and the value that it sets, and the idea is that if you tamper with any of the data in the cookie, the server verifies it every single time and so there's no way you could, like,
you can't, if you don't know the key, create a valid one. But if you can get the key, then you can break it, and that's the bad thing. The problem there is that you have to do all these crypto operations, which nowadays is not really a huge problem. But the standard way of doing sessions is to just generate a random, unique identifier of some kind and use that as the session handle. PHP especially does this by default, so when you look at PHP web applications you'll see a PHPSESSID cookie. So it's usually some unique, random, unguessable session ID, and the server says, hey, store this. That way, if it's in the space of, I don't know, 64-bit integers, you'd have to try on the order of 2 to the 64 values to brute-force somebody else's identifier. Or most of them use a random number and then do some SHA-256 hash, so you get a nice long random number. And then on the server, PHP stores all the session information in a folder, in different files by session ID. If you look at PHP, you can use a database instead; there are all kinds of options. Okay, cool. So now we get to the crux of what we're trying to go for here. We're building up knowledge layer by layer from this really bottom component. The way this really evolved is that at first, if you wanted to write a web application, you would do what we did in homework one. What did we do in homework one? Yeah, you would write a custom web server in C that knew how to listen on a port, knew how to parse an HTTP request, and knew how to serve whatever content it was responsible for, and so you'd implement your custom logic in there. But very quickly you'd have to support all possible clients that talk differently, and maybe have to keep up to spec. And really, when you think about it, every single person that's writing a web application has to reimplement this same functionality of parsing HTTP and doing all that. So if we think about principles of software engineering,
abstraction, it's like, why should we do that? There are people who are good at writing web servers, at parsing HTTP, and we should just let them do that. So we really want to separate the concerns there and create basically a web server that parses the HTTP request, and we have some way to tell the server, hey, send these requests to me because I know how to handle them and I want to do something dynamic and execute some other code. Cool. So you can develop a web application without ever thinking about or worrying about HTTP, although if you do that you will not be a good web developer. You still need to think about and understand HTTP, especially as a security person trying to break web applications. Isn't a REST API the best way of doing it? That's a tough question. REST is a way to structure an API around the web, definitely around HTTP, so it's an HTTP-based API that people can access. It has more of a reputation than it actually deserves. But doesn't it use the principles of HTTP's methods?
Yeah, which it does, but nobody actually uses those except for RESTful APIs, so then you have to be using an HTTP client that supports that, or you have to implement some system. So the idea is you use things like PUT and POST, and with REST you give them semantic meaning: you'd say a PUT means update this thing and a DELETE means delete this resource, and every resource in the system has a unique name. But not every client supports that; a browser or JavaScript can do it, but it needs to do custom work for it. Anyways, the idea of REST is very good. I find that people get very, like, oh my gosh, is this RESTful? And that, I think, is not a productive discussion, but the ideas are good. Okay, so the idea of a web application: we have our client with the browser making an HTTP request to the web server. A web server is something like Apache or nginx or lighttpd. Node.js? No... yes, Node.js. Typically you have a web server in front of Node.js; it depends on how you want to do it. Usually you wouldn't want to have your Node.js server listening on port 80, because that means it has to run as root, which is going to cause problems. Okay, then the web server figures out, by parsing the request: what's the URI you're looking for? What are the query parameters and everything? It then sends that to some web application, which executes some custom code, figures out what you're doing and what it wants to do, and then generates a custom HTML response that it sends back to the web server, which sends it back to the client in an HTTP response. Okay, and this really, I don't know, it's hard to convey how cool this is. So what are some of the benefits of this versus, like, a desktop GUI application? You don't have to build one for each operating system. Right, you don't need to develop a specific GUI for each operating system.
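The hand-off just described, where the web server parses the request and passes the path and query parameters to application code that returns HTML, can be sketched in Python. This is only a sketch under assumptions: it uses the WSGI convention from Python's standard library as the "arrow" between server and application, which is not something named in this lecture, and the names `app`, `call_app`, and the `name` query parameter are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical web application; the server has already parsed the request."""
    path = environ.get("PATH_INFO", "/")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape anything that came from the client before putting it into HTML.
    body = (
        f"<html><body><h1>Hello, {escape(name)}</h1>"
        f"<p>Path: {escape(path)}</p></body></html>"
    )
    data = body.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(data))),
        ],
    )
    return [data]


def call_app(path, query_string):
    """Drive the app the way a server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body.decode("utf-8")


if __name__ == "__main__":
    print(call_app("/greet", "name=foo"))
```

To serve this over real HTTP, the standard library's `wsgiref.simple_server.make_server("", 8000, app)` could play the role of the web server; in practice something like Apache or nginx would sit in front.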
So as long as there's a web browser that knows how to render HTML, anybody can use your web application. What else? Updates. What was that? Updates. Updates are, I don't know about easy, but easier. It's easier to make sure that everyone's using the most up-to-date version, because all you do is update the web application and now every single request is using the latest and greatest version, rather than having to push updates out to clients, which is a huge pain. Think about it yourself: how many times do you ignore updating your OS or something? What else? When you want to write a GUI application, what languages are typically used? Okay, Java, it's true. Also usually C or C++, right? Why? Why are those the languages that are used? What are the operating systems written in? C and C++, right? So I think part of it is the tie between the language and the fact that the OS is written in it, so it's a lot easier to write C and C++ applications that use the OS interfaces. What languages are you restricted to when writing your web application? Yeah, honestly, anything, right? As long as you can write a program that understands that arrow from the web server to the web application, you can write your web application in Lisp or Ruby or Python or Perl or, I mean, literally whatever, Lua, any crazy thing you could possibly imagine. And assembly, I guess you could do that. And when you talk about a desktop application, one of the other reasons we usually use C or C++ is that it ends up being a compiled binary that's a lot faster, because when the user clicks on something they want something to happen, right?
When you think about it, what's the delay between making a web request and getting a response? What has to happen for this whole cycle to complete? You have DNS resolution, and then what? Speed? What was that? Speed. Your bandwidth, your speed? Yeah, but what are the things that have to happen? You have to go fetch the DNS name from somebody else, and then what happens? How does the request get to the web server? As we know, we do the three-way handshake. We need to send a SYN from the client to the server. We need a SYN-ACK to come back. We need another ACK to go to the web server. We need to push the content of our request to the server. Then the server needs to parse it and do some processing. So when you think about all that overhead, the difference between running Python and running a compiled C binary is, I don't know, whatever you want to call it, 10 milliseconds, 20 milliseconds, whatever that delay is, and the network latency is much higher than that. So this is why on the web you can have crazy languages that are insanely slow, and it doesn't really matter because the network is much, much slower. Cool, okay, so we're going to look at some different types of web application programming languages. Let's see, all right, where are we at? Time-wise, oh no. All right, we were going to look at ASP. No, we won't. Okay, we'll stop here.
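As a recap of the session discussion from earlier in this lecture, here is a minimal Python sketch of the two approaches that came up: a cookie value protected with an HMAC so tampering is detected, and a random, unguessable session ID that only indexes state kept on the server. Everything named here is an assumption for illustration: `SECRET_KEY`, the in-memory `SESSIONS` dict, the cookie name `session`, and the helper function names are hypothetical, and a real deployment would persist the key and the session store.

```python
import hashlib
import hmac
import secrets
from http.cookies import SimpleCookie

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret
SESSIONS = {}  # hypothetical in-memory store: session ID -> session data


def sign(value):
    """Return 'value.mac' so the server can later detect tampering."""
    mac = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(signed):
    """Return the original value if the MAC checks out, otherwise None."""
    value, sep, mac = signed.rpartition(".")
    if not sep:
        return None  # no MAC attached at all
    expected = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    ok = hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8"))
    return value if ok else None


def new_session(username):
    """Create server-side state keyed by a random 256-bit session ID."""
    session_id = secrets.token_hex(32)
    SESSIONS[session_id] = {"user": username}
    return session_id


def set_cookie_header(session_id):
    """Build a Set-Cookie header using the attributes covered in the lecture."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True  # JavaScript cannot read it
    cookie["session"]["secure"] = True    # only sent over HTTPS
    cookie["session"]["max-age"] = 86400  # one day; the client may ignore this
    return cookie.output()


def lookup(cookie_header):
    """Map an incoming Cookie header value back to server-side session data."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    return SESSIONS.get(morsel.value) if morsel else None
```

With the signed approach, changing `user=foo` to `user=admin` in the cookie makes `verify` return `None` unless the attacker also knows the key. With the session ID approach, the cookie carries no user data at all, so the only attack left is guessing or stealing the ID.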