Okay, test. There we go. How about now? Okay, now my levels are good. Thanks, everyone on Twitch. I didn't say anything except that you have an assignment and we're going to go over what that's like, so there's nothing crazy that you missed.

To do that, we're going to be using the pwn.college dojo system, the pwn.college platform, to run all this. If you've never done this before, it's very easy. The first thing you want to do is register a username. This is very important: your username shows up on the site, linked to what you solve. So if you want to be completely anonymous, choose a random username that nobody will know is you. There's going to be no way for anyone else to link that to you. If you want it to be your Discord name or whatever, totally fine. Feel free to use whatever name you want. You can use whatever email address you want; the email address doesn't matter. And of course, remember your password. I think there's reset functionality now. Is that correct? Yeah, that took a while to do.

Okay, so I've already created an account, and when you log in you'll see the list of dojos here.
These are different materials. The courses here are all the courses that are currently running that use the pwn.college platform. You are in CSE 365 Fall 2023. If you want more stuff, a lot of these modules come from these different topics that you're free to explore and do. These are just a different way of organizing and thinking about these things. So we are here. You won't see the stuff at the top, the admin button, because that should hopefully just be me.

The first one is Talking Web. What we're going to learn about today is how the HTTP protocol is structured, how it works, and how we can make web requests. The specifics here don't matter; like I mentioned, we're going to cover that starting in 10 minutes. For now we're just going to go over how to use this. There's important information here, links to different documentation. As we get lectures, they will be posted on this page as well as on the syllabus, so I'll make sure they're in both places. However, if you want to get started, Connor, when he ran 365, has recordings of his lectures and slides. Feel free to use this material if you like; that's totally up to you. But again, everything you need to do these will be covered in class. Make sense? Cool.

Okay, so there's a bunch of challenges. I know it looks intimidating because there are 39 challenges. They are quite easy. Let's go over curl. We'll start with curl.
So the first thing you're going to want to do is start the challenge. There are two buttons, Start and Practice. Start creates a special instance just for you to be able to work on this challenge. When it's done, it will say the challenge successfully started. The difference between the two is that Practice doesn't have a real flag in it. Practice just gives you a fake flag, but gives you root on the container, which may be useful later. I think for these challenges it's not going to be useful at all, so ignore it for now.

Now that we've started, we can access this in three different ways. This is definitely not the order that I'd approach them in, but one way is with the Desktop button. The Desktop button gives you VNC, remote access as if you were on a Linux desktop. What is this, KDE? Is that right? Yeah, a KDE desktop. You can use anything that's on here, but there's no internet access, so you don't get any of that. You're making me lie to these kids. Yeah, whatever it is. There's a terminal here that you can get access to. This is running on the dojo system, so if you don't have, or don't want to set up, a local environment to access these things, you can do it all through the browser. As we go forward, you can explore different things. There's Ghidra in here, Wireshark. Did I disappear?
I guess I did disappear. Okay.

For any challenge you want to run, you'll want to check out the /challenge directory. Again, this is why we had module zero last week on using the command line. If you're not familiar with using things like ls to list a directory, or cd to change directory, or executing things on the command line, then you definitely should go back and do module zero, Bandit on OverTheWire, because this is going to be important here. For I think all of these challenges you just have to execute one thing, /challenge/run, and what this does is give you the output. The whole idea is that you're making web requests, so at the start it's going to tell you what you should do: make an HTTP request to 127.0.0.1, the IP address that is localhost, on port 80 to get the flag. You must make this request using the curl command. This is important: it will tell you what tool to use, and it actually checks that you're using the right tool. You're basically going to be solving the same levels in three different ways: one using the curl command-line tool; one using raw netcat, which is actually typing in a raw HTTP request (I'll demo that later) to show that you can do these things just by making requests yourself; and then finally with Python. We use Python a lot in this class, so you'll write Python code to solve these levels as well.

This one's pretty simple. Well, it doesn't tell us exactly what to do, but if we didn't know what this curl thing is, what should we do? Don't press the help button yet. Yeah, man curl. Read the curl man page, go look online. There's tons of stuff on curl. Curl is kind of like a Swiss Army knife of web security; it does a ton of stuff. We want curl to make an HTTP request, and of course I already know how to do this, so: port 80, and it just needs to make a request, right? So do this, boom.
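The lecture mentions that the same levels are later solved with Python. As a rough sketch of what "make an HTTP GET request to 127.0.0.1 on a port" looks like in Python, the example below uses only the standard library. The dojo challenge server is not available outside the platform, so the sketch starts its own tiny stand-in server on a free local port; `DemoHandler`, `fetch`, and `demo` are hypothetical names for illustration, not part of the course material, and this would not satisfy a level that checks for curl.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server for the sketch; NOT the dojo challenge server."""

    def do_GET(self):
        body = b"hello from the stand-in server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(host, port, path="/"):
    """Send one GET request and return (status code, response body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    """Start the stand-in server on a free port, fetch "/", then stop it."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch("127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status, body.decode())
```

On the platform itself, the equivalent call would presumably be `fetch("127.0.0.1", 80)` against the running challenge, assuming the level asks for Python rather than curl or nc.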
It gave me the flag. Let's look at a little more information with verbose. I can see the request that I made, the HTTP request, which again we're going to learn about in five minutes, and it responds back with the flag. We see something like pwn.college{...}. If you're on there right now, don't just randomly type this in; this is a flag just for me, and the system knows it's for me.

So go back to the dojo, Talking Web; this was level one. Oh, I hate this. Okay, does it work like that? Nope. Okay, you have to use the left box. Yeah, there we go. Submit the flag and then, boom, it would say it was correct, but I have already solved this; that's what that green flag is there. Then you get credit and you can move on. So that's way one to access the pwn.college platform.

You can also use VS Code through the browser. This is a whole VS Code instance that's running on the remote system that you can access through the browser, and from here you can edit files. I can't remember what files I have here, so I probably shouldn't show you, because it's solutions to random stuff. Oh, that's good, because there's nothing in there. Then, I think if you go to Terminal, New Terminal, we get a terminal within VS Code, and you can do /challenge/run and see everything there, see that I have to make the curl request, and see that I can get the flag from here. So that's two ways to access the system.

The third way is my favorite way, because I hate using this other stuff, but it's very useful for a lot of you, so I understand why it's there. If you go into your settings in the upper right and go to SSH Key, this is where you can paste in your public SSH key to get SSH access to the system. If you don't know how to do this, there are plenty of resources online. I think we have them somewhere. Yeah, it's on the front page, how to access it.
Then you can go, from your own handy-dandy terminal, and do things like ssh hacker@dojo.pwn.college, and then, definitely, a password that you couldn't see me type in. Okay. Now once I'm logged in, I'll be logged into that instance. Again, I can run that same thing, /challenge/run, I can see all the same output, and I can curl localhost and get the flag this way. So there are three different ways to do it, depending on what you're most comfortable with.

What I like about SSH is that as you launch a new challenge, it still keeps you in there. So I can click Start, and then it's removing it, initializing it, connected, and now I'm in web level 2. Now it's a different request. It says I have to make an HTTP request there, and I must make that request with the nc command. So if I did curl localhost, it says that's incorrect, you needed to actually use nc. Actually, I'm going to save this one. We'll go back and do this later, when we've learned more about how to actually make this request.

Okay, and then people are asking, very correctly. So on the course page there's this course link, which they should be able to see, right?
So we've moved the syllabus here; the old syllabus redirects here, so you don't have to worry about which one's which. You can see the syllabus here. This is where we'll still post, like we said, the recorded lectures and the lecture slides, including today's, as well as the modules as they're released.

There are two steps you need to do in order to link things together. First, go here to Identity (I kept wanting to say identify). Under Identity on the course page for this course, put in your ASU student ID. We've uploaded all of your student IDs from the roster, so you should be able to put that in, and that's how we know that you are you, and that's how we're able to link that random username with that random email address. You can check your grade at any point. Why do I have an E right now? Because I haven't done all the stuff yet, right? We have an assignment that's out. As I solve things, this progress will go up. Also, I guess I'm the professor, I don't have to solve these things, so that's more of a you problem than a me problem.

Which one? Identity on the course page, so it'd be right here. Yeah, this is the course page. You go to Dojos, then the course, then Course. There'll be a link here to Course, and this has all the information. If you go to the current syllabus, it will take you there and you'll have the Identity button there. So that's one aspect, for us to know that you're an ASU student and this is your grade. Great.

The other important thing is in Settings. We now have our own beautiful, what are these called, sections, categories on Discord. This is where all the announcements and everything will be announced, but only for people with this role. It's this one, ASU CSE 365 Fall 2023. To get that, you link yourself up: go to your settings and Discord on the left here.
I'm already linked. So link your account with your Discord username, and then, boom, you will have access to all of that stuff.

Sure, yeah. In Settings here, under SSH Key, I have my public key in here already. Let's see, I guess cat. So this is one of my SSH keys. I just copy and paste the public key in there, and then it knows that it's me, so I can SSH in and it knows who I am. If it's empty, you need to make an SSH key and put it in there.

What if someone uses your ASU ID? Don't do that. Why would you know somebody else's ASU ID? This is your 10-digit student number; it shouldn't be something that other people know. At the end of the day, I know who you all are. I have rosters and stuff, so if I needed to go digging, I guess I can.

Well, look at that, not too late. I think it's because of the audio problem; otherwise, we're on time. Oh, and did I talk about when this is due? It's due in a week. It's 40-ish levels, but they should go very quickly; they're not supposed to be very difficult. But get started early, so you're not trying to cram it all in at the end. I guess I'll say this all the time, but there's nothing I can say to actually make you do that. We'll see how it goes the first time. Cool, all right, let's do some learning.

See, that's crazy. How come OBS knows to get that one and not this one? Who knows.

All right. Oh, good question on Twitch. If you've already done those challenges, because, let's say, you took 365 last semester or over the summer and you withdrew from the class, so you're taking it again, you don't have to solve the same thing over again. Those solves just continue to count. It's not like you have to do the same work over and over again. If you're confused about that,
I guess, let us know.

Oh, it runs periodically, so yeah, it's a polling system. Thanks for that.

Ah, great question, thank you for reminding me. Recitations. Okay, over to the beautiful syllabus. Recitations will be like labs: just come work on challenges. We'll have anywhere from two to four TAs there to help you. So show up to recitations. If we need more locations, we'll add them, but for now it's Brickyard 210. You do not have to go to your specific recitation; you can go to any recitation. Don't go to whatever room number is listed for yours; go to this room, Brickyard 210. That will start today, so we'll have three people, I think, today at 4:30 to help you out with the challenges. No, it's optional. Optional, and you may attend any recitation. Yeah, I added that today because I was getting questions, so thank you for reminding me to announce that. The idea is to help you, not force you to come to this thing. If you want to use this as the time you work on the challenges, that's fine too. Show up, open your laptop, work on stuff, and if you're stuck, ask questions. Cool.

All right, the web. We're going to get into the specifics of what we mean here, the birth of the web, and what we actually mean by the web. We're going to go in a little bit of a strange order. We're going to start from the highest level of networking, so we're going to learn about the web and a protocol that runs on networks. As we go down, we'll eventually learn how the low-level packets make their way to things, and do TCP, UDP, all that fun stuff. We're starting at a high level first to give you this background so that we can use it in future modules.

So the web actually started, does anyone know what operating system this is? DOS? No, but good try. Mac OS? Almost correct, very close.
Yeah, how is it close? It wasn't Apple. NeXT, somebody said it. Yeah, it's NeXT. You've actually seen the logo in the upper right there. I don't know if you know the story, but Steve Jobs was ousted from Apple by, was it the Pepsi guy? I think he came in and kicked him out of Apple. So he then created a competing company called NeXT, which Apple eventually bought, and then he took Apple back over from within NeXT.

Does anybody do any iOS development? Some people, yeah. So when you're doing iOS development, you know there are class names that are prefixed with NS, capital NS? Or is that Objective-C or something? All right, I'm losing my analogy, but somebody look it up. I think it's the dictionaries, NS-something. There are all these frameworks that start with NS, and that's because they come all the way from NeXTSTEP, which was the name of NeXT's operating system.

Okay, so this is actually a screenshot of the very first web browser. It was created on this NeXT system in, I think, '91, which we'll look at in a minute. And this was the very first web page. It actually still exists; you can go check it out now. It's a historical artifact.

The web was essentially created by one guy, Tim Berners-Lee, or now Sir Tim Berners-Lee. If you invent something as important as the web, I guess you get knighted if you're British, which is pretty cool. He was working at CERN. What does CERN do? Yeah, that's a great idea, work for them. Cool.
So yeah, they shoot particles at each other at very high speed so they break apart, so they can see what's in them and do all this, I don't know, crazy physics stuff.

CERN is a big, I think government-funded, organization, but it has a lot of visiting scientists. Scientists, physicists, and other kinds of people would come into CERN and out of CERN. Tim Berners-Lee was there and he had this idea, and, as we'll see, there were these concepts floating around of how to identify things and how to have hypertext links on a page. But he said, hey, it's really annoying to figure out where everyone is, what even their offices are, who is here. Does anybody remember a phone book? I haven't seen a physical phone book in a while. Yeah, the Yellow Pages is what they used to call it: the list of all the businesses in an area with phone numbers so that you could reach out to them, because there was no other easy way of finding that information. So he thought, hey, it'd be really great if we had some system so that people could actually see where each other was.

So he made this first proposal to CERN to create the web, or sorry, to create kind of an internal system, but he did it in such a way that it could be extended in the end. That first website that we saw was from around the end of 1990. There's a fantastic book from him if you're interested in this, called Weaving the Web. I guess I don't have those stats here, but basically 1990 was the literal invention of the web at CERN, and by '94, '95 is when you started having crazy dot-com things and different browsers and stuff, and everything really exploded from there. Then you had the original dot-com bubble in 2000. Anyway, all kinds of crazy stuff.

So the design was basically
thinking about how to share research results and information at CERN. Like I said, it combined multiple emerging technologies. One is hypertext. Besides being a super cool term, what does hypertext mean? Yeah, it's text that's linked. But what does it mean to be linked? More generally, it's a text document that has a way to tell you, hey, if you want more information, go see this other document. That's what you're used to: clicking on links. You go to one page, you see something, you click a link, and all of a sudden you're on an insane Wikipedia page that you never thought you'd see. But this wasn't an idea that he necessarily invented; the idea of hypertext had been around for a while.

Also the internet, and we'll actually get into the history of the internet. The internet has a much earlier date. So it really bugs me when people use web interchangeably with internet. The web is just one protocol that runs on the internet. Email is another protocol, and it predates the web. But honestly, so much of what we do now on the internet is the web that I guess I can't fault it too much.

From these humble beginnings at CERN, the problem they were trying to solve is: how do we get universal access to a large universe of documents? How can you have a system where I can say, okay, I have this document, and there are links, indications on that page of how to get more documents? There's actually an incredibly simple design here for answering these questions. How do I name a resource? How do I know what to call something? When I say a document, where is that document? How do I ask for it? How does the person or the thing that I'm asking know what I'm asking for? Then, let's say I know what document I'm interested in. How do I request that?
How do I get that document, say, hey, I would like this document? And how do they respond back to me to say, aha, here is your document? And finally, the third problem is how to actually create this hypertext, how to create a document that has these links.

There are three major technologies that were created here that underpin the entire web. If you can understand these three technologies, then you understand the web.

The first one is how to name things. This is the uniform resource identifier, the URI. There's this issue with URI versus URL, where it first was the universal, or uniform, resource locator, with the L. I'm sure you've heard of the URL before? Yeah. That's specific to HTTP, basically, and then they realized it could be more general, and that's when they created this URI concept. So now you can name a thing. This name, as we'll see, tells you who has that information and how to ask for that information.

Then HTTP is the hypertext transfer protocol. A lot of what you'll be doing in this module is understanding how to make an HTTP request to a server, and also how to interpret the server's response.

And then the hypertext comes back. Once you get data back from that server, how does that data tell you where to get more data? This hypertext notion, how do links work? That's HTML. For this module specifically, we're only going to cover the first two concepts. We'll return to HTML later and learn about it when we talk about web security and those types of vulnerabilities.

It's actually really simple; there are only three things at play here. You first need a URI, and that's your starting point. This tells you how to make a request, and specifically, as we'll see, it tells you what server to make this request to.
So you make the request: once you know the server you want to talk to, you make an HTTP request to that server. That server will then return HTML to you. That HTML will contain more links that have URIs, and when you click on them, this whole cycle repeats. Questions?

Say it again? Did domains exist before the web? Great question. I think the answer is yes, but I'd have to look at the specs for DNS to be certain. I think DNS was something that came about pretty quickly, because people realized they didn't want to type in IP addresses. They realized, hey, it'd be really terrible to do that.

Okay, great question. So this is why a lot of browsers have a home page. When you first boot a browser up, it needs to go somewhere. If you're Google and you make a browser like Chrome, you probably have that set to google.com. Otherwise you can change that. Nowadays there are stateful browsers that keep track of the thousands of tabs that, if you're like me, you have open all the time, so even when you shut it down it comes back up with all those tabs right there. But yeah, you need to know some initial place to go. This is why Yahoo was super famous. Has anybody been to Yahoo? Yeah. Anybody know what the original Yahoo was? Apparently it was mostly a static web page. Yeah, Yahoo was originally just like the Yellow Pages.
It just had categories and then links to other pages. So you would set Yahoo as your home page because you could find the other stuff from there. This is because search engines at the time were freaking terrible and you could never find what you were looking for. It would take three or four pages of clicking through to actually get to your information. Okay, so that's the first question, how to start this cycle, which is a great question to ask if you ever see a cycle like this, because something has to start it. And I forgot the second part of your question, so do you want to ask it again?

Ah, yes, they all came about at the same time. This was a set of problems that had to be solved, and to do so, Tim Berners-Lee created protocols for each of these. So there are specifications for URIs, HTTP, and HTML. You can even see the link between HTML and HTTP, because HTTP is what? Hypertext transfer protocol, and HTML is the hypertext markup language. So there's already that linking there, even though, as we'll see, there's nothing that says an HTTP request or response has to contain HTML.

Yeah, good question. Were there other protocols? I don't know all the history of everything, but there were for sure hypertext systems like HyperCard, which I want to say was a Mac app that you could use to make hypertext documents. But if I remember correctly, that was only on one machine. It wasn't a distributed system, in the sense of how to ask different machines for information. I guess Gopher was something that I don't know too much about, but you can look that up. It's actually used in CTFs because it's a weird protocol where you can control some things about the request. Yeah, Gopher,
I think, was one of these early kinds of things. But in my mind, one of the reasons why this took off was because there were well-defined protocols for all three of these things, so that anyone could build any of these parts. And this is why you don't use the original, what was it called, WorldWideWeb, Tim Berners-Lee's original NeXT app, the browser. Nobody uses that nowadays. I think Mosaic was one of the first big graphical web browsers. But anyone could implement anything that talked these protocols: web servers, web browsers, all these things. Cool. Yeah, great questions. Anything else?

All right, so the first aspect is the URI. Again, that was a good question: this is what kicks off this whole process. We need to understand how to ask for things, and specifically we want to know what protocol to use. That's what the uniform part is about (I guess it's not universal, but uniform): what protocol, how to ask for the thing, and where to get it from. So basically it answers the following questions. Which server has this information? How do I ask for that information? And then, since a server may have thousands of documents, how does it know which one I'm asking for?

One of the beauties of networking, and why I like studying these things, is that you don't have to reverse engineer a web browser or a web server in order to understand how a URI should be parsed or understood. There is a definition in an RFC. You can look at RFC 3986, and it will give you everything you could possibly want to know about URIs. If you have any questions about how something is parsed, or how something should be parsed, you can go look in those specifications.

Cool. So the syntax is actually pretty simple. And again, this may seem, I don't know, silly. I assume a lot of you have seen URIs before, URLs. Copied them, pasted them, and sent them to somebody? Please nod your head,
so I know that you're a person who has used computers in your life. Yeah, you probably clicked on a link that I sent you to the syllabus, right? Cool. So then understanding what these bits are is what this is about. And honestly, that's what I think a lot of computer science is: digging in and saying, okay, I've used this thing before, but how does it actually work under the hood?

Okay, cool. So the parts of a URI that are important are the scheme, which as we'll see correlates with the protocol, that's the http part at the front; the authority, which is who has that information; the path, which we'll talk about; then a question mark and a query part; and then a hash and a fragment.

Breaking this down, the scheme is very easy. It's the protocol used to request the resource. This is why you see links that say https colon slash slash; those will use HTTPS when you click them, versus HTTP. I think I even have this on the syllabus, I can't remember if I did this for this year, maybe not. But have you ever clicked a link that automatically opens your email client with an email already filled out with an address and a subject? Anybody do that before? Yeah, that's through a URI. The scheme there is mailto, and it's just a special thing. I think you can even do telephone numbers. The idea was to be super general, so that anything you wanted to be able to locate, you could identify here.

The authority is the server, and this is the entity that decides how to interpret all the rest. This is an important and kind of subtle point: the rest of this, the path, query, and fragment, you may think has semantic meaning to you, but fundamentally that doesn't matter. The only thing that matters is what it means to the server.
So that server knows how to respond It could treat the path as the query or I actually can't do anything about the fragment which we'll talk about later, but it could interpret that however it wants But the idea is if you ask for a resource it should give you back roughly the same thing And it's usually just a server name. So Uh This can be broken down So usually the server name is the host. You can do dns names. You can do ip addresses And you can even I think it's an older format You used to be able to put in username passwords here that would auto off I think that's been taken away, but You can actually try to like be a specific username at host colon port Inside there. So that's when we did. I think one of these examples. We did curl hgp colon slash slash 127001 Column 80 and that was specifying port 80 Every protocol has a default port. So the port of hgp is 80. So you do not need to specify that normally The path is usually some hierarchical representation of paths just like you would be familiar with on A unix system you have slash and then you have several sub directories there But again, what this path actually means doesn't matter A query used to pass used to pass additional key value data as we'll see And then the fragment is Actually something that's kind of interesting if you've ever seen the thing after the hash This actually never sent to the server And this is because This is used by your browser in order to identify which part of this document you want to go to Cool, so let's look at some examples So I have an example like here. What's the scheme? foo What's the authority? sample.com And of that authority, so that's the host of the authority. What's the port? 8042 excellence and the path Over there and the query test equals bar and the fragment knows cool Okay, you can have simple ones like this. So this is like an example of an ftp So the ftp protocol as the scheme the authority of the path. 
So again, as you can see here, the query can be optional and the fragment can be optional. You don't necessarily need those. Oh yeah, there's the mailto example.

Now, how do we parse something like this? What's the scheme? HTTPS. What's the authority? Nobody wants to raise their hand? If I wait long enough, the people on Twitch will type something. Good, example.com. But why? Why is it not example.com/test/example, with 1 as the port? How do you know? The authority comes before the path, and the authority syntax was host colon port. Yeah, good question. The scheme can change, but HTTP and HTTPS are pretty standard; they don't change things.

Let's try something else. What's the query? /adam. Why is that the query and not a continuation of the path, since the path continues with slashes? Yeah, the question mark.

So the key problem is that we actually can't tell based on this. This may not parse correctly, or it may parse in a different way than intended. For instance, what if I wanted the question mark to be part of the path? Yeah, great. So why are you thinking backslash? Where'd you come up with backslash? Did you just make that up in your head, or is it like strings in your programming languages? You have a constant string enclosed by double quotes, and if you want to put a double quote inside that string, you put a backslash before it: backslash, double quote. There are also things like backslash n for newlines.

So you have this problem where you have some syntax. In these URIs there are things that have special meanings. A colon has a special meaning, a slash has a special meaning, the question mark, like we talked about, has a special meaning, and the hash has a special meaning. If you had a hash in there, the browser wouldn't send anything after that to the server.
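To see how the special characters steer the parse, here is a short sketch with Python's `urllib.parse`. The first URL is a guess at the slide based on what was said aloud (an assumption, not the actual slide); the other two are hypothetical and only show what happens when a question mark or hash is meant as data but is read as syntax.

```python
from urllib.parse import urlsplit

# Guess at the lecture's example: the colon sits in the path, not the authority.
tricky = urlsplit("https://example.com/test/example:1.html?/adam")
print(tricky.netloc)  # authority ends at the first slash after "//"
print(tricky.port)    # no port was given in the authority
print(tricky.path)    # the colon is just part of the path here
print(tricky.query)   # everything after the first "?"

# Hypothetical: a file literally named "what?.html" cannot be written raw,
# because the first "?" ends the path and starts the query.
question = urlsplit("https://example.com/what?.html")
print(question.path, question.query)

# Hypothetical: a raw "#" ends the path and starts an (empty) fragment,
# which a browser would keep to itself rather than send to the server.
hashed = urlsplit("https://example.com/tags/c#")
print(hashed.path, repr(hashed.fragment))
```

In each of the last two cases the parser does exactly what the syntax says, which is not what the author of the URL meant; that is the problem the encoding scheme discussed next is meant to solve.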
So it's not backslash escaping, but basically we need to encode. The standard defines all of these characters that you have to encode correctly if you want to use them literally. There are actually a lot of them: colon, slash, I guess you can look at these, there's a single quote. And they use something called percent encoding, so it's different from backslashes. You've probably seen this in URLs with a bunch of percent signs. This is what the spec says, but the spec is what should happen, and the servers are what actually implement what does happen. Basically, anything that's not alphanumeric, a dash, a dot, an underscore, or a tilde should be percent encoded. The way that works is you write a percent sign followed by the hexadecimal representation of the byte.
So how do we figure out the hexadecimal representation of a byte? Yeah, look at the ASCII table, right? We actually have our handy-dandy man page, so we can check in there. If we wanted to encode the question mark that we were looking at, here it is: question mark is 3f in hex. (I first read it off as 7f, but 7f is delete. Thank you, 3f makes more sense.) So we'd replace the question mark in the string with %3F, and that way the web server knows to interpret it as part of the path, or part of wherever it is. And 2f is slash. Cool.
Same with the ampersand. By convention, most query parameters are key equals value, like name=value, separated by ampersands. If we want to use an ampersand inside a key or a value, we need to encode it, so it would be translated into %26. Now what if you want to encode a percent, or use a literal percent sign?
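Rather than reading hex values off the ASCII table by hand, you can let Python do the lookup. This is a small sketch, not part of the lecture material: percent_encode is a name made up here, and it simply wraps the standard library's urllib.parse.quote with safe="" so that the slash is encoded too.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str) -> str:
    """Percent-encode everything except letters, digits, '-', '.', '_' and '~'."""
    # quote() leaves "/" alone by default; safe="" turns that off.
    return quote(text, safe="")


def by_hand(char: str) -> str:
    """The same rule spelled out: '%' followed by the byte value in hex."""
    return "%{:02X}".format(ord(char))


if __name__ == "__main__":
    for ch in ["?", "/", "&", "%", " "]:
        print(repr(ch), "->", percent_encode(ch), "| by hand:", by_hand(ch))
    print(unquote("%2Fadam"))  # decoding goes the other way
```

The by_hand version only covers single ASCII characters; quote handles multi-byte UTF-8 text as well.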
Yeah, so it's just like the problem with an escape character in your strings: once you can write backslash double quote, if you want to include the backslash itself you need two of them. So a percent sign would be %25. Space is %20, that's a very standard one. And so on.
So let's fix this. Now if I gave you something like this, can you parse it? The scheme remains https. What's the authority? example.com. And what's the path? test/example, then this is an encoded colon, then 1.html. And the query? /adam, so it's everything after the question mark to the end: %2Fadam. You tested it on Google and it worked? Good, so I'm not teaching you just random gibberish, right? Cool. This is very important: if you want to pass data to a web application, you need to get this right.
Really? Time flies. Okay, cool, we'll go faster. With URIs, I guess we won't get into this precisely, but it's important to know that a URI can specify an absolute location, which tells you exactly: use this protocol, use this authority, this is the resource I'm talking about. Or it can be relative to the current resource, so depending on what page you are on, the link may resolve differently. This one is relative to the current scheme, which is how you give a URI that uses the same scheme, so whether it's http or https it will work. /test/help.html is relative to the current authority, so whatever server you're on. ../../ means relative to the current authority and path. Context is always important: it depends on where you are and where those all resolve.
Okay, cool, so getting into HTTP. HTTP is, again, that protocol. Now that we know exactly how URIs work, how do we request a resource from a server? It's based on TCP, as we'll talk about later.
It uses port 80 by default. Version 1.0 was defined in May of '96, and there was a very important thing that happened in '99 where they had to upgrade it. We'll talk about that in a second. And version 2 is actually no longer under discussion. It's done.
The way this works: the server first listens for incoming TCP connections. The client then makes a TCP connection to the server and sends the request. The server reads that request, figures out what the client is asking for, and sends a response. So it looks something like this. You have your browser, which is also called a user agent. It doesn't actually have to be a graphical web browser, and you can do all kinds of cool stuff with this. A lot of the time your web application, your server, your mobile device, all the apps on your phone will make web requests. And you have a server: you make a request and you get a response. In reality there's a lot of junk going on in between. There are caches and proxies between you and the server, and your request goes all the way to the server and the response has to come back through all of them.
But here are the important parts you need to know. An HTTP request consists of a method. These are the classic GET and POST, as we'll talk about, but functionally it can be anything as long as the server supports it. Then the resource you're trying to access, which is derived from the URI. The protocol version you're speaking. The client information. And the body of the request, so you can send things in the body. The syntax is actually very simple, which is why you will be typing these in manually, by hand, because it's fun to do this stuff. Or fun for me. I'm sure you'll have fun, trust me. So first you have the start line, followed by headers, followed by the body.
That's, at a very high level, how the protocol works. Each line is terminated by CRLF. Web servers are very forgiving, though, so you don't have to be protocol-precise in your request: as long as the web server understands what you're talking about, it will give you the flag. Headers are separated from the body by an empty line, just a CRLF. We'll get to an example in a second.
Methods. The method is essentially what the client is asking the server to do to that resource. Some common methods: GET says, hey, just give me whatever resource is here, and it's an all-caps GET. POST is traditionally associated with, hey, I'm going to give you data as part of the body of my HTTP request and I want you to interpret it. PUT is mostly used in things like REST APIs. It's not really used by web browsers. And HEAD is equivalent to a GET except that the server doesn't return a body, so you're just looking at what headers the server returns to you. There are other methods you can look up and check out, but we won't really go into them.
Cool. So a request looks like this. This is the start line, which is three things separated by spaces. First the method, so this is a GET request. Then the resource, slash. And then the version, so this is HTTP/1.1. Then, after a CRLF, come all of the headers. Headers are of the form header name, colon, space, and then the value. This one specifies the User-Agent, telling the server what software is making the request. The Host header is really important because it allows a single web server to listen on the same IP address and serve multiple websites, so the server knows which site the user's request is for. And then Accept. And there's something you can't see. So what can't you see in this request that is there?
So the format is the start line, which we have at the top. Then headers, and there are three headers here. And then what? The headers are separated from the body by an empty line. So there's an empty line, and it is always there even if the body is empty. Cool.
Okay, modern requests may look quite different. This is just a snapshot of some of the things, but again, it's just additional stuff being sent along to help servers and clients figure things out.
The response. The server responds with the protocol version and the status code. Have you seen a 404 before? Yeah, that's an HTTP response code. It's defined in the specification and says this resource does not exist. So: the code, a short reason, headers, and a body. The syntax is very much the same: a status line, followed by headers, again with a CRLF after each header, and then an empty line that separates off the body. Cool.
The status codes. Let's look at this and then we'll go back to that thing. Generally these are the categories of status codes. 100s and 200s are usually good. 200 is the most common one and says yes, you did everything correctly. 300 is a redirect that says go somewhere else. Why would you want to tell somebody to go somewhere else for a resource? Because the location changed, but you don't want that link to break, because there may be links to the old place. So rather than just saying "I don't know what you're talking about," you can say "aha, I know where that thing is, it's over here, go get it." 400 means the client messed up, so if you see a 400 when you're typing in a request manually, that means you messed up. A 500 is when the server messes up, where the server hits some exception. And these are some examples. Oh, more examples. I guess we don't need all these. Okay, let's go and check it out. All right.
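Before typing requests by hand, it can help to see the format written out byte for byte. The sketch below is only an illustration: build_request and status_class are names invented here, the header values are placeholders, and the category labels are my shorthand for the groups just listed.

```python
CRLF = "\r\n"


def build_request(method: str, target: str, headers: dict, body: str = "") -> str:
    """Assemble start line, headers, the empty line, then the (possibly empty) body."""
    start_line = f"{method} {target} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # The "" entry becomes the empty line that separates headers from body.
    return CRLF.join([start_line, *header_lines, "", body])


def status_class(code: int) -> str:
    """Map a status code to its hundreds-based category."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


if __name__ == "__main__":
    request = build_request("GET", "/", {"Host": "localhost", "Accept": "*/*"})
    print(repr(request))          # repr() makes the \r\n sequences visible
    print(404, status_class(404))
```

Printing with repr shows the thing you "can't see": the request ends in two CRLFs even though the body is empty.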
So this says I need to make a request with netcat. For netcat, I can check out the man page. You can also check out the man page of man if you're very confused. It's nc, destination, port. So I need to make a request to localhost, and localhost is just an alias for 127.0.0.1, port 80. What type of request did it want me to make? Because this is just an HTTP request, let's do a GET request to slash. Man, I forget. I was hoping I could do this off the dome. Okay, done.
Why did that work? I did the start line and then no headers, so that was the empty line when I hit enter. And then it responded HTTP/1.1 200 OK. So it responded with its version, a 200 response, OK, and gave me the flag. Yeah? So netcat is making a connection. The server is listening on some port and netcat is making a connection there. You can actually set netcat to listen, but that's something you don't need here. And the server, oh, it did say connection closed, so it should be done. But you can ask it to keep a connection alive and then make multiple requests through it.
You can also, if you want to get really fancy, let's see if that works. No. I thought that would work. Oh, it has to be capitalized. So you learn something new every day. Anyway, you can do it however you want as long as it goes through netcat. You can write a file and pipe it in there. You can use echo. I like typing it in because it's fun. It's like you're in the danger zone of making requests. Cool. Okay.
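If you want to see what netcat is doing under the hood, the same exchange can be sketched with a plain TCP socket. Everything here is an assumption made for the demo: the throwaway DemoHandler server stands in for the challenge server so the example runs on its own, and raw_get is a name invented for illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server for the demo; the real challenge server is different."""

    def do_GET(self):
        body = b"hello from the demo server\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def raw_get(host: str, port: int, path: str = "/") -> bytes:
    """Open a TCP connection, send a start line plus empty line, read until close."""
    request = f"GET {path} HTTP/1.1\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

The request is just the start line and the empty line, with no headers at all, which is the same minimal request typed into netcat above. This particular stand-in server happens to accept it; a stricter server may insist on a Host header.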
Yeah, so this was another type of request that we saw, and this is a real-world one. Some things to look out for, for sure. I guess I don't have examples in here, but we should post some examples of the content length. When you're uploading data as a client, the server needs to know how much data you're going to send, so there's a header called Content-Length to say how much data you're going to be sending. And here's a real-world example of the return code, all these crazy headers that get set, and then finally the HTML content of the page.
Cool, okay, one crazy thing. We'll just do this really quick. I guess that's all you need. So HTTP is a stateless protocol, meaning, oh wait, we go until 2:45? I've been rushing. I thought we were done. Okay, great, we have so much time. Awesome. That's like finding free time.
Okay, so HTTP is a stateless protocol, and what this means is that it's kind of like the movie Memento. It's about a guy with short-term, or I guess long-term, memory loss. You show up and he's like, "Hey, how's it going? I've never seen you before," even though you just met five minutes ago. So think of somebody with no long-term memory. It's the same with these web servers: you go to a web server and ask for something, and it's like, "Hey, I've never seen you before, here is what you want." That's baked into the protocol. There's no notion of how the server knows whether I'm the person who made a request yesterday. And that makes it very difficult to do things like build a web application like pwn.college that knows when I've logged in, because there's no way of doing that in the protocol itself. So we would like to maintain state. Specifically, if I make a request A five minutes ago and then I make a follow-up request B right now, how does the server know that A and B are linked?
So one way I could do that is to use the IP address. Is that a good idea? Why not? Yeah, right now most of us are probably going out through the same IP address. I don't know how many external IPs ASU has. And if you were at your house, everyone in your house would be logged into the same website. That could be bad for you, depending on the websites your house visits.
So the goal is to create this notion of a session that links all the requests back to the individual browser, or ideally the user, that made them. This allows things like authentication: we want to be able to give different users different access to the system. There are several ways to do this. I'm only going to talk about one of them, the only one that's really used. Sometimes sites embed information about who you are in URLs, where your user information is part of the URL. The danger there is that if you send that URL to somebody else, they could log in as you, which is very bad.
Okay, but the main way is cookies. Cookies were created as a way to solve this problem. Anybody ever hear about cookies in the context of the web? Yeah, so what are they? A lot more hands for the first question. Yeah, it's literally there to solve this problem, right.
It's just some information where the server says, "Hey browser, you store this for me, and then when you make a request, send it back." That way the server can know who you are. The server is the one that initiates this process and asks the user agent to store a cookie. Then the server or the user agent can terminate the session at any time. So the client can decide to just throw away that cookie and not send it when it makes a request, in which case the server will not know that this user is the same user.
Cookies were first created by Netscape way back when they were trying to make an e-commerce application. They realized this problem of, oh wait, how do we make an actual application where people can buy things? 1997 was the first attempt to standardize cookies. And this is actually fascinating if you look at the history: cookies seem so simple, but they're insanely complicated. So this RFC from April of 2011 describes how cookies are actually used on the web, because there are a lot of different options you can specify on cookies. We're not going to need to go into that level of detail.
So cookies are name-value pairs separated by an equal sign, and the server includes the Set-Cookie header in an HTTP response. As part of its response the server will say, hey, set this cookie: user=foo. Then, when the user agent makes follow-up requests to that server, it specifies a Cookie header on the request, and now the server can link those requests together. So when making a request, the user agent sends Cookie: user=foo. The server can ask for multiple cookies to be set. If you actually check your cookies, you'll see a lot of weird stuff in there. Your language preferences are done this way, all kinds of stuff.
There are several attributes. The path, so you can specify which paths on the server the cookie is valid for. The domain, so whether subdomains are valid for that cookie: do you want any subdomain of google.com, like docs.google.com, to have the same cookie? And an expiration date, for when the client should reject the cookie. Ignore those for now.
So this is from a request I made a long time ago, so don't worry, these cookies aren't going to work. This is from making a request to google.com. And this is super interesting: like I said, the name and value are separated by an equal sign, and I think the parsing must be very simple, because there are more equal signs after the first one. So the name of this one would be PREF and the value would be ID= this thing, colon, F= this thing. All of that is probably some custom Google thing. Then it sets an expiration date, a path, and a domain. And there's an NID as well here. I don't know what this bit is, that's kind of weird, but whatever. It's just an opaque blob that the server wants the user agent to send back. And it was set two years in the future.
Cool. The server can delete cookies, meaning tell the browser that a cookie is deleted, by setting an expires date in the past. So you can put an old expiration date on a Set-Cookie and the browser will delete it. The user agent is really the one responsible for following the server's policies. Have you ever cleared your cookies before? Yeah, so this is what you're doing. All your browser is doing is going through and deleting all of those. Now when you visit those websites, it's like you're visiting fresh.
Yeah? Can cookies depend on each other? Interesting. In what context? Yeah, it's interesting. The short answer is I don't know.
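To make the Set-Cookie and Cookie headers a bit more concrete, here is a minimal sketch with Python's standard http.cookies module. The function names, the user=foo pair, and the 1970 expiry date are placeholders chosen for illustration. Real browsers and frameworks handle many more attributes than this.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name: str, value: str, path: str = "/", expires: str = "") -> str:
    """Build the Set-Cookie line a server would put in its response."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    if expires:
        cookie[name]["expires"] = expires  # a date in the past asks the browser to delete it
    return cookie.output(header="Set-Cookie:")


def parse_cookie_header(header_value: str) -> dict:
    """Parse the value of a request's Cookie header into a name -> value dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    print(set_cookie_header("user", "foo"))
    print(set_cookie_header("user", "", expires="Thu, 01 Jan 1970 00:00:00 GMT"))
    print(parse_cookie_header("user=foo; lang=en"))
```

The second call shows the deletion trick described above: same cookie name, but with an expiry date long in the past.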
Usually they don't. A terrible way to do it would be to put your user ID or something in the cookie. The cookies are stored in the user's browser, and users can edit and change them at will. You'll see that as you make requests: when sending cookie values back, you can put literally whatever value you want there. So if you can easily guess other people's user IDs and change your cookie value to be another user, that would be really bad. So usually the cookie is some random value that links to a database entry or something, or the server cryptographically signs something and gives it to you so it can tell if the value has been tampered with. Those are the main ways it's done. That way the server can identify you, and then the web application can say, oh, this person has access to this. They're a student, not a teacher, so they have access to different things.
There's a hand up. Yeah? Oh yeah, it's still a threat now. I mean, I guess now there are so many web app frameworks that handle cookies for you that most web developers don't need to write their own custom cookie handling mechanisms, but you can always mess it up. So it depends on the web app, but yeah, there were a lot of problems earlier on, in the early history of the web. People just didn't really think through the fact that "I'm giving a cookie to you, assuming that your browser will always give it back to me the same way." Of course, nothing prevents you from editing it, or from making whatever request you like to the server. And that's really what you're learning now with these requests: you can make literally any request to any web server using just your terminal and netcat. Cool. All right.
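The "cryptographically sign something" option can be sketched in a few lines with an HMAC. To be clear about the assumptions: SECRET_KEY, sign, and verify are names made up for this example, the value.signature layout is just one simple choice, and real frameworks use their own formats and key management.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; a real application would load this from
# configuration and never send it to the client.
SECRET_KEY = b"demo-secret-change-me"


def _mac(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value: str) -> str:
    """Return 'value.signature', suitable for use as a cookie value."""
    return f"{value}.{_mac(value)}"


def verify(cookie_value: str) -> Optional[str]:
    """Return the original value if the signature matches, otherwise None."""
    value, separator, signature = cookie_value.rpartition(".")
    if not separator:
        return None
    # compare_digest avoids leaking how many leading characters matched.
    if hmac.compare_digest(signature.encode("utf-8"), _mac(value).encode("utf-8")):
        return value
    return None


if __name__ == "__main__":
    cookie = sign("user=alice")
    print(cookie)
    print(verify(cookie))                          # the untouched cookie verifies
    print(verify(cookie.replace("alice", "bob")))  # an edited cookie does not
```

The user can still read and edit the cookie, but without the secret they cannot produce a matching signature for a different value.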
So now that we're there, let's do only two things. Let's look up the HTTP specification. I did not talk about a few things. Okay, I think there's a link here on the page, but we'll just go with this. Cool.
Okay, so you can always read this. I know that may sound silly, but it's actually kind of interesting stuff about how all of this works, if you're interested. For instance, transfer encoding: there's all kinds of crazy stuff about how to transfer different information from one side to the other. What I'm interested in right now, because you're going to be making requests, is the request. We talked about the request line: the method, the request URI, and the HTTP version, where SP here means space. These are the different methods. When I brought up all those things, I didn't just make up those different HTTP options, right? They all come from the specification, which you can check out. If you want to know what a request URI looks like, it can be something like this. These are kind of the important characters.
Anyway, this is what I was mentioning earlier. If you're uploading data to a server, the Content-Length is very important because it specifies how much data is in the body. There's nothing else in the request that says when the server should stop reading, like how much data you are uploading. There's only the start line, the headers, then an empty line, and then the body, and the body could be of unlimited length. So you need that.
And then we didn't talk about this, or I think we talked about it a little bit. So we curled localhost. Oh, I messed that up. So we did this one. And then, okay, going over URIs again: foo, and then name=value, and bar=value. Okay, cool. So this obviously didn't work, but in this URI, which part is the query?
Yeah, so everything after the question mark to the end. The web server interprets this and separates the name-value pairs at each ampersand, so the web application on the other side can look and say, okay, is there something named name? So for instance, there are challenges that say, hey, make a GET request and set name equal to value.
The other thing, I need a curl challenge for this because I can't demo it here. Let's go. You know, one thing we should do is set it up so that it keeps the history without having to quit. There's some bash option to do that, I just don't know what it is. Cool, okay. I don't know if I want to show you this on curl, but anyway, I actually do have this. So this is making our GET request. The -v option just has curl show me what's actually happening. The arrows indicate the direction: this is what I'm sending to them, and the other direction is what comes back. Now, this is passing parameters as part of the GET string, and we can see that they end up here as part of the URI.
But I may want to pass similar things in the body instead. So I can say name=value and bar=value. And so now I'm making a request to /foo, again, like I said, with a Content-Length of 20. Ah, why doesn't it show me? I don't know why it's not showing me the body. Anyway, it will set the Content-Length to 20 to say there are 20 characters, then an empty line, and then it sends these 20 bytes, which should be this value. Is this 20? I don't want to count it, but, yeah, 20, good. Way easier than counting. So it sends that as the body, and that is how you can pass data as a POST, as part of the HTTP body. So that shows you the different ways you can pass parameters. And I think that's all you need. So good luck.
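If you would rather not count the bytes yourself, here is a small sketch of the same two ways of passing parameters, written with Python's standard urlencode. The helper names get_target and form_post are invented for this example, and the Content-Type line is the form encoding that curl uses by default for data sent with its -d option. The exact command used in the demo is not shown in this transcript.

```python
from urllib.parse import urlencode


def get_target(path: str, fields: dict) -> str:
    """Parameters in the URI: everything after '?' is the query."""
    return f"{path}?{urlencode(fields)}"


def form_post(path: str, host: str, fields: dict) -> str:
    """Parameters in the body: the same encoding, plus a Content-Length header."""
    body = urlencode(fields)
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",  # bytes in the body, not lines
        "",                                              # empty line before the body
        body,
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    fields = {"name": "value", "bar": "value"}
    print(get_target("/foo", fields))
    print(form_post("/foo", "localhost", fields))
```

For name=value&bar=value the computed length comes out to 20, matching what curl reported.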