All right, folks, I'm going to get started here. Thanks for coming out. The USB keys have two files you'll need: the Vagrantfile and the HTTP exploration box file. If you don't have Vagrant and VirtualBox, there are executables on there for Windows, a couple of variants of Linux, and Mac. You'll need to have those installed. I'm going to do a half-hour lecture, hopefully a little shorter than that, and then we'll have about an hour to run through the exercises. The exercises will walk you through starting up Vagrant and everything you need to do. I have Charlie Sanders helping me out. Raise your hand if you need a USB key to get the files from. So basically, you've got half an hour to get all that working. If you don't, don't worry too much. You can pair up with a neighbor.
I would appreciate feedback, negative or positive. You can tweet me at @CraigBuchek. The email address is up there. My presentations are on GitHub. I haven't put the latest version of this presentation up yet, but it should be there tomorrow. If you want to follow along, tiny.cc/http_exploration, with an underscore, will take you to this presentation. Actually, I think I need to update that, so I'll do that when we get to the exercises. The exercises will be in there too.
So the reason I started doing this is I have a previous life as a sysadmin and a network admin. I've done a lot of troubleshooting of HTTP, going through networks, going through firewalls, and now I'm a Rails developer. So I've seen both sides. We're going to talk about HTTP basics, requests and responses, proxies, some troubleshooting, and HTTP/2. And then we'll get into the exercises that touch on all of those.
So HTTP's been around since 1991. The first version didn't actually have a version number, but retroactively we call it 0.9. It was standardized in 1996. The version that we typically use now, HTTP/1.1, was standardized in 1997.
That RFC is very handy if you have any questions about how HTTP works. It was just updated a few months ago. It was broken up into multiple pieces, basically in preparation for HTTP/2. HTTP/2 has been ratified, standardized, but it hasn't been published yet. But that's a good URL to check out if you're interested.
So HTTP is stateless, which means that each time you connect to the server, it doesn't really remember what you did the last time you connected. The only way around that is through cookies, pretty much. It is text-based; that's kind of the whole point of the presentation. We're gonna look at the text that's going across the wire. And it's a request-response cycle. Your client, your browser, will make a request to the server. The server will respond.
So we're gonna talk real quick about URLs. Pretty much every piece of that URL on the screen will go from the client to the server except for the fragment. The fragment just tells the browser to go to this specific spot on the page. The scheme is going to be HTTP. The host is obvious. You can specify a username and password in the URL. I recommend against that, but it's possible. And then the path is the piece that we'll be working with for the most part.
All right, so the HTTP request looks like this. That thing in green is called the method. We'll talk about that in a bit. The next thing that's passed is the URL. Usually it's not the full URL. Usually it starts with a slash, and it's relative to the top of the site. And then you have the version of HTTP specified there. In blue are the headers. The headers are sort of metadata about what you wanna get or what you wanna do. In this case, there's a Host header and a Content-Length header. And the body: not all requests have a body. If you're just getting something, it won't have a body. But if you're posting something or putting something, it will have a body.
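The request anatomy described above (method, path, version, headers, then an optional body) can be sketched in a few lines of Python. This is only an illustration, not code from the exercises: the helper name `build_request` and the example body are made up here, and it assumes an ASCII-only request line and headers.

```python
def build_request(method, path, host, body=b""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [
        f"{method} {path} HTTP/1.1",  # method, path relative to the site root, version
        f"Host: {host}",              # which site on this server we want
    ]
    if body:
        # Only requests that carry a body (POST, PUT) need a Content-Length.
        lines.append(f"Content-Length: {len(body)}")
    # Every line ends with CRLF; an empty line marks the end of the headers.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request("PUT", "/", "www.google.com", b"hello")
print(request.decode("ascii"))
```

The blank line after the headers matters: when typing a request by hand, the server keeps waiting until it sees that empty line.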
In this case, I tried to PUT something to Google's front page. Probably not gonna work. I actually tried that and it gives a 405 error, which is a hard one to find. So that's the body text, basically the content that you want to upload.
So the methods I talked about. Normally when you go to a website, you're going to do a GET request. Say, get this page. You go to Google's front page, you do a GET. A POST is when you want to update something or when you submit a form. Although some forms can actually do a GET, depending on whether you're changing something or just doing a query. Like when you go to Google's front page, that form, when you type in the query, that's a GET, because you're just asking for information; you're not asking to change anything. A PUT is an update, or actually it's more of an upsert. You're saying, I'm giving you the whole thing; replace whatever you've got if you have anything, otherwise create it. Rails is a little weird on that. It doesn't actually do the create part of that. So it uses PUT a little bit oddly. DELETE does what it says. If you have permission, you actually can delete pages or resources through the web. HEAD is basically a GET request, but without the body. You're saying, show me the headers that I would get if I did the GET request for this URL.
There's a concept of safe methods. Safe methods don't have any effect on information. When I talked about the POST, it changes something. When you submit a form, it changes something on the server. A GET is not that case. So the takeaway here is, don't have your app change things when you do a GET, because when Google comes scraping your site, it's gonna do GETs on your site. If your site changes something on the back end, you're gonna have a bad day. So just keep safety in mind.
There's also something called idempotency. That means that you can call the same thing multiple times. You can do a GET to the Google front page multiple times. You'll get the same thing.
Same thing with HEAD. Same thing if you DELETE a resource: the resource is gonna be gone whether it was there or not. A PUT, you're saying replace something. If it's there, replace it; if it's not there, create it. So doing it multiple times doesn't matter. The one that's missing from this list, again, is POST. If you POST something multiple times, you'll end up with multiple copies of that thing.
So, request headers. We saw that we can provide those headers. These are some common headers. The Host header is basically required. It gives the name of the website we're accessing. The reason that was added is so you could host multiple websites on a single server. The Accept header lists the types that the browser would like to receive. So the browser may say that it likes to have HTML, it likes to have JSON, but not list XML. If the server knows how to do HTML and JSON, it will pick one of those, but it wouldn't pick XML unless it didn't know anything else. The weird thing about the Accept header is you can actually specify relative quality. Sometimes you'll see a q number if you look at the Accept header. That says, I prefer these types of things over the other things. The default quality setting is 1, which means top priority. If you prefer JSON over XML, you could set its quality to 0.9 and XML's to 0.8.
The Content-Length on a request is the length of the body of the request that you're sending. If you don't specify that, it doesn't let you send anything in the body. Once you've uploaded Content-Length bytes of body, the server treats the request as complete. The Content-Type tells what you're uploading. Am I uploading some HTML? Am I uploading some XML? Am I uploading some JSON? The Referer, if you are a browser, is the page you're on when you click the link. That is spelled incorrectly. But it's in the standards document, so that's what we use.
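The relative-quality idea for the Accept header can be sketched as a small parser. This is a simplified illustration, assuming a well-formed header; the function name `parse_accept` is made up here, and real servers handle more cases (wildcards, other parameters) than this does.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted from most to least preferred."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so types with equal quality keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


prefs = parse_accept("application/xml;q=0.8, text/html, application/json;q=0.9")
print(prefs)
```

Here HTML has no q value, so it gets the default of 1 and sorts first, followed by JSON at 0.9 and XML at 0.8.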
The user agent is another name for a browser or a web client. It could be a web crawler; it could be various things. The User-Agent header is a string containing information about the browser. And if you take a look at those, they get to be pretty crazy long, especially in Chrome and Firefox. They try to pretend to be other browsers. Back in the day, there was something called browser sniffing. The server would look at that string and try to determine, oh, you're Internet Explorer, I'm gonna give you this copy of the page, and you're Firefox, I'm gonna give you this copy of the page, which could be completely different. Google didn't like that, because Google wants there to be one canonical version of every page that it indexes. It doesn't want you to fool it, so that it sees one thing and then the user sees something else. So browser sniffing's not used much anymore at all. But you end up with these really long strings that tell you sort of all the backwards compatibility that the browser tries to do. We'll take a look at those headers in the exercises.
Yeah, yes. And then Content-Type is only for a POST or a PUT, and it's what type the body that you're sending up to the server is.
So Authorization: if you ever go to a website and you get the pop-up box, Authorization is basically the information you type into the username and password box. It's not encrypted; with Basic authentication it is only Base64-encoded and sent. There's an exercise where we're actually gonna find that going across the wire.
Accept-Encoding: the server can gzip the content coming back to you, and this basically says that's okay. You can say Accept-Encoding: gzip, deflate. That tells the server, hey, you can gzip this and save some bandwidth across the wire.
Connection usually is for keep-alive. It is the client saying, hey, I've made this request, but when I'm done, I'm gonna make another request. So don't bother tearing down the TCP connection. I wanna reuse it for the next request.
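Going back to the Authorization header for a moment: with HTTP Basic authentication, the username and password are joined with a colon and Base64-encoded, which is an encoding and not encryption. The sketch below shows how easily that value can be reversed by anyone who sees it on the wire. The helper names and the credentials `alice` / `secret` are invented for illustration.

```python
import base64


def basic_auth_value(username, password):
    """Build the value of an Authorization header for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization value")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


value = basic_auth_value("alice", "secret")
print("Authorization:", value)
print(decode_basic_auth(value))
```

Because decoding needs no key, Basic authentication is only reasonable over an encrypted connection.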
So if I'm getting a web page and I know it's probably gonna have some JavaScript, it's probably gonna have some CSS, it lets me get those all in one connection without dropping it.
Cookie: I talked about cookies. The server has sent us this cookie and we are supposed to return that cookie back to it. That maintains sessions between the server and the client, and it's the only thing that maintains a session between the client and the server. That header is limited to about 4K of information. So if we do our cookie-based sessions in Rails, we do have to worry about that limit. You don't wanna store a whole lot in that cookie.
All right, so we've made the request, and the server responds with the HTTP response. That first line is called the status line. It's got the HTTP version. Interesting thing: you can actually make a request for one version and get a different version back. I find that a little odd. The 201 there is our status code. And then there's the description of the status code, which is Created in this case. Like the request, the response also has headers. And then it has a body. Not every response has a body. There are a few that don't have to have a body. Created is actually one of those. So if you are doing a PUT or a POST, it could just say Created and then not give you anything back. But you probably wanna do a redirect, so you probably want some extra information in there maybe.
So, status codes. The status codes have different meanings and they are standardized, although occasionally you'll see some non-standard ones. The 100s are informational. You'll rarely see those. The 200s are what you'll see most of the time. Usually for a GET, you're gonna get a 200. For a POST or a PUT, you're probably gonna see a 201. Redirection: if you go to google.com, it'll actually redirect you to www.google.com. That's a redirect. I believe that would be a 301.
If you send headers that are involved with caching, which we'll do in an exercise, sometimes you'll get a 304 Not Modified. That says, hey, you've told me you already have a copy of this in your cache. So I'm just gonna give you the metadata and the headers. I'm not gonna give you the body. You should use the body that I gave you last time.
So then we've got the error response codes. The 400s are client errors, where the client made some sort of mistake. Unauthorized is if you have not sent an Authorization header and the web server requires you to authenticate; the client is supposed to retry with the Authorization header. Before it does that, it pops the box up, has you type in your username and password, and resends with that header. Forbidden means either you can't authenticate, or you've authenticated but you still don't have permission, maybe because you're not an administrator on the box. 404, we've all seen that one before: page not found. 407 is Proxy Authentication Required. It's like a 401, except it's the non-transparent proxy that is asking for a username and password. That's pretty rare to see. We will do a little bit with proxies, but we're not gonna do any proxy authentication. 422 is Unprocessable Entity, which is a weird way of saying I don't understand what you're trying to ask me. That's what I would recommend, either 422 or 409, if you are making an API request and it doesn't have the right information that it needs in the JSON or XML or whatever you sent.
Server errors. If your server crashes really badly, you'll see a 500 error. I believe you see that in Rails a lot of times when you're debugging. 502 and 504 are gateway errors. If you've got a reverse proxy sitting in front of your servers and your servers have gone down or take too long to respond, you'll see a 502 or a 504. There are plenty of others. I ran into a 405 earlier.
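The way the first digit of a status code gives its class can be sketched as follows. This is a minimal illustration with made-up helper names, and it assumes a well-formed status line like the one on the slide.

```python
def parse_status_line(line):
    """Split a status line into (version, code, reason phrase)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """Map a status code to its class using the hundreds digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")


for line in ["HTTP/1.1 201 Created", "HTTP/1.1 304 Not Modified", "HTTP/1.1 502 Bad Gateway"]:
    version, code, reason = parse_status_line(line)
    print(code, reason, "->", status_class(code))
```

A client that meets a code it does not recognize can still fall back on the class to decide roughly what happened.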
There's one of the April 1st RFCs with an "I'm a teapot" response, a 418 code. Not sure when you'll need that. I'm sure someone has implemented it though.
Response headers. Like a request, the response has headers. The Content-Length will almost always be there, because you'll have a response that has content in the body. The Content-Type is the MIME type, so it's gonna be like text/html, text/plain, application/json, or is it image/jpeg, image/png? That tells the client what type of file this is, and then it can handle that however it expects to. Content-Encoding: I talked about Accept-Encoding, where you can gzip it. The Content-Encoding says that the body has been gzipped. Notice that the body is gzipped and the headers are still in plain text. Content-Disposition is a little trick that you use when the person clicks on the link and you want them to download the file instead of having the file display in the browser. In that case, you would use the Content-Disposition header, and you can also provide a default file name that the browser will try to save it as. Location is used for redirecting, and we'll have an exercise on that. Usually you wanna provide a response code that says redirect, and then you provide the URL to redirect to in the Location header. Set-Cookie: we talked about cookies for maintaining state, maintaining sessions. Set-Cookie tells the browser to remember this token. It's roughly just a random string of characters, and when it gets sent back by the browser, the server knows to associate it with a session. It looks it up in the database. WWW-Authenticate is basically telling the browser to pop up the box or, if they've already typed their username and password, to provide that information to the server.
All right, so we'll run into proxies. We'll have some exercises on proxies. A proxy is something that acts in place of another.
In the case of an HTTP proxy, a web proxy, it intercepts our HTTP requests. So what it does is it intercepts a request, modifies it probably, does some caching perhaps, and sends it on to the server; gets the response back from the server, can modify it again, and then sends it back to the client. So it sits in between the client and the server, and it can modify pretty much anything that it sees in there.
So proxies are good for caching. You can add security. In our exercises, we will actually add some SSL to our Rails app. You do this to simplify, so you don't have to have the Rails app understand SSL. It also can save some CPU time. It also can be used for load balancing. It can be used for authentication. I've had it where I had Apache in front of an application server and we added the pop-up authentication for it, with Apache acting as a reverse proxy, which I'll talk about in a second.
So there are transparent proxies. Basically you don't have to set anything up. It's inserted itself into the stream of the network. And there are non-transparent proxies: a lot of times, if you're at work at a big company, you'll have to configure your browser to point at the proxy. That's a non-transparent proxy. And we've got an exercise on that.
Reverse and forward proxies. The proxy you have at work that sits sort of right next to the firewalls would be a forward proxy. A reverse proxy sits next to the server. So any proxy that's added on the server end is a reverse proxy. And we will actually have exercises on both of those.
A CDN is a content delivery network. It's basically a paid service that does proxying for caching purposes. So you can cache all your static content. There are technical things in here I won't get too much into. One of the nice things they can provide is protection from distributed denial-of-service attacks. If you've got a big site, you should probably be looking into these.
Troubleshooting. Any network problem you have, you have to think about the OSI model. I wish I had a picture of that. I forgot to put that on there. So you've basically got the physical layer, you've got the network layer, you've got the transport layer, which is TCP. And then you've got your application sitting on that. So there are a lot of things that could go wrong. You could have a network cable pulled out. You could have your routers down, or the server's not running. All those different layers you have to think about when you have a problem. Troubleshooting is trying to figure out which layer it is, and then narrowing down on that.
So one of the first tools is ping. Can I connect to the IP address that the server is on? The problem with that is sometimes firewalls will prevent it, either the firewall in your company or the firewall on the other end. Traceroute is similar, but it shows you all the hops in between. So maybe the network is down between your internet provider and Google. Traceroute might be able to help you find that out. Telnet we're gonna use in our exercises. It's a good way to tell if the port is listening. If I've got connectivity to the IP address, Telnet will tell me if the service is listening. And that's actually kind of where I start. I kind of start in the middle. If Telnet doesn't work, I'll try the lower layers. And if it does work, I'll try the upper layers.
If you're on your server and you wanna see if your service is listening, you can do a netstat. On Linux that's netstat -plant. It's an easy mnemonic for me to remember. On Mac, it doesn't have all those options. You'd use -na and then grep for LISTEN. Over on the left side, it's gonna list all the IP addresses, which is usually gonna be your IP address, and then a colon and the port number. So if you're looking to see if Rails is running, look on the left side for something colon 3000.
That's assuming you use the default port for Rails. Telnet doesn't work with HTTPS because it's encrypted. So you have to use this tool that OpenSSL provides called s_client. And we've got an exercise on that. tcpdump, if you really wanna see all the details on what's going across the wire, will tell you everything. We've got a short exercise on that. There's so much information we could probably spend a couple hours on it. Wireshark is a good way to visually see what's going on. You can actually take the output of tcpdump, save it to a file, and pull it into Wireshark. tcpdump shows you each individual packet, and they're disjoint. When you're having communication between a client and a server, your packets can only be so big, so the communication will be broken up into pieces. Wireshark has a nice feature to put all those back together, which is kinda cool.
All right, so as I said, HTTP/2 was recently approved and ratified. It came out of a project at Google called SPDY, S-P-D-Y; apparently they wanted to make it speedy to write the word speedy. As I said, we can compress, we can gzip the body, but in HTTP/1.1 we can't compress the headers. HPACK is a part of HTTP/2 that allows header compression. Another thing: the standard doesn't require it, but every HTTP/2 implementation out there requires TLS, and that's probably because TLS, or SSL, has a protocol negotiation built into it, and that protocol negotiation can say, hey, do you have HTTP/2? I'd like to use that. Do you have SPDY 3? I'd like to use that, but if not, fall back to HTTP/1. So I said that HTTP is all text. With HTTP/2, that turns out not to be the case, because we're compressing the headers. It has the same semantics. You won't be able to use Telnet, you won't be able to use tcpdump, but you will be able to use some of the other tools that we'll be looking at today. I did get an example to work with HTTP/2 that we can do as an exercise.
When you're making a lot of small requests, like let's say you've got a lot of icons on your page, ideally you would just get each icon individually, but those icons are only 25K or something. At that point, the headers, at a couple K, are starting to become a large overhead. If we could compress those down to a couple dozen bytes, that would save a lot of bandwidth. And there are some other features in HTTP/2 we'll get to in a sec, basically just to save bandwidth.
The other thing about HTTP/2 is it multiplexes the connections. You can actually have multiple files coming across at the same time on a single TCP connection. Right now your browser has to make multiple TCP connections. I don't know what the browsers are up to now. They started at like four at a time; they got up to eight at a time. But a TCP connection takes up resources and it takes time to set up. So if you could do just one TCP connection, that would save some time. That's one thing they've done. So you can grab your HTML file and you can start processing it and see, hey, I've got some JavaScript, I've got some images, I've got some CSS, I need to grab all those too. So I can grab all those simultaneously. That's gonna save a lot of time.
Server push: the server can actually know, hey, he's getting this HTML file, he needs a CSS file too. I'm gonna start giving it to him even if he didn't ask for it. And the client can say, well, I'll take that. Or, hey, I finished with the HTML file, I didn't need that CSS file, you can stop giving that to me. Which is kind of weird with caching. I don't know how it knows what's been cached and what hasn't, but the possibility is there to save some time.
So when we do web design and web development, we've got the asset pipeline, right? That combines all our JavaScript files into one big file. It combines the CSS. We probably combine that into one file.
The images, we probably do something called spriting, where we put a bunch of images in one file. And then the CSS has to go grab each piece to put each icon on there. I know I've looked at the homepage for Yahoo, and it's got all the icons on the left. Those are all in one file. And the CSS gymnastics we have to do are a pain in the butt. HTTP/2 will hopefully allow us to stop doing that. We can just write it the way that it should have been written in the first place and not worry about performance. HTTP/2 should do that for us.
HTTP/2 is kind of weird in that it starts with a 1.1 connection and then it upgrades. The semantics of that are pretty crazy. You probably don't want to get involved in that. We'll see a little bit of that in the verbose output of the tools we use.
So where is HTTP/2 working? Chrome 41 has it. Firefox 36. IE 11, but only in Windows 10, so nobody has that yet except in beta. Nginx says they're gonna support it by the end of 2015. They already support SPDY 3.1, which is pretty darn close. I could not get curl to work with Nginx using SPDY or HTTP/2. So for curl, right now you have to specify the --http2 flag, and for this, I had to manually compile that feature in. Wireshark 2 will have it. They're currently in the beta series, 1.99. Apache doesn't seem to have plans, which seems a little weird, but it does have mod_spdy available.
All right, so it's time for the exercises. I need to update that URL real quick, but Charlie will be walking around helping people get started. The exercises are basically starting on page 27 of this, and the first step is basically to add the Vagrant box with that command there. Do a vagrant up, and then a vagrant ssh. You'll be in the box, and then you can move on to slide 28.
Will Rails support HTTP/2? Probably not for a long time. Rails really doesn't even support HTTP. Your Unicorn or your Puma are what support that.
So when those support HTTP/2, we will have it built into Rails. Rails itself sort of sits right behind that. Right now, if you want it, you would put a proxy in front. You would put Nginx, is my recommendation. In fact, Nginx said they have 95% of the, well, it was SPDY at the time, servers on the internet. So you would use a reverse proxy. Well, Rails also doesn't support HTTPS, and you probably want HTTPS on your site, so you probably need that proxy anyway.
Another question? Oh, the link? You should see this version, without the vagrant init. If you did the vagrant init, you'll need to get the Vagrantfile back. And I think you'll need to do vagrant box remove. But raise your hand if you did the init and we'll help you out. So once you guys get into Vagrant with the vagrant ssh, just work through the slides. Raise your hands if you have any questions and we'll come around to help.
Maybe try installing the VirtualBox that's on here. I guess that's the one it was made for.
Hey, folks, for HTTP/1.0 and 1.1, make sure you enter the blank line when you're using Telnet. If you don't do the blank line, it'll just sit there waiting for you. I think it was the blank line.
It was the blank line. Yeah, thank you. You can ask me questions later. You can finish this at your leisure.