Hey, guys, if you wind up liking this talk, I'm going to be tweeting out links to slides and all that stuff later, so check out my Twitter. So I'm very excited to be here, right? I come down to San Francisco a couple of times a year. And every time I come in, I always feel like I'm flying into Starfleet Academy. You know, SFO is really aluminum and roundy and everything's from the future and the sky's clear. And then I go back home to Seattle and it's a whole different story. Little known fact, that's actually a young Aaron Patterson. So my talk is about the short but happy lives of HTTP requests, with a specific focus on how to make those lives shorter. We all know how this works, right? You type a URL into your browser, that triggers your Rails controller, which causes a view to be rendered, which causes a nice picture of a calming manatee to be sent back to the browser. And so, you know, that's great. That's the mental model I kind of use when I'm developing. But if you've been at this a while, you might start to notice that this connection between the browser and the controllers and the views is a little bit mysterious. It's kind of like a black box. And if we're going to build fast, performant web applications, we kind of need to know what's in that black box. I like to think it's full of manatees, but that's just me. So let's talk about the internet. As soon as you, I was like, you got 28 more minutes of this. As soon as you start to research the network, try to learn about this stuff, you run into this thing called the OSI model. It's very simple. There's actually a lot of dispute about the OSI model. Some people think it looks like a hamburger. At this moment in time, though, we're all pretty much in agreement that it's made out of cats because, I mean, how else could the internet be so efficient at transmitting pictures of cats?
So to find an image of the OSI model that didn't involve some stupid meme, I actually had to go back to a 1995 issue of Dr. Dobb's Journal and scan in a picture. And so what you see here is we've got a 10,000-foot view of the network stack. At the top, we have things that are more abstract. At the bottom, we have things that are less abstract. You and I are used to dealing with the application and presentation layers. Let's just go down the stack. The session layer is kind of like SSL, transport and network are kind of like TCP and IP, and the data link and physical are kind of like your Ethernet card and the wire, right? So I want to talk for a second about wires. It turns out that wires are surprisingly important to modern web development. Because your wire determines your latency. And latency is the time it takes for one byte of information to travel from my computer to your computer. It's time in the wire, right? It's not bandwidth. It's not measured in megabits per second. It's measured in milliseconds. And the thing to know about latency is that it's got a lower bound. It's got a lower bound determined by the speed of light, right? Which is great because the speed of light is like the fastest thing in the world. It's 300 million meters per second, the speed of light in a vacuum. And just a little pro tip for you guys who didn't go to science school like I did: just call it the speed of light in a vacuum. Don't say "SOL in a vacuum" because that's like something completely different. But you run into problems when you actually, say, calculate the shortest distance between London and New York City, or New York City and London. When you do this, you find it's about five and a half million meters. And do the math, and that means that your minimum theoretical round-trip latency between New York City and London is roughly 37 milliseconds. So you're never going to send a message from New York City to London and get it back in less than 37 milliseconds.
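That back-of-the-envelope number is easy to sanity-check. Here's the arithmetic as a quick sketch, using the rough great-circle distance from the talk:

```python
# Theoretical latency floor: distance divided by the speed of light in a
# vacuum, doubled for the round trip. Figures are the rough ones from above.
C_VACUUM = 299_792_458      # speed of light in a vacuum, meters per second
DISTANCE_M = 5.57e6         # approximate NYC <-> London great-circle distance

one_way_s = DISTANCE_M / C_VACUUM
round_trip_ms = 2 * one_way_s * 1000
print(round(round_trip_ms, 1))  # roughly 37 ms: the floor physics allows
```

No amount of server tuning gets you under that number; only moving the endpoints closer together does.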
And in fact, it's a lot slower than that, because we're not sending light through a vacuum. We're sending it through fiber. And light travels through fiber much more slowly than it does through a vacuum. So we find out that we have latencies that are kind of like this, right? New York City to London is about 79.6 milliseconds. That's really too precise. Let's say 70 to 90 milliseconds. Within the US, latencies of about 40 milliseconds are common. Japan, you've got 16-millisecond latency. It's not because they're super highly advanced, it's because, well, the country is a lot smaller. And I bring this up because latency kills user experience, right? If I click on a button and it takes 100 milliseconds for something to happen, I know in my heart of hearts that I didn't really cause that thing to happen physically. And if it takes 250 milliseconds, it feels sluggish. If it takes 500 milliseconds, I'm wondering how the stock market's doing. Yeah, I just started investing in the stock market. I'm like $50 down. 1,000 milliseconds, I'm out of there, right? And this has real-world implications, because Google found that if searches took longer than 400 milliseconds, people didn't search as much. Big online retailers have done studies that show a correlation between conversion rate and latency. So how do you get rid of latency? There's an easy way, which is to move your servers closer to your users. And that's why we use CDNs, content delivery networks. But you can't really do that for everything, right? Chances are you have a centralized database, you have, you know, centralized infrastructure, and it would be a real pain in the neck to duplicate that all over the world and keep it all in sync. And so we're going to talk about the slightly harder task of eliminating round trips. Now, before we do this, I've sort of dissed on bandwidth a little bit. I haven't really talked about bandwidth. And I want to do that for a second. And to do that, we've got to go down to the data link.
This is the creepiest picture I could find. The data link is your Ethernet card, it's like your cable connection, right? And it determines your bandwidth. Bandwidth has to be super important, right? Because these cable companies spend millions of dollars trying to convince us that life at 10 megabits per second is at least 10 times better than life at one megabit per second. Only it turns out that's just a lie. Some smart people at Google have done studies measuring real-life page load times as a function of both bandwidth and latency. And they found that after you hit about three or four megabits per second, you get diminishing returns. Adding more bandwidth doesn't significantly decrease page load time. However, page load time decreases linearly as a function of latency. And so if nothing else, this slide will save you $30, $40, $50 a month: call your cable company, downgrade your plan. Why is this the case? Well, it has to do with the way that the internet has evolved. The way we use the network has evolved, right? I just loaded up Slate.com. And Slate's homepage makes 286 requests for 1.9 megabytes of data. This is the web we have now. We make a ton of requests for a tiny bit of data each. And it turns out that lots of small files are a lot slower to download than one big file over HTTP. So why is that the case? It's kind of because of the way the protocols interact with one another, the way that HTTP uses TCP. And so let's blame the protocols for a second. I'm just going to run down the stack. You guys already know this, but the viewers at home probably don't. We have IP. The IP protocol just routes packets between computers. It doesn't make any guarantees about delivery or order or anything like that. And so we have TCP, which is an abstraction of a stable network that runs on top of this unstable network, IP. And it guarantees delivery. It guarantees delivery in order. It's all about reliability.
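To make the lots-of-small-files point concrete, here's a toy model, not a measurement: assume every request opens a fresh connection that costs one round trip of setup before any payload moves (more on that overhead in a second). The bandwidth and latency numbers are made up but plausible:

```python
# Toy model: per-connection setup (one round trip) plus raw transfer time.
RTT_MS = 40                  # a typical within-US round-trip latency
BANDWIDTH_KB_PER_MS = 0.5    # about 4 megabits per second

def fetch_time_ms(n_requests, kb_each):
    setup = n_requests * RTT_MS                         # one handshake per connection
    transfer = (n_requests * kb_each) / BANDWIDTH_KB_PER_MS
    return setup + transfer

print(fetch_time_ms(1, 1900))   # one 1.9 MB file: under 4 seconds
print(fetch_time_ms(286, 6.6))  # Slate-style 286 small files: over 15 seconds
```

Real browsers reuse and parallelize connections, which is exactly what the rest of this talk is about, but the shape of the problem holds: when files are small, per-request overhead dominates, and more bandwidth doesn't help.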
And finally we have HTTP, which is what we use to request and manipulate files over TCP. And HTTP 1.1 is sort of fundamentally inefficient in the way that it uses TCP. And now we're coming to one of the big things. If you don't remember anything else from this talk, just remember that new TCP connections are expensive. The reason for this is that, well, in order to be super reliable, TCP is also a lot more chatty than other protocols. For example, when you open a TCP connection, you have this thing called a three-way handshake that happens. Your client says, hey, server, we should talk. The server says, all right, sure, I'm fine with talking. The client says, all right, I see that you're fine with talking. Let's continue to talk, and I would like to see that funny cat picture now. So now we've got this connection established. And the good news is that you don't need to know about any of this to do your job as a web developer. But there is one thing you need to know about this. And that is that you have just incurred one round trip of overhead just to start up this connection. And if you're dealing with a latency of 100 milliseconds, you just added 100 milliseconds to your load time. And maybe this would be okay. Maybe we could handle this if there were just one extraneous round trip of overhead per new connection, but you're not getting off that easy, because there's this thing called congestion control. It's a very complicated subject. And so I'm just going to skip over most of it and tell you the one thing that you really need to know. And that is the slow start process. Because just a second ago, when I established that connection with the server and asked for the picture of the cat, the server now has a problem. You see, it doesn't know anything about me. It doesn't know anything about how fast my connection is and how much data it can send me at once.
If it sends me the whole picture at once, that could wind up overwhelming my network, and all those packets would get dropped, and now I would have to re-request them. If it sends me too little at once, it's not using, you know, my pipes at full efficiency. And so what it does is it kind of dips its toe in the water. It sends out a little bit of data at first, maybe one kilobyte or something. And if I see that and I say, hey, that's great, I got that, the next time it'll send two kilobytes, and the next time four. It's going to keep going until packets start getting dropped. And then it's going to back off and say, okay, now I know what rate I can send data to this person at. You know, which is great. It makes sense. If you believe this graph, we just incurred 10 round trips to transfer about 250 kilobytes of data. And that's 400 milliseconds if you assume a 40-millisecond latency. Now, this is an example. It's a little bit exaggerated, but still, new TCP connections are super expensive. So we're going to focus on avoiding them, right? HTTP, even from version 1.0, has this concept called keep-alive. That is where the browser opens a single connection and then uses it for multiple requests. Like, I request a web page from a server. I make a connection to that server. And then that same TCP connection is used to send me the HTML. It's used to send me maybe some images, some other assets. And so we get to do multiple HTTP requests with only one startup penalty. But there's a couple of ways you can get shot in the foot with this, because your server controls keep-alive. And it's possible that, if you have a lightly loaded server and your users are staying on your site for a long time, with lots of back and forth between your users and your servers, perhaps you would want to extend the keep-alive period. If you have a really heavily loaded server, you maybe want to shorten that to reduce memory usage on your server.
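On the client side, connection reuse looks something like this sketch using Python's standard `http.client`: one `HTTPConnection` carries several requests, and only the first pays the connection-setup penalty. The host and paths here are placeholders:

```python
import http.client

def fetch_over_one_connection(host, paths, port=80):
    """Issue several GETs over a single keep-alive connection."""
    conn = http.client.HTTPConnection(host, port)
    statuses = []
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()        # drain the body so the socket can be reused
            statuses.append(resp.status)
    finally:
        conn.close()
    return statuses

# e.g. fetch_over_one_connection("example.com", ["/", "/index.html"])
```

Browsers do the same thing under the hood; the server side of the bargain is the keep-alive timeout discussed above.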
And Apache and Nginx behave sort of fundamentally differently when you have lots of connections open. So just be aware of that. That's why I'm giving you links and not telling you exactly what to do, because I don't want you to blame me when you screw it up. But here, I'll give you one actual concrete suggestion. Most network stacks have this feature called slow start after idle. What this means is that your network stack is going to monitor open TCP connections. And if it sees that a connection hasn't been used in the past half a second, one second, it's going to require that the next request that comes over that connection do this whole slow start process again. And that just defeats the whole purpose of your keep-alive. And so the first command here is to check whether you have this enabled, and the second command is to disable it. So try that out. Also, don't blame me if it goes wrong. So there's lots of ways you can tune your TCP stack. I'm not going to talk about any of them, because as long as you don't have a super-high-traffic site and you're running a recent version of the Linux kernel, you're going to do fine. They've done a really good job there. So at this point, you may be wondering, like, HTTP, WTF? Like, why is it so inefficient? And you've got to cut them a little bit of slack, because originally HTTP was intended to be sort of a protocol where you could telnet into a server, type in a request, and get back a resource. So we had very simple origins, right? You had this sort of single-request model. And now it's gotten a little bit more complicated, because now we have all these headers. We've got request headers that tell the server what it should send us. And we've got response headers, which tell us all the stuff about the stuff that was being sent. And we've got caching and we've got different content types and all that. Did you happen to notice that about three or four of those request headers were cookies?
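Back to that slow-start-after-idle tip for a second. On a typical Linux box, the check and the fix look roughly like this; the `net.ipv4.tcp_slow_start_after_idle` sysctl is real on Linux, but other operating systems spell it differently or don't expose it at all:

```shell
# 1 means the kernel shrinks the congestion window on idle keep-alive
# connections, forcing a fresh slow start on the next request.
sysctl net.ipv4.tcp_slow_start_after_idle

# Disable it for this boot; add the line to /etc/sysctl.conf to persist.
sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0
```
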
And just in case you didn't know, those cookies are going to be sent on every request you make to the domain that set them, whether that's an image request, a JavaScript request, CSS, whatever. And so if you have one kilobyte of cookies and you make 100 asset requests, you've just forced 100 kilobytes of probably useless data over the wire. Which on a desktop may not be a big deal. On a cell phone, it may be really annoying. So this is just one request. And normally, on average, web pages these days do 112 requests per page. This is what the internet has told me is true. And so now our job is to get this number down. So we already talked about headers. So we might as well throw in some browser caching there. Here's a link to a really good article about browser caching that you may want to look up later. It's on Heroku. And that's great for the second time you load the page. But the first time you load the page, you still do 112 requests, which is crazy. So we have these best practices now. We concatenate our JavaScript and CSS. This is built into Rails. And so we take 40 JavaScript and CSS files in development, and we squeeze them down into one or two files. And now we're doing great, right? Because we've gone from 100-and-something assets down to maybe 30 assets. I do kind of feel dirty about all this, because I've just glommed together a bunch of code that doesn't logically go together, and I've made it so that if I change one line in my JavaScript, my user has to download an entire, say, 100-kilobyte JavaScript slug again. But it's OK. But what I'm really upset about is that I still have 30 requests. I just can't get any lower than that. And I was really upset about this. And Twilight Sparkle came to me. And she said, I apparently find this a lot more funny than you guys do. She said, could you do this concurrently? Because you do get six connections per domain, and you do know how to make domains.
I bet you guys know how to make domains, too. And it turns out, yeah, you can. This is a technique called domain sharding. It's used a lot by sites like Flickr, where you have a ton of assets being loaded. Essentially, instead of loading all your assets from one domain, you make 10 subdomains that all point to the same IP address, and you load your assets from them. So instead of six concurrent connections, you get 60. And I mean, it's really an ugly hack, because it's not that great to have 60 open TCP connections in your browser, especially if you're dealing with a mobile device. But what the hell, we do it anyway. We do this with CDNs. This is from my own website. It's some requests we make. We probably use six or eight different domains. And we buy the CDNs for the geographic distribution, but we get the domain sharding for free. So a slightly less hacky way to deal with this is to move requests out of band. There are some interesting new attributes that you can use in your HTML to tell the browser to, say, pre-fetch images, or to pre-fetch a web page and pre-render it. Also, you can tell it to look up DNS records. But now this is really getting into perceptual optimization. It's really getting into DOM and JavaScript stuff. So I'm just going to wrap it up. Network performance. You want to move your servers closer to your users. So use a CDN. Check your server configuration files. Check your network stack settings to make sure that keep-alive is actually keeping alive. Don't use a lot of cookies. And just do what you've got to do. Swallow your pride. Concatenate those files. Shard them across domains. Just, yeah, embrace the dark side of the force. But don't worry, because the cavalry is on the way, right? We've got this thing called SPDY. It's designed to be a lot better at this lots-of-small-files problem, supported by, I think, the newest versions of all the browsers, although, as usual, IE really screws it up. The workaround is to disable SPDY.
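Going back to sharding for a second: the trick is just a deterministic mapping from asset path to subdomain, so a given asset always comes from the same host and the browser cache stays warm. A sketch, with hypothetical hostnames:

```python
import zlib

# Hypothetical asset hosts; in DNS they would all point at the same
# servers (or the same CDN).
ASSET_SHARDS = [f"assets{i}.example.com" for i in range(4)]

def asset_url(path):
    # Hash the path so each asset always maps to the same shard;
    # a random choice would defeat browser caching.
    shard = ASSET_SHARDS[zlib.crc32(path.encode("utf-8")) % len(ASSET_SHARDS)]
    return f"https://{shard}{path}"

print(asset_url("/images/logo.png"))
```

Rails has this built in: `config.action_controller.asset_host` accepts a `%d` wildcard, so your asset helpers emit sharded URLs automatically.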
But you still need CDNs to get the content closer to your users, because SPDY doesn't lower the speed of light or anything. And it requires SSL, which may be bad for some people. I don't know. And finally, our great hope is HTTP 2.0, which is coming soon. And hopefully within the next couple of years, we'll all be using that. And we'll be able to ditch a lot of these best practices that are really just workarounds for the protocol. If you want to learn more about this stuff, you should really buy this book, High Performance Browser Networking by Ilya Grigorik. It's available free online, I just saw that right now, so go check it out. He talks about everything I've talked about here, but in greater depth and more intelligently. And finally, if you have any questions, just feel free to come at me, bro. I'm super friendly. I'll give you a big hug, like this little guy. And yeah, so I'll be around. And that's it. Get the slide deck from my Twitters. Is that it? Yes.