All right. Hopefully that motivated, a little bit, why we need to think about the network and what you should think about when you're optimizing for it. As for specific tips: well, I could stand here for four hours, because I've been working on a book for the last year with O'Reilly called High Performance Browser Networking, which is everything we talked about here, except much more in depth, with tips for what you actually do to optimize TCP on your server. It turns out there are very simple things you can do to make the experience faster for your users. For TCP specifically, it's actually just: upgrade your server. That's the best thing you can do. If you have to run on a secure connection, or use TLS, there are a lot of things you can do, or, vice versa, ways you can hurt yourself if you don't do it right. The same goes for wireless networks, and for HTTP/1 and HTTP/2, and we'll talk about those in a second.

So, some examples. We covered a little bit about how the radio network works, or how wireless networks work, but there are specific techniques that you can use in this context to improve the performance of your applications. One example is battery life optimization. How do we optimize for battery life? Well, it turns out that on mobile networks you actually want to use techniques like data bursting and prefetching. Which is to say: let's say you're loading an app which has a list of articles,
and thumbnails with each article. Instead of saying, "I'm going to load the previews progressively as you scroll," well, that's actually an anti-pattern on mobile. It's costly in terms of latency, because every single time you do that you have to wake up the radio, and you incur the control-plane latency cost and other costs. But it's also very inefficient for battery power, because, recall, the radio is the second most expensive component on your phone. So what you want to do is actually prefetch everything up front, or as much as is meaningful for the application, and then hopefully turn off the radio and not touch it ever again. That's the ideal case.

Then there are things like periodic transfers, or beacons. This turns out to be a huge, huge problem. There's a great case study that was published between AT&T and Pandora. Pandora is a music app: you click on a song, it downloads the entire song, and starts playing it. It's not streaming the music, it's downloading the entire song, which is exactly what you want, because on a 3G or 4G network, having your radio stay on while you download the song is very expensive. So far so good, except that the Pandora app would then send an analytics beacon about every 60 seconds, basically just reporting: hey, did you like the song? How far along did you listen? Did you rewind?
And whatever other metadata. They ran the analysis, and they discovered that that beacon was contributing 0.2 percent of the bytes of the total transfer, but it was consuming 48 percent of the battery. So, simply by moving that beacon into a later phase (it wasn't critical, so they could defer it and say, "I'm going to send this data when I request the next song"), they batched these requests, and it's no longer an issue. They made their app a lot more efficient just by deferring these beacons.

For the web, things like real-time beacons for real-time analytics: awesome anti-pattern. You come to cnn.com, you start reading that great news story, and your radio is waking up every five seconds, sending a beacon saying, "I'm still here, I'm still here." I'm sure there's a beautiful vanity dashboard somewhere at CNN that says, "we've got a bazillion users on our site right now." In the meantime, they're draining the battery of all of our collective devices pretty fast. So there are simple things we can do to fix these kinds of performance problems.

For TCP and TLS, my quick tip is this: we can talk in depth about what you can do, but basically, if you just upgrade the kernels on your servers, you're going to get a lot of wins right off the bat. In fact, most of the things that I'm listing here are then taken care of. If you can't upgrade, we can talk in more detail later if you want to ask. TLS is a very complicated and interesting topic; depending on where you stand, it's either complicated or interesting. Some of these tips may mean a lot to you, some not. If you want to talk about TLS optimization, I'm happy to chat afterwards, but that's definitely a deep dive.

Then there's HTTP. It turns out that HTTP has a number of problems in itself.
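As an aside, the deferred-beacon pattern from the Pandora story can be sketched in a few lines. This is a minimal sketch; `BeaconQueue`, `track`, and `flushWith` are illustrative names, not part of any real analytics library:

```javascript
// Sketch: buffer analytics events locally instead of waking the radio
// for each one, and piggyback them on a request we must make anyway.
class BeaconQueue {
  constructor() {
    this.pending = [];
  }

  // Record an event without touching the network.
  track(event) {
    this.pending.push(event);
  }

  // Flush all queued events via `send` (whatever transport you use),
  // e.g. when the app requests the next song. Returns the batch size.
  flushWith(send) {
    const batch = this.pending;
    this.pending = [];
    if (batch.length > 0) send(batch);
    return batch.length;
  }
}
```

The point is simply that the beacons ride along with the next song request instead of waking the radio every 60 seconds on their own.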
So, HTTP was created in a world where we weren't building apps like today; we were building pages. It was a document: you fetch one document, you terminate the connection. That was the original model. Except today, a page, or an application, is composed of something like 80 resources. So there are inefficiencies in the protocol.

Things like concatenating files and spriting images. How many of you sprite images? You've got to be kidding me. So, spriting images is a hack. It's an unfortunate hack that we have to do because HTTP can't deliver the performance that we want. That is the reason we have to do it: small transfers are very expensive with HTTP and TCP today, which is why we say, "look, I'll just put it all into one nice bundle, one nice file, and that'll make stuff go faster." And indeed it does make stuff go faster, but it's painful.

Same with concatenating files, concatenating CSS and JavaScript files. It's a best practice that we have because of the limitations of the HTTP protocol, and these best practices actually have negative consequences as well. For example, with sprites: very large sprites also occupy a lot of memory on mobile devices, or on any device, when we decode the image, because we have to decode the entire image. We can't just say, "let me fetch this 16-by-16-pixel region out of your 1000-by-2000 grid of icons." That doesn't work.

Same thing for JavaScript and CSS. For JavaScript, it's not uncommon today to find files that are over one megabyte in size once they're concatenated (these are large applications that we're talking about), and the problem with that is that JavaScript is not parsed incrementally.
We have to wait for the entire file to be fetched, and only then can we execute it, which actually adds a lot of latency. If you just split that same file into, let's say, 10 chunks, we can execute it progressively, in small increments, and give you a better experience. For example, on Gmail today, when you type in gmail.com and you get that loading bar, that's exactly what it's doing. It's downloading a lot of JavaScript, but it's downloading it in chunks and saying, "okay, I'm going to execute this, then I'm going to execute this next part," and so on, so we can give you some visual feedback and also accelerate the loading progress. It's an unfortunate thing that we have to do this, but that's how it works.

So, we have this new, exciting project, which is HTTP/2. It's being standardized by the IETF right now, and the best news about HTTP/2 is that it'll allow us to undo many of the hacks that we've had to apply to all of our applications. So, now that all of you have sharded all of your assets, concatenated all of your files, and sprited all of your images: yeah, undo all of that. Well, it's actually more complicated than that, and we'll talk about it in a second, because HTTP/2 also won't happen overnight. We will have clients that are stuck on HTTP/1.x. So how do we bridge the two worlds? Because you don't want to hurt your 1.x users; they're probably the ones on a slower connection to begin with. So it's a complicated topic, and an interesting one too.

And then, finally, there's the application. All the stuff that I'm talking through here is covered in the book, and by the way, it's free, it's online.
You can read it, and please, actually comment on it; it's still an early draft.

Then there are things like XMLHttpRequest. We've been abusing XMLHttpRequest for a lot of things, like real-time streaming and all the rest, and it just wasn't designed for that sort of thing. So we have new and better APIs in the browser: things like Server-Sent Events, WebSocket, and even WebRTC. WebRTC is actually bringing UDP into the browser, something that I thought would never happen, but it's here. It's available in Chrome, it's available in Firefox, and you can have peer-to-peer communication between multiple browsers, which is amazing.

So, the network is the foundation of your performance strategy. It's very important to get right. Depending on where you sit, whether you're a designer, a web developer, or a server engineer, we need to have a mutual understanding of how the stack actually works and of the constraints that are imposed by the network, and based on that we can actually start designing smart applications.

So, I mentioned HTTP/2. Let's talk a little bit about it. I'm not going to go in depth on HTTP/2, but I just want to highlight a few things about what's new in it. "2" sounds like a big thing: are we going to replace all the angle brackets and demand that you use curly brackets all of a sudden? No. HTTP/2 does not replace HTTP.
In fact, it's a simple extension. The reason for the 2.0 is that we're redefining how the data gets transported on the wire. As far as your application is concerned, nothing has changed: your XMLHttpRequest code looks identical. But how the data is shuttled between the client and the server is different, which is why we need the 2.0: the two are basically backwards-incompatible on the wire.

So, the way this works: we have IP, we have TCP, we talked about all of these, and then we have HTTP sitting on top. The new component is the binary framing layer. The idea behind the binary framing layer is that we want to be able to split HTTP messages and deliver them across the same connection. Right now, if you want to transfer two resources at the same time with HTTP, we need to open two HTTP connections and transfer both files in parallel. With HTTP/2 we can do that over one single connection, because we basically take one entire message and subdivide it into little parcels: this chunk belongs to this stream, this chunk belongs to this other stream. Then we can just multiplex them over the same connection, which actually gives us much better TCP performance, better throughput, lower latency, and a whole host of other things. And it undoes all of these hacks that we have to do, things like concatenating files, because there is no overhead in making small requests anymore. You can send me a hundred small requests and I'll just send them all in parallel over one TCP connection. We don't need to open a hundred TCP connections.

One really cool feature of HTTP/2 is HTTP server push. The idea here is: hey, I've just sent you a request for your index.html file. You know what's inside of the index.html file: there's a logo icon, and a CSS stylesheet, and other things. So instead of me
getting that data back, parsing it, and then saying, "oh, by the way, also give me the stylesheet and these other things," what if, when you send me the index request, the server could push all of these resources to you at once and say, "look, you're going to need the HTML, but you're also going to need the JavaScript and the CSS and these five images"? This eliminates the extra round trips, which of course helps us reduce latency.

One thing to note here is that this is not an application API. This is not a JavaScript thing where you write a script that says, "give me a callback when the server pushes a resource." This is a completely different mechanism; it's lower level. In fact, it sounds kind of crazy the first time people hear it, but we already have server push: it's called inlining. So how does inlining work? We're saying, "look, I know you're going to need this icon file or this JavaScript file. You're going to ask me for it, and it's expensive for you to do so, because it's a very small file and we're going to incur the extra latency. So what I'm going to do is place this resource right into this file," for example, base64-encode an image into the file. I'm going to push it to you as part of the page. That is push: you're basically saying, "I'm inlining this resource for you."

Push does the same thing, except it doesn't make the resource part of the page. The problem with inlining is this: let's say you have a logo icon that you want to inline across all of your pages. The bad news is that it's now part of every single page. Let's say it was 5 or 10 kilobytes; you've now inflated the size of each and every page by that much. With push, you can push that one resource and say, "by the way, this is logo.png, put it in your cache."
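For concreteness, the inlining hack described above is just a data URI: the resource's bytes are base64-encoded straight into the page, so every page that inlines it pays its size again, unlike a pushed file that lands in the cache once. A minimal sketch, using Node's `Buffer` for the encoding:

```javascript
// Sketch: base64-encode a resource into a data URI, the "inlining" form
// of server push. The result goes straight into the page markup, e.g.
// <img src="data:image/png;base64,...">, inflating every page that uses it.
function toDataUri(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}
```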
So this is really, really cool, and I think we're going to see a lot of exciting stuff coming out of this. It requires server support, but it'll be great once we have it.

So, how do we use HTTP/2 today? Well, the short answer is that the spec is still in the process of being written, so it's not yet ready. But we do have SPDY. SPDY was a precursor, if you will, to HTTP/2. It's still available, and basically we treat SPDY as an experimental version of HTTP/2. It's a test bed where we test new ideas and experiment with them, and then we move them into the official HTTP/2 spec.

Today, SPDY is actually supported by Chrome (and Chrome supports it on iOS and Android, across all the different platforms), Firefox, and Opera. So that's well over half of the browser market that supports SPDY, and you can actually use it today. There are modules for things like Apache, nginx, Node, and others. Basically, any popular server today has the capability, or libraries, to speak SPDY. And the great thing about SPDY is that, taking Apache as an example, say your site is running on Apache: you literally add a module on your server, and the rest is taken care of. You don't really need to do anything to modify your application from there. Of course, you should modify your application to remove things like domain sharding and all the rest, because those things will actually hurt your performance with HTTP/2, but that's a separate story.

And of course, at Google, we've been offering SPDY for a while. If you're using Chrome today and using Google services, you're using SPDY, because a lot of our services run on SSL, and we use SPDY there. Twitter, WordPress, Facebook: they're all deploying SPDY to their users as well, and we see good latency and performance wins there.

So, some common questions that I get about HTTP/2. Do I need to modify my sites?
No, you don't; we already said that. But you can optimize your sites for it. What's the best optimization? The first one that you should start with is to unshard. If you're currently splitting your resources across many different domains, you want to undo that, or you want to have logic that can automatically figure out whether sharding should be applied for a SPDY connection or a non-SPDY connection, because sharding basically forces multiple TCP connections, which negates a lot of the performance benefits of HTTP/2. Server optimizations: we kind of talked about this already. There's a lot of TCP tuning that you need to get right to get good performance out of HTTP/2. And finally, this all sounds complicated, but it doesn't have to be: you can install simple modules and you'll have this capability right in your server. A cool little tip: if you're running on Google App Engine today and you just enable SSL on your application, you'll automatically get SPDY, and you don't have to modify your app. So that's kind of a cool feature there.

Okay. So, finally, let's talk about measurement. It's important that we understand how the network works and what its limitations are, but what's the problem to begin with? Should I be optimizing my TCP stack, or should I be profiling my JavaScript code? Every single application has a different bottleneck, so it's important that we have good tools to figure out where the problem is.

There's a great spec, supported across most modern browsers today, called Navigation Timing. How many of you have used Navigation Timing, or are familiar with it? Just a few hands. Okay, great.
So, Navigation Timing looks scary, this kind of scary diagram here, but basically what it's showing you is the full life cycle of the page. We covered all of this already: the DNS lookup, the TCP connection, sending the request, how much time it took to get the response. Each one of these labels is actually a timestamp, provided by the browser, that gives you low-level access to each one of these stages.

At a very high level, you can think of it as three clusters. One is the user's connectivity: depending on whether I'm on a Wi-Fi connection or a 3G connection, the time to do the DNS lookup, for example, will vary quite a bit. Then there's the server response time, which you can also figure out from this data. And then there's the in-browser execution time, which is how much time it took to load the JavaScript and all the rest.

The way to get at this data is to just pop open your console, whether that's in Firefox or Chrome, and type in performance.timing. You get back a JavaScript object which has all of these timestamps; each one of the entries here is the same label that we saw in the previous diagram. And you can tell that we're serious about performance, because each one of those timestamps is in microsecond, not millisecond, granularity.

So what do you do with this data? Well, it's available on each and every page load, and the important part here is that it's running in the user's browser. What does this mean? A user comes to your site, and the performance.timing object reflects their experience of your site: their DNS lookup time, their TCP connections.
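As a sketch, the three clusters can be computed directly from those timestamps. The function below takes an object shaped like the browser's `performance.timing`, so in a page you would call `summarizeTiming(performance.timing)`; the field groupings are the rough ones described above, not an official taxonomy:

```javascript
// Sketch: collapse Navigation Timing timestamps into three rough buckets.
function summarizeTiming(t) {
  return {
    // user's connectivity: DNS lookup and TCP connection setup
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: t.connectEnd - t.connectStart,
    // server response time: request sent until the first response byte
    ttfb: t.responseStart - t.requestStart,
    // in-browser time: response finished until the load event begins
    browser: t.loadEventStart - t.responseEnd,
  };
}
```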
This is not a synthetic test, where you say, "look, I'm going to have a couple of servers in North America, in Asia, and somewhere else pinging my site and figuring out how well it's responding." You're gathering data from real users on real networks, which is the real advantage here.

Once you have this data, you can just grab it and beacon it back. If you have an analytics server that you're using, you can report it to yourself and aggregate it. If you're using something like Google Analytics, we already collect this data for you, so if you have it installed on your site, you're already gathering this data, and if you just go into your dashboard and open the Site Speed report, you'll actually see some performance data on your site.

The one tip that I'll give you is that, by default, Google Analytics will only sample 1% of your visitors. This is just a default number, and we also have a limit of, I think, up to 10,000 samples per day. For example, my site is not a very high-traffic site, so I manually override the sample rate to say: gather the performance data from every single user, because I want to have a really good sample of data. I only have a couple of thousand visitors per day, so for me it doesn't matter. If you go into your Site Speed reports and you're not seeing a lot of data, just update this one variable in your configuration and you should be good to go.

And then you get something like this: you log into Google Analytics and you get a report that says, hey, there have been 6,000 page views, and the average page load time is about 10 seconds. And I'm sad to say that's actually my own site.
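As a configuration sketch, with the analytics.js snippet the sample-rate override mentioned above is a single field on the create call; 'UA-XXXXX-Y' is a placeholder tracking ID, and older ga.js setups used `_setSiteSpeedSampleRate` instead:

```javascript
// Configuration sketch: raise Google Analytics' site-speed sampling
// from the default 1% of visitors to 100%. 'UA-XXXXX-Y' is a placeholder.
ga('create', 'UA-XXXXX-Y', 'auto', {
  siteSpeedSampleRate: 100  // percentage of visitors reporting timing data
});
```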
So maybe I shouldn't be on stage talking about this stuff. But actually, this is a good point: that 9.7 seconds is a very skewed number, because you can see here that, for some reason, there was a 60-second page load time. Not a good experience, right? And the average is getting skewed by that sample.

So what I can do then is go into Google Analytics, or any analytics solution that you're using (this is just an example), and start segmenting the data. You have the user's IP address, you maybe have your application data, like a user ID or other things, and you can start going deeper and asking: was it the case that everybody was experiencing a 60-second page load time, or was it maybe a specific geographic region? In this example, I'm actually segmenting all of my traffic by geography: I'm looking at Singapore, San Francisco, and Japan. And it turns out that it was Singapore specifically. Users coming from Singapore just couldn't load my page; they were stuck there, spinning, for 60 seconds. I later tracked this down to one of the social widgets that I had on my page, which just wasn't loading in Singapore, and it blocked the render of the page on my site. Problem fixed afterwards. But it didn't affect everybody, and frankly, I would never have discovered it unless I had actually gathered this data from real users.
Because if I had just had a synthetic server pinging my site from London and New York, I would never have caught this.

And then, finally, I guess a really important point to make here: averages for performance data are just misleading. If you're tracking average response time, average latency, and other things, that is the wrong metric to use for performance data. What you want to use is something like a median, and, even better, look at the actual distribution. For example, here's kind of a silly example, but here's a long-tail distribution. What is the mean value of this distribution? It's somewhere right here, but that's kind of a meaningless number.

Let me show you this as an example. In the same Google Analytics reports, we also give you the actual histograms of the response time. In this case, I'm showing you the page load time split by buckets: how many people finished loading the page in less than one second, in one to three seconds, in three to seven seconds, and so on. So you can see this hump here, and then there's a long tail. On the right is a comparison where I've actually upgraded my site, made it faster, and you can see that the whole distribution shifted: more users are loading the pages faster, which is exactly what you want to see. But there are still outliers. For whatever reason, there are still 4% of users experiencing 60-second-plus page load times. I'm not sure why; I need to track that down.

And here's a really good example of why averages are so misleading. This is a different metric: this is server response time. Look at this number here.
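A toy illustration of the point, with made-up numbers: a bimodal response-time distribution (fast cache hits, slow misses) yields a mean that almost no user actually experienced, while even a crude histogram exposes both humps:

```javascript
// Sketch: mean vs. histogram on a bimodal latency sample (numbers invented).
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Count samples into fixed-width buckets, keyed by bucket start (ms).
function histogram(xs, bucketMs) {
  const counts = {};
  for (const x of xs) {
    const bucket = Math.floor(x / bucketMs) * bucketMs;
    counts[bucket] = (counts[bucket] || 0) + 1;
  }
  return counts;
}

// Two humps: ~100 ms cache hits and ~1.5 s cache misses.
const samples = [90, 100, 110, 100, 95, 1500, 1600, 1550, 1500];
```

Here `mean(samples)` is about 738 ms, a latency almost nobody actually saw, while `histogram(samples, 500)` shows five requests in the 0-500 ms bucket and four in the 1500-2000 ms bucket.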
So, on my site, I was running a WordPress blog, and I had caching enabled, as any good WordPress site should. Most of the time, for about 40% of requests, the pages load really fast. But then there's a second hump here: some pages take one to two seconds to load. And the reason for this is that some of the posts were in the cache, so I was able to serve them very quickly, but whenever I missed the cache and had to go to the database and render out the whole thing, it would take one to two seconds. So you look at this distribution and ask: what is the average? Well, I'd say the average is pretty much meaningless here, because what's actually happening is that there are two completely different distributions of users. There are the fast users, who are experiencing the cached load times, and then there are the slow users, and in this case there are a lot of slow users, over 30 or 40% of users. And then, once again, I upgraded my server, made everything much faster, and now, let's see, over 90%... is that right? Yes, over 90% of all users are getting their pages loaded in less than 500 milliseconds. So this is why you want to look at histogram data for all of your performance metrics.

So: measure your user-perceived network latency with Navigation Timing, if you have not already. If you don't have an analytics solution that can deliver this today, there are a number of them available. Of course, I mentioned Google Analytics, but there are a lot of third-party solutions you can install on your site as well, specifically for collecting RUM (real user monitoring) data. But of course, I find that the real power of having this performance data is that you can then segment it and intersect it with other metrics in your current analytics solution. Things like: what user type is this? Are they on a tablet?
Which geography are they coming from? And if you also have your revenue data, let's say you're selling widgets on your site, you can actually ask: what is the revenue per user for users that are experiencing a five-second page load time? You can compare those segments, look at them side by side, and I think most of the time you'll find that there's going to be a big negative number associated with the five-second-plus audience. So that's also a great way to motivate your company, to say: we need to invest in performance. Because I can be up here on stage going "rah-rah, performance," but if you can't connect it to the bottom line, the reaction is: why does that actually matter for my organization? We've got a lot of other things to worry about; I need to build that new whiz-bang feature that supposedly all the users want.

So, use advanced segmentation, and set up weekly reports. I just have Google Analytics email me a report every Monday, actually, so I got one yesterday, which just says: here's what happened last week. And I can look at that histogram and say, okay, there's some outlier in this specific region, I can look into that, or, for some reason, my latency is spiking.

So let me pause here for one second. We covered a lot of stuff about the network. Do you have any questions before we go any further?

Yep. So, the question is: if you want to implement SPDY, and you have an Apache server, and in front of the Apache server you have an nginx server, how does that actually work?
So, that's a hard question. Both Apache and nginx support SPDY, but only one of them can terminate the TLS connection. First of all, practically speaking, today you need a secure connection, a TLS connection, to run SPDY, for reliability reasons: there are a lot of intermediaries on the web, things like caches, which (sorry, I'm losing my voice here) don't understand SPDY, and they may fail the connection. So, part one: we need SSL. Then, once you have SSL, you can have, for example, nginx terminate it. And once the connection is terminated, nginx will actually transform the HTTP/2 (SPDY) requests into HTTP/1, so it will send regular HTTP/1 requests to your Apache server. Your Apache server doesn't need to be aware of HTTP/2 at all, so that's probably the simplest way to do it. The other way to do it would be to just turn nginx into a dumb TCP router, but that's probably not what you want to do. So what you want to do is terminate the TLS connection at nginx, and nginx will be smart enough to convert the HTTP/2 requests into HTTP/1 and send them to your Apache server, or any other server. If you have a Java backend, a Ruby backend, or a Node backend, and it understands and speaks HTTP, it'll just accept that request and send the response back to nginx, and nginx will re-encode it into HTTP/2. Thank you. Yeah, so that's probably the simplest way to get started.

Any others? So: how will HTTP/2 affect mobile battery life? Hopefully it'll make it better. One of the problems today with a lot of HTTP connections, for example, is closing those connections. Oftentimes what happens is: you fetch a resource, like an image file, from your CDN, your phone goes to sleep and turns off the radio, and then 15 seconds later that connection needs to be terminated. So we wake up the radio just to send the FIN packet, one tiny bit of data saying, "I'm going to close this connection," and that drains the battery.
By having one connection, we can both deliver better throughput and avoid having to close as many connections, so I think it actually works out to be better in the long run.

No, no, okay, so this is a great point. Just the fact that you have an open TCP connection does not mean that you need to keep your radio on. This is a very important point. So, what happens is, let's take a simple example: your router at home. Your router is the one that terminates the connection when it comes in from the web, and then your router forwards the packets to your laptop. Similarly, in a wireless or mobile network, the network will terminate the TCP connection, and it keeps that connection open. Then the tower can tell you, "hey, turn off your radio, because there are no packets coming to you at this very moment, and if a new packet comes in, I'll tell you to wake up and you can resume the connection." So the physical connectivity is not correlated with TCP connectivity. And this is a great point, actually, because very frequently I find frameworks, JavaScript frameworks and other things, which have specific code that says setInterval, one second, or whatever: "just ping the server, because I want to keep the connection open, because otherwise bad things will happen." No, no, no, you don't need to do that. The radio network will take care of that for you. Your phone is smart enough to wake up when it needs to, without terminating the SPDY connection.

How does HTTP/2, or SPDY, compare with HTTP/1.1 with keepalives? So, the question is how HTTP/2 or SPDY compares with HTTP/1.1 keepalives in regard to the number of open sessions. Right. So, with HTTP/1, most browsers have a limit of six connections per host.
And ideally, all of those connections are long-lived connections, because you want to grow your bandwidth and all the rest. That's keepalive at work: with previous versions of HTTP, you needed a new connection for every single request. You send me a request for an image, I terminate the connection, and then you restart. Keepalive allows us to reuse that connection. HTTP/2 is much more efficient in that it also allows us to send multiple requests in parallel. So these are independent things. With HTTP/1 keepalive, let's say you only had one connection and you want to send me five requests: you send a request, and you wait until I give you the full response back, and only then do you send the next request. So it's serialized, and that's not good for latency, for obvious reasons. With HTTP/2, we can say: one connection, here are all five requests; server, you determine the best way to send me all the data back. For example, maybe you want to send me the HTML data back sooner than the image bits, because it's more meaningful for me to start constructing the page; I can't display images until I have the HTML. So these two things are independent.

With SPDY or HTTP/2, do you have to configure all the resources for a page, or does it parse the HTML and read the resources? Are you referring to server push?
Yeah, so server push is a really interesting area that still needs a lot of research. If you look at the actual specification, the way it's written, it says push is possible, but it doesn't give you any algorithm for determining how the push should be made. So this is something that servers, or your applications, can innovate on top of; this is just a basic building block. As an example, the Jetty server guys have implemented a cool algorithm where the server looks at the requests. You send me the index.html request, and within the index.html there's a bunch of images that need to be requested. When the browser sends the requests for those images, it also sends the Referer header, saying, I'm requesting this image from this page. The server then aggregates all this information over time and builds a relationship map, to say: whenever somebody asks me for index, they also later ask me for the logo and a CSS file and a JavaScript file. And then the server can automatically figure out which resources to push. This is the kind of auto-magic example, if you will. A more hands-on example would be to grab a low-level server, for example the node-spdy implementation; there's an actual API that just says push this resource, so you can have really tight control over which resources you push. Another thing to mention is that push is also not free; you need to be very careful how you leverage it: if I already have the logo.png in my cache,
I don't need it, right? So then we need to figure out an efficient way to determine which resources to push and when. Maybe that's a cookie, maybe some other mechanism. Basically, what I'm saying is there's a lot of room for innovation here. Different servers are approaching it from different angles today: you can have a hands-on strategy, you can have an automated strategy, and everything in between.

Okay, so what do other browsers do if you want to talk SPDY and they don't support SPDY?

Right, so SPDY negotiation happens during the TLS handshake. If your client doesn't support it, it'll just fall back to HTTP/1 without any extra penalty. Basically, what happens is, when the client first sends the TLS handshake, it also advertises the fact that it supports SPDY, and then the server can opt in to use SPDY or not. If the client is not aware of it, it just wouldn't advertise it, and the server would say, great, I'll fall back and use HTTP/1. Which is why, for example, the nginx and Apache modules work transparently: you just drop them in, and the server itself determines, for this Chrome client I'll use SPDY; for this IE client I'll have to use HTTP/1.1 in the meantime. So that's a nice thing about it.

Does SPDY benefit RESTful services? Can you elaborate on that a little bit more?

So, let's see, that's more of an application concern, right?
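Coming back to the Jetty-style learning strategy for a moment, the Referer aggregation it performs can be sketched roughly like this (hypothetical class and method names; the real Jetty implementation differs):

```python
# Sketch of a "learning" push strategy: aggregate Referer headers over
# time, then push the resources that historically follow a page request.
from collections import defaultdict

class PushMap:
    def __init__(self):
        self.related = defaultdict(set)

    def observe(self, path, referer=None):
        """Record each request; sub-resources carry the referring page."""
        if referer:
            self.related[referer].add(path)

    def resources_to_push(self, path):
        """When `path` is requested, push what usually follows it."""
        return sorted(self.related[path])

pm = PushMap()
pm.observe("/index.html")
pm.observe("/logo.png", referer="/index.html")
pm.observe("/app.css", referer="/index.html")
pm.observe("/app.js", referer="/index.html")
print(pm.resources_to_push("/index.html"))  # ['/app.css', '/app.js', '/logo.png']
```

A production version would also need the cache-awareness discussed above, so it doesn't push resources the client already holds.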
So it should make it more efficient. Actually, one example I'll give you: it turns out that most HTTP requests have high overhead. Even if you omit the cookie data, an average HTTP request adds about 800 bytes of metadata. So let's say you want to send a tiny little JSON payload, message: hello world, 16 bytes of data; on top of that we'll wrap it in a nice package of 800 bytes of HTTP metadata. Things like: here's the user agent string, here's the Referer header, here's everything else. So there's a lot of overhead associated with each request. With HTTP/2 we actually have header compression, which is to say all of that metadata will be compressed, so many fewer bytes are transferred. That's one example. You can also have multiple requests going over the same connection, but otherwise this is effectively transparent to you.

Sorry, he's asking about SPDY on Android WebView, so the answer is no. The question was, does Android WebView support SPDY, and the answer today is no. If you look at the Chrome repository, we're working on a new project, ChromeView, which is a Chrome-powered Android WebView, if you will, that will support SPDY. But it's still in the early, experimental stages; something you can check out.

Sorry, I'm not sure. Oh, yeah, I'm not sure what the limits are on WebView. Most browsers, mobile and desktop, have this limit of six connections for HTTP/1, and that's kind of an empirical number that we arrived at. One of the reasons for this is that some routers are not very well designed, let me put it that way, and they start randomly dropping connections after we send too many requests. So all the browsers picked the lowest common denominator, unfortunately, which is six. And actually, this is kind of a fun fact: in Chrome 27, which is the latest release, just shipped last week,
we actually changed our connection logic to say: yes, you can have six connections, but we will only download 10 images at once, no matter how many connections. We found through our testing that a lot of sites were abusing domain sharding: they were trying to download way too many images, and those images would saturate your bandwidth and not allow us to download the JavaScript and HTML and other things fast enough. By imposing this limit of 10 image requests, we were able to get faster rendering performance, which is kind of counterintuitive. But what it means is that if you're sharding images today across n domains, with at most 10 parallel requests and six connections per hostname, two hostnames already give you 12 connections, more than enough; so you should be using at most two separate domains. This is only Chrome today, but we found that this was a nice win in terms of visual rendering performance.

Yeah. All right, anybody else? Are we going to have to change our application servers to support pipelining?

The answer is, it depends on your application server. Most likely yes, because most of the application servers built today are not built with the assumption that you can use pipelining. So let me clarify that: HTTP/1.1 theoretically supports this pipelining idea; in practice it just hasn't worked out, it's not really deployed on the web, so all of our requests are sequential. Alongside things like server push and the ability to push multiple streams, we also have this idea of priorities in HTTP/2, so when I send a request to your server I can say: this is a very high priority request, it's a JavaScript file. You know, I just sent you five image requests, but don't worry about those; I just discovered a JavaScript file which I need yesterday, right? With priorities, your server can now look at this and prioritize how it processes those requests, right?
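A server that honors those priorities behaves, in spirit, like a priority queue over pending streams. A minimal sketch, with hypothetical paths and priority numbers (lower number = more urgent; real HTTP/2 priorities are richer, with per-stream weights and dependencies):

```python
# Sketch: serve pending requests in priority order, so a render-blocking
# JavaScript file discovered late still jumps ahead of queued images.
import heapq

def serve_order(requests):
    """requests: list of (priority, path); lower number = more urgent."""
    heap = list(requests)
    heapq.heapify(heap)
    out = []
    while heap:
        _, path = heapq.heappop(heap)
        out.append(path)
    return out

order = serve_order([
    (3, "/img1.jpg"), (3, "/img2.jpg"), (3, "/img3.jpg"),
    (3, "/img4.jpg"), (3, "/img5.jpg"),
    (0, "/app.js"),  # discovered last, but it blocks rendering
])
print(order[0])  # '/app.js'
```

Even though the JavaScript request arrived last, it is served first; the five images fill in afterward.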
So once again, there's a lot of interesting innovation that will have to happen on the server side; we're shifting a lot of the hacks and workarounds out of the browser. I'll give you an example: today, a lot of the browsers play games with sending requests. Let's say we're parsing an HTML file and there are 80 resources on it, but we can only have six requests going at once. We discover a bunch of images at the top of the file; should we send those immediately? Well, we don't know, right? Because if there's a JavaScript file later which is actually blocking your rendering, then maybe we're better off waiting until we discover it. But we don't know, because we haven't parsed that far. So what should we do? Well, we start playing games: often, even though we have the resource, we defer it and say, I'm going to wait, because I'm not sure. And that creates additional latency. With HTTP/2 we can get rid of all that logic and just send everything we want to the server, but this means that the server needs to be much smarter now. It can't just say, here are all the bytes that you asked for, because that could be a large JPEG which is not helping the user.

It looks like we have about seven minutes before the break, so this is probably a good place to stop. If you guys want, we can ask a few more questions, and then we'll continue after the break. Yeah, I'll take that as a yes.

Yep, sorry, can you try that again? So, using push, right? Yes. Yeah, so the way push works is, you send me an index.html request, right?
I send you the response for index.html, and I also send you the associated resources. Each one of those resources, just as if you had made an HTTP request for it, goes directly into your cache, such that later, when the browser asks for it, it just pulls it straight out of the cache. So all the same logic applies: you can have cache-control headers, you can have many other things. In fact, a kind of fun, not really explored area right now is this: if I can push you resources, that also means I can invalidate things in your cache. That's kind of a crazy hack optimization. Let's say I told you to cache my application's JavaScript code for a year. Now it's sitting in your cache, but I have an update. Well, today there's no way to invalidate that. With push, I could actually create a fake request and send you a response to it that says: invalidate this. I'm not sure if that's actually supported by the browsers, but it's supported by the spec. So I think HTTP/2 will open up a lot of interesting innovation here. The implementations in nginx and Apache today are fairly simple; we're getting the basics right, like here's how the framing works, here's how the bits are laid out on the wire, but these more advanced use cases are something we're still playing with.

Is it cross-browser? Let me see if I can pull this up here and see if the Wi-Fi gods are with us. Speaking of unreliable performance... Here you go. So: IE9+, Firefox, Chrome, Android. The notable omission today is iOS and Safari; I hope they implement it soon. This is an official standard now, a W3C standard.
So there's great adoption for it. Yeah, and I've personally opened, I think, three bugs on the Safari tracker to say, hey, where's Nav Timing? And the response is, closing as a duplicate of blah, and I can't look at blah, so I can't tell you. But hopefully soon.

What is the best way to implement real-time delivery, like notifications, on mobile?

So that's a fun question. It's hard. First of all, real-time has many different meanings for different people. Sometimes it literally means, I have a notification and I need to deliver it within X number of milliseconds; that's the SLA for my application. For others it means, I just need to send the notification within a minute. So what you don't want to do, unless you absolutely have to, is wake up the radio by pushing spurious updates. If you have some ability to batch updates, that's the best way to do it. So for example, let's say your application is emitting update events every, I don't know, 20 seconds, right?
There's a lot of activity going on, but the user doesn't actually need an update every 20 seconds. You can batch those and deliver them every two minutes, or every minute. Or you can look at the battery life on the device and say, hey, the battery is really low, and the user will probably appreciate it if I start sending these updates less frequently. So it requires a little bit more logic. If you look at services like Google Cloud Messaging, all of the platforms, iOS and Android, have services which allow you to do efficient push. The way this works is, for example, Google Cloud Messaging actually knows when your device is on. So it can be smart: you push your message to the Google server, it buffers it, and you can set a flag on it. The specific flag is delay_while_idle, which means that if the device is idle, don't wake it up. This is, you know, a cool notification about the latest NBA scores or whatever, but this user doesn't care that much; when they turn on the phone, get it down to them, but don't wake up the radio. And the Google server is smart enough to do that for you. You can also set things like a time-to-live on a message, to say: well, I sent you the score, but if the user doesn't check in within 60 minutes, that's old news, so just drop it on the floor; we don't need it anymore. So it's a combination of these types of services that you can use to do really efficient push. And if you can't use a service like that, then you can build smarter application logic: can I batch this request, can I have adaptive intervals? And even deciding which is the best strategy is an interesting question in itself. Sometimes it may actually be more efficient to poll for updates, because there's a lot of coordination cost between, should I batch this or should I not?
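The delay-while-idle and time-to-live behavior just described can be modeled roughly like this (hypothetical field names that mirror GCM's delay_while_idle and time_to_live flags; this is a behavioral sketch, not the GCM API):

```python
# Sketch: decide which queued messages to hand to the device right now.
def deliver(messages, now, device_idle):
    """messages: list of dicts with 'sent_at', 'ttl_s', 'delay_while_idle'.

    Expired messages are dropped; idle-deferrable ones wait for the device
    to wake up on its own rather than forcing the radio on.
    """
    out = []
    for m in messages:
        if now - m["sent_at"] > m["ttl_s"]:
            continue  # old news: drop it on the floor
        if device_idle and m["delay_while_idle"]:
            continue  # don't wake the radio for this
        out.append(m)
    return out

msgs = [
    {"id": 1, "sent_at": 0, "ttl_s": 3600, "delay_while_idle": True},
    {"id": 2, "sent_at": 0, "ttl_s": 60, "delay_while_idle": False},
]
print([m["id"] for m in deliver(msgs, now=120, device_idle=True)])   # []
print([m["id"] for m in deliver(msgs, now=120, device_idle=False)])  # [1]
```

Two minutes in, the 60-second-TTL message has already expired, and the deferrable one is only delivered once the device is awake anyway.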
Let's say you have a lot of updates coming in, once again, every 20 seconds. It may simply be more efficient for the client to poll the server once every two minutes instead of holding an open connection. But at the end of the day, it doesn't matter how much data you send, one byte or one megabyte: you will wake up the radio, and the radio will be on for about 10 seconds. This is, I guess, one thing that didn't make it into the slides, but whenever you turn on the radio, it stays on for about 10 seconds, whether you sent one byte or one megabyte. So if you're going to transfer data, transfer as much of it as you can and then turn off the radio. Don't trickle the bytes by saying, here's a preview of this image, and 10 seconds later, here's a preview of the next image. That's actually an anti-pattern.

Actually, let me go back. I'll share the link to the slides; there are a lot of links embedded at the bottom if you're interested specifically in mobile. I gave a talk at Google I/O last week specifically about mobile performance, from the radio up, which is to say: how does Wi-Fi work, how do 3G and 4G work, and what are some specific strategies you can use to optimize your application for battery performance, latency, and the other constraints. So you can check that out later.
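The poll-versus-push trade-off above can be put in rough numbers, using the intervals from the talk and the assumed ~10-second radio tail (an illustrative model, not measured data):

```python
# Sketch: radio-on seconds per hour, comparing pushing every update as it
# happens versus polling on a fixed interval.
RADIO_TAIL_S = 10  # assumed high-power tail per radio wake-up

def wakeups_per_hour(interval_s):
    return 3600 // interval_s

def radio_on_seconds_per_hour(interval_s, tail=RADIO_TAIL_S):
    # Each wake-up keeps the radio on for the tail (capped by the interval).
    return wakeups_per_hour(interval_s) * min(tail, interval_s)

print(radio_on_seconds_per_hour(20))   # 1800: pushing every 20 seconds
print(radio_on_seconds_per_hour(120))  # 300: polling every two minutes
```

Under these assumptions, pushing every 20 seconds keeps the radio on for half of every hour, while two-minute polling cuts that to five minutes, which is why batching or polling can beat eager push for chatty apps.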