So I hope you are ready with energy for the talk. But if you want a nap, that's fine. I don't have a problem with that. So we're going to talk about web performance. I will start by presenting myself really quickly. Maximiliano is my name; Max is fine. I'm @firt on Twitter. I'm a mobile web developer; I have been doing web development since 1995, a long time ago, using FrontPage, for example. And as we mentioned before, I've been traveling a lot doing consulting and training in 66 countries. This is my first time here in Serbia. I've also been doing training and consulting for a lot of companies. And you can find content and trainings from me on different providers, such as LinkedIn Learning, previously known as Lynda, or Safari. I also have a couple of books. There is kind of a trick there: these are the original books, and those are translations into languages that I don't understand. The last two books are basically on the topic that we will be covering in the next 35 minutes, probably 34 now: performance, web performance. How many of you are currently creating websites or web apps? Perfect. And how many of you are creating fast websites? OK, only 10%. OK, let's see. We are going to talk about that: how you can hack web performance, how we can push it even further. That's the whole idea of this talk. So I have two goals for this talk. The first one is to show you new tricks that you can start applying in your own websites or web apps pretty soon. And the other one is to make you feel bad about what you are doing right now. So let's start with that. I know that you know that your users are currently not happy. And probably you are losing money. You, or your customers, or your clients are losing money. Why? Because of performance, because the web is slow. Your website is slow. I'm sorry to tell you that. So, some quick information about this.
A half-second delay increases user frustration by 26%. OK, that's a lot. A half-second delay also reduces Google's traffic, and that's money for Google, in terms of ads, for example. A delay of only 100 milliseconds will decrease Amazon sales by 1%. There is a lot of information and data out there on this problem. And I'm pretty sure you know that today we are not talking about page load as a metric. There are new metrics that are user-centric: not JavaScript metrics, but metrics from the user's point of view. We have first meaningful paint, which is when something meaningful appears on the page, not just a line or a background color, something that the user's brain can actually start parsing. Then we have first interactive, because it's not enough to see something on the screen; we should be able to interact with it, for example by scrolling or touching. Sometimes you see a page, but it's not working yet. And visually complete is, finally, when you are seeing the actual content. None of that is page load. And I'm sure that you are also creating your own custom metrics, because that's really important. What is important for your users? What are you expecting your users to do on your website or web app? Then you measure that. So I know you know that performance is important, and we don't need just metrics. We need goals, because how fast is fast? You might say, oh, my website is loading in 15 seconds. Really fast, right? Well, not really. So we need to set our goals. Currently, the goals in the market are around this: first meaningful paint should happen between one and three seconds. The user is typing a URL or clicking on a link, and after a maximum of three seconds, they should be seeing something meaningful on the page. And that means not on your computer over Wi-Fi; that means in any context, including, let's say, a low-end device on a slow 3G connection.
First interactive has one more second to make that content interactive, so up to four seconds. Now you need to think about this and see how close you are to these goals today. I know that you're already doing a lot of stuff, like optimizing the network transfer, for example enabling gzip on text-based files on your server. You are working with TLS and HTTP/2. You are using CSS as an appetizer. We're here in Serbia, where an appetizer is like a big plate, but typically the appetizer should be small. So CSS should be small, and you should start with it: CSS should be the first thing you deliver to the browser, because CSS will block rendering. Keep it small, like an appetizer. And JavaScript is dessert, at the end. We need to defer JavaScript as much as possible. JavaScript is your baklava, basically. You are already optimizing images, picking PNG or JPEG and optimizing them. You have an HTTP cache policy. You are using service workers, right? Everyone is using service workers, right? And you should avoid redirects. That is when you go to a website, and that website says, no, you know what? It's not here, you need to go there. And you go there, and it says, no, it's not here either, you need to go over there. For example, you type a URL, and first you are redirected to HTTPS, which is fine. But when you get there, it says, oh, you're on a mobile phone? You need to go to m-dot. Then you go to m-dot, and it says, oh, you want the home page? You need to go to m-dot slash home page. All those redirects are basically wasting time, so you are not doing that. I know these are basic web performance techniques and you are all doing this stuff. What's the problem? The problem is that even with all those techniques, the average time to load a mobile landing page today is 22 seconds. That is really far away from our goals. And this is from Google, by the way.
And if a page takes more than three seconds to load, more than half of the users will start leaving at that point. So you are losing users, you are losing conversion, you are losing money. Because we are not doing web performance for the sake of web performance, just to measure it with a clock; we are doing it because of conversion. We want conversion on our websites. That's why we do web performance. And one of the problems is that we have always been underestimating mobile, since the beginning. For example, today you might say that mobile is iOS and Android. But from a web point of view, is it Safari and Chrome only? The answer is no. So if you test your website on mobile browsers, it shouldn't be just Safari on iOS and Chrome on Android. There are a lot of browsers out there: Samsung Internet Browser, UC Browser, Opera Mini, Chrome on iOS (which is not actually Chrome). These browsers currently have good market share, more than 3% each; that means millions of users. And we also have what is now being called the Facebook Mobile Browser. What I'm saying is, when a user is browsing Facebook and clicks a link, by default that link is not opened in the browser but inside an in-app browser. That in-app browser uses a different engine, both on iOS and Android. Have you ever tried your website there? How does it look? How does it work in terms of performance? Is it the same as Safari or Chrome? It's not. A lot of users are in that situation. And when we are talking about the mobile web, which today is more than half of our users (in some situations it can be 70% or 80% of your users on mobile browsers), we are typically on cellular networks. And I know what you are thinking: oh, we have 4G now, so we don't need to worry about performance. We shouldn't need to worry.
But to be honest, if you look at the data, and this is from a couple of months ago, less than 30% of users worldwide are currently on 4G. That's not much. And even when you are a lucky 4G-device user, a lucky 4G-data-plan user, a lucky user who is currently in a city with 4G coverage, even in that situation, 10% of the time you are downgraded to 3G. It happens a lot. I'm currently on 3G here; I'm roaming. Even with an iPhone X, I'm on 3G. And it's not just because this is Eastern Europe. I've been in San Francisco, in the middle of Silicon Valley; I was driving about half an hour north, and I was on 2G. 2G in the middle of Silicon Valley. So this happens a lot to your users, and you need to be very careful about that. And when we are talking about 3G, 4G, or even 5G, typically your brain thinks about bandwidth. Oh, 4G is faster. Yeah, it's faster. But the problem is not the bandwidth; the problem is the latency. Think about it: if you have a 50-megabit-per-second connection at home and you upgrade to 100 megabits per second, are you browsing the web faster? Probably not. It's better for Netflix, it's better for YouTube, it's better for downloading big files, but not for browsing the web, because when we browse the web, we are just downloading a lot of small files. And the latency, if you look there, on 2G is really big; it can be up to one second. That's the round-trip time it takes to get data from the server. 3G is better, up to 450 milliseconds. 4G is even better, around a quarter of a second. But if you compare that with a home DSL connection, or an office DSL or cable connection, 4G has 10 times the latency of your home connection. So even those lucky 4G users have big latency. That's the problem, and that's why we need to push web performance even further. We need to hack web performance.
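To see why latency dominates over bandwidth for small files, here is a toy model (the function and all numbers are my own illustrative assumptions, not figures from the talk): fetching a resource costs several round trips plus transfer time, and for small files the round trips dominate.

```javascript
// Toy model: time to fetch a small resource over a network.
// All values are illustrative assumptions, not measurements.
function fetchTimeMs(sizeKB, rttMs, bandwidthMbps, roundTrips = 4) {
  // roundTrips: e.g. DNS + TCP handshake + TLS + HTTP request/response
  const transferMs = ((sizeKB * 8) / (bandwidthMbps * 1000)) * 1000;
  return roundTrips * rttMs + transferMs;
}

// A 50 KB file on 4G (~100 Mbps, ~250 ms RTT per the talk's figures)
const on4g = fetchTimeMs(50, 250, 100);
// The same file on home cable (~50 Mbps, ~25 ms RTT)
const onCable = fetchTimeMs(50, 25, 50);

console.log(Math.round(on4g), Math.round(onCable)); // 1004 108
```

Even though the 4G link has double the bandwidth in this sketch, the fetch is roughly ten times slower because the four round trips dominate; that is why cutting round trips matters more than raw speed.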
So I have here a couple of tricks that you can start analyzing and using in your own websites or web apps to push web performance further. We're going to start with the first load: how we can hack the first load, the first time the user accesses your website and it is being downloaded from the server. We should try to avoid more than one round trip. This is a very extreme case, okay? But it's interesting; I have done a lot of consulting, and we applied this trick, and it's incredible how fast you can get the first load. So, in TCP, and I'm not getting too deep into TCP and networking, there is something known as slow start. Even if you're not a network engineer, you know that on the internet, data is split into packets, TCP packets. The question is how much data can be sent at once. If we can fit our website into that first window, it will arrive faster because it won't need another round trip; that basically saves us the latency of a full round trip. There is something known as the initial congestion window, which on Linux-based servers is around 14.6 kilobytes. In fact, it's kibibytes, and if you don't know what a kibibyte is, Google that later. So, 14.6. If you can fit your first response from the server, and this applies to the first response only, into 14.6 KB, it will arrive in one round trip. You might say, oh, 14 KB is not much, but we are gzipping our files, so that's around 70 KB of uncompressed HTML, which, for example, for the viewport of a mobile device, may be enough if you can fit the HTML there. CDNs are also playing with different values for the initial congestion window, so you can see a couple of experiments going on around this idea. So the idea is to deliver the ATF content. What is the ATF content? Above the fold. Above the fold is the content that you see in the first screenful, before scrolling.
So if you can fit the above-the-fold content of your website or web app on a mobile device into 14.6 KB, gzipped, you will render that content as soon as possible. You can also embed the CSS and JavaScript needed for that first initial load only. And if you still have space, you can embed the logo. I'm not talking about embedding an img tag with a src of logo.jpg or logo.png; I'm talking about using Base64 and a data URI, because we want to embed everything in that one HTML response. Another thing you can do right now to increase web performance is to avoid the HTTP-to-HTTPS redirect. You should know, and I'm pretty sure you do, that you need to move your websites to HTTPS; it's mandatory today. Every website should be on TLS, so HTTPS. The problem is that when the user types the URL, they are not typing https://, so the browser goes to HTTP by default. So we have a redirect there that can take around 300 to 500 milliseconds. And we have a tiny budget, so maybe half a second is a lot. We want to avoid that. The solution is something known as HSTS. It's basically a header, Strict-Transport-Security, that you add to your HTTPS response. On that first access you are telling the browser: hey browser, you know what? From now on, my website will always be on HTTPS, so stop going to HTTP by default. The next time, the user will go directly to HTTPS, and we save up to half a second. And also, once you have this, you can opt your site in to the HSTS preload list. Browsers such as Chrome, Safari, Firefox, and Edge take that list as a whitelist preloaded in the browser. So even a brand-new user will type your URL and, if you're on that list, go directly to HTTPS, avoiding one redirect. So, after the first response was sent to the browser, we need to hack the data transfer. Okay, how can we send the data even faster over the network, over the wireless network?
First, I want to talk about something known as QUIC. Okay, there is no typo there; that's the name, QUIC, like "quick" without the K. It's an experimental protocol that provides HTTP semantics over UDP instead of TCP. In fact, if you are using Chrome and you are browsing Google websites, whether Gmail, Google.com, or Google Drive, you are currently browsing the web using QUIC. QUIC reduces latency a lot because it's not using TCP; it's using UDP. But it's compatible with HTTP/2, so basically it's compatible with your current websites and architecture. It has something called zero-RTT (round-trip time) connection establishment. You can analyze that later, but it means that when a user comes back to your website, it takes zero round trips to get the connection open. It improves things a lot. In fact, for the Google search page, it increased performance by 3%. YouTube has reduced buffering by 30% thanks to QUIC. Facebook has a similar protocol for their app, improving performance by 2%. And according to Google, three quarters of all requests on the web could be faster using QUIC instead of HTTP over TCP. QUIC is experimental, but it's currently available in Chrome, and there is work going on in other browsers as well. And of course, you need to add something on your server so you can serve your website over QUIC when the browser is compatible with it. Cool. Next: Zopfli. Repeat with me: Zopfli. This is a new algorithm that produces gzip-compatible output while compressing more. It can save up to 8% of the data transfer. The same HTML, when you send it over the wire or the wireless channel, will occupy fewer bytes. The catch is that it's about 80 times slower, so it consumes 80 times more CPU for compressing, not for decompressing. But because the output is gzip-compatible, it works with IE6; it works with every browser.
So you can start applying this on your server, and your website will appear faster on users' devices while you save data transfer at the same time. If you want to push this even further, there is another algorithm known as Brotli. By the way, we are talking about Swiss German names here. In this case, you can save up to 25% compared to gzip. But it's not gzip-compatible, so you need a browser that supports Brotli. You check the Accept-Encoding request header, and if Brotli is there, you send the Brotli version. There are several Brotli-compatible browsers today, including Chrome and recent versions of Safari and Firefox. LinkedIn, for example, has saved 4% in load time with it. So if you take 4% from here, 3% from there, 2% from over there, again, because we have a tiny budget, you can actually save a lot of time and get closer to your performance goals. Facebook has also saved data serving CSS and JavaScript files using Brotli. And by the way, remember the 14.6 KB threshold for the initial response? If we are using Brotli, we can fit up to 80 or 85 kilobytes of HTML in that first response. Readable streams. This is also interesting; it's arriving right now in some browsers. It lets you process data as soon as it arrives. Let me give you a quick example. Today, if you have a web app, it can be a React application, an Angular application, a Vue app, or vanilla JS: you download a JSON file, you parse the JSON client-side, and you render something into the HTML. That's basically the idea. But let's say you're downloading 100 items, I don't know, clients, cities, whatever. You start parsing that JSON file only when you have all 100. You cannot take a chunk of the JSON as soon as it arrives from the network.
When you receive the file using Ajax or the Fetch API, you receive it only when you have the whole file. So what if we could start receiving pieces, chunks of the response, from the network layer, analyzing each chunk, and rendering as soon as one part of the JSON is available? That API is called Readable Streams; it's available in some browsers, and you can start playing with it. And it can improve paint metrics when you are doing this kind of client-side-rendered solution. So that's the initial page, and also sending the data from the server to the client. What about loading the rest of the resources: images, videos, web fonts, style sheets, and JavaScript files? We have modern cache-control mechanisms that you can start using. For example, how many of you are React or Angular developers? Let's see. Okay, around 20%. So you're probably using the CLIs, or webpack at least. These CLIs, when compiling your apps, create file names with a hash for your resources. For example, your styles.css file is not styles.css but styles.[hash].css, something like styles.xh024.css. And every time you change that file, the name changes. So now we have a way for the browser to skip conditional requests: we know for sure that the file will never change, because if we change the CSS, the React or Angular CLI will change the file name. So we tell the browser: hey browser, if you have a cached version of that file, use it, because I will never change it on the server. That's known as Cache-Control: immutable. It is currently available in some browsers. In Chrome, it's kind of funny: Chrome is not supporting it yet, but it has a similar, intelligent solution that gets to the same result. And also, there is a new pattern that was, until now, available mostly with service workers.
How many of you have played with service workers? Let's see. Okay, quite a few, only 1%. We need to change that. There is a pattern known as stale-while-revalidate. You always serve the file from the local cache, so you serve it really quickly: it's in my local cache, here you have it. But it might be outdated, because maybe there is a new version on the server. So the browser updates it in the background, and the next time you request that resource, it will be up to date. That's stale-while-revalidate. If you have played with service workers, it's a common design pattern, and now that pattern is coming to browsers natively. You can specify that you want the stale-while-revalidate behavior, and the browser will do it for you: when you actually use the resource, I mean in an img tag or a script tag, it will take it from the cache, but in the background it will go and update the resource for next time. Warming up engines. When the browser is about to download resources, we can warm up the engines and help it start some processes as soon as possible. For example, DNS queries. When you are downloading files from a different domain, a different origin, a DNS query is needed to resolve the name to an IP address. That can take up to 200 milliseconds, and that's a lot, because we have a tiny budget. Also, establishing the TCP and TLS connection can be another 200 milliseconds when you are connecting to another server. So we can announce DNS queries as soon as possible: at the top of your HTML, you can add a link tag with rel="dns-prefetch", and you're telling the browser, hey browser, later I will request a file from that server, so you can start the DNS resolution right now, beforehand. Also, you can use preconnect.
In this case, it's not just the DNS query; it's also going to create the TCP and TLS connection to that server, even if the browser doesn't yet know that you will need a resource from it. And for both, you can use an HTTP header instead of HTML: when you are answering with your HTML, you say, hey browser, you know what? Later, this HTML will need this domain, so start the connection right now. To bundle or not to bundle. What is this? This is a typical question I receive about JavaScript and CSS. Typically you have 10, 50, a hundred CSS and JavaScript files; I've seen that. So should we create one big bundle, or should we create a lot of script tags and link tags for different files? On HTTP/2, bundling seems like an anti-pattern. However, it's not; bundling is still a good idea. In fact, there is a tweet from Paul Irish, from the Chrome team, saying that today bundling is still the best option. So create one big JavaScript file and one big CSS file, but "big" with a caveat: I'm talking about only the content necessary for the initial rendering, not the whole website or page. We need code splitting first, and then, for the initial rendering, only one file. Don't create a five-megabyte JavaScript file; that's not the idea. It should be really small, only the important part, and then you defer and download the rest. Web fonts. Web fonts are nice; we can change the default fonts that we use on websites. But we have a problem that you have probably seen: the flash of invisible text. What is this? You access a website, you see the logo, you see the backgrounds, and no text. Text on the web was always non-blocking; text was just text, it was there, we didn't need to wait for it. But now, when we use web fonts, we are waiting for the font, and that's a problem.
So now, in CSS, you have a way to define how you want fonts to load, with font-display. Look up font-display; you can specify the strategy, so you can create a more performant solution for this. And finally, for resource loading, we have preloading. Preloading goes even further than prefetch, dns-prefetch, or preconnect: we can help the browser prioritize the most important resources, for example the font file. You use a link tag with rel="preload", and you say: this is a style sheet, this is a web font, this is a script; you will need it later, but start downloading it now. So we help the browser download the most important files sooner, with more priority, and that will save a lot of time on your metrics. You can also, of course, use an HTTP response header instead of a link tag. Hacking images. We know that a picture is worth a thousand words; the problem is, only if it loads. So we need to load pictures. First, you need to embrace responsive images. That means you deliver not one image, and not just three versions of the image, but N versions, so each user gets exactly the version that is needed for their particular context. A lot of traffic today comes from images, and you can improve your metrics a lot this way. Also, it's time to upgrade how we create standard JPEG and PNG files. Remember Zopfli, the algorithm I mentioned before for gzip? The same algorithm works over deflate, and PNGs use deflate, meaning that you can compress your PNG files around 20% more using this new algorithm. And it works everywhere, because the output is still PNG. And we also have a JPEG counterpart; it's called Guetzli, another Swiss German name, and it lets you compress your JPEG files even more without reducing quality. These are new compression algorithms, open source and available right now, so you can start helping your users with performance.
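Putting several of these hints together, here is a sketch of what the top of such a page might look like (the file names and the CDN domain are placeholders, not from the talk):

```html
<head>
  <!-- Start DNS + TCP + TLS early for a third-party origin -->
  <link rel="preconnect" href="https://cdn.example.com">
  <!-- High-priority fetch for a render-critical web font -->
  <link rel="preload" href="/fonts/brand.woff2" as="font" type="font/woff2" crossorigin>
  <style>
    @font-face {
      font-family: Brand;
      src: url('/fonts/brand.woff2') format('woff2');
      font-display: swap; /* show fallback text immediately, no invisible text */
    }
  </style>
</head>
<body>
  <!-- N versions of the image; the browser picks the right one -->
  <img src="hero-480.jpg"
       srcset="hero-480.jpg 480w, hero-960.jpg 960w, hero-1920.jpg 1920w"
       sizes="100vw" alt="Hero">
</body>
```

Note the `as` attribute on the preload link: it lets the browser assign the right priority and apply the correct content-security rules to the early fetch.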
So start working with that. And the last thing I have for you has to do with hacking the user experience, with something known as reactive web performance. I'm pretty sure that if you are a Netflix user, you have seen this in action: it's hard to catch Netflix buffering. Netflix changes the quality of the stream as soon as the bandwidth changes. This is a similar idea for the web. We can know about the context today using client-side performance APIs: the Network Information API, Device Memory, Client Hints. Basically, you can detect the user's current connection, the bandwidth, the latency. You can detect the available memory, so you can know if it's a feature phone or the latest Android device, and make decisions based on that. Because maybe you have the latest Android device or the latest iPhone, but you are on 2G, roaming. Then it doesn't make any sense to send a super-high-quality, ultra-resolution image to that device, because it will never arrive, and that's a bad experience. We need to give the user a consistent experience. So, for example, if the user is roaming, don't send web fonts. We can change the service worker's cache policy: if we are on 2G, we say, okay, let's go to the cached version first, because the network will be slow and we want to improve the user's experience. We can reduce the amount of loaded data, and maybe send a low-resolution image no matter the device pixel ratio or the resolution. You might have a 4K screen, but on a very slow network it doesn't make any sense to send that picture; you can just send a low-resolution image. So, are you feeling bad now about what you are doing? Yeah? Good. Performance is a top priority. If you want to succeed on the web, with web apps, it's a top priority, because it has been proven to drive conversion. It will give you conversion.
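A sketch of that reactive idea: the decision logic can be a plain function (the thresholds and property names in the plan are my own assumptions), and in a browser you would feed it values from `navigator.connection`:

```javascript
// Decide what to ship based on network conditions (reactive performance).
// Thresholds and plan fields are illustrative assumptions, not from the talk.
function loadingPlan(effectiveType, saveData) {
  const slow = effectiveType === 'slow-2g' || effectiveType === '2g';
  return {
    // low-res images on 2G or when the user asked to save data
    imageQuality: slow || saveData ? 'low' : effectiveType === '3g' ? 'medium' : 'high',
    webFonts: !slow && !saveData, // skip web fonts on 2G / Save-Data
    cacheFirst: slow              // serve from the service worker cache first on 2G
  };
}

// In the browser (hypothetical usage):
// const c = navigator.connection || {};
// const plan = loadingPlan(c.effectiveType || '4g', c.saveData === true);

console.log(loadingPlan('2g', false));
// { imageQuality: 'low', webFonts: false, cacheFirst: true }
```

Keeping the decision in one pure function makes it easy to test and to reuse both in page code and inside a service worker.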
So you need to push it even more. You need to learn how to measure with the new metrics, and it's a really worthwhile effort. So please, take the time and budget to improve the performance of your website. Please, please do it. So that's an iPhone, okay, by the way. Make it fast, please. That's all from me. Thank you, and see you later, because I have another talk on progressive web apps in the other room. Thanks.