Today, I will be talking about building fast and performant apps. This talk is mostly about the learnings and experiences we had building Housing Go, our new revamped mobile website. Let's start. So what comes to your mind when you think about fast and performant apps? There are a lot of things: first paint, page load time, document complete, DNS lookup, and many others. For the scope of this talk, we'll focus on three main metrics that we'll track throughout: the first paint, the first meaningful paint, and the first interaction. The first meaningful paint refers to the time when the user actually sees the content they performed the action for. By first interaction I mean the first JavaScript-enabled interaction, that is, when your JavaScript is downloaded and executed and you can handle the user's interaction via JS. One important factor behind all of this is the critical rendering path: the path the browser takes to render the first view of the webpage. It has a few components: downloading the first HTML, downloading your CSS, downloading your JavaScript, processing them all, and then displaying the content. This talk goes something like this: we'll take a very unoptimized version of Housing's mobile website, look at the numbers at the start, and then go about improving the three metrics we defined. The first version of the website looks something like this. All the videos were recorded on a 3G connection from Dallas, in Chrome with mobile emulation. Let's see it again. Many of you will have noted that the first paint happened at around 5.2 seconds, there was some meaningful content on the page at 7.4 seconds, and the first JavaScript-enabled interaction happened at around 6.9 seconds. I'll come to how we determined these times in the later half of the presentation. So let's talk about how to optimize the first paint.
So the first pixel painted on the webpage depends basically on how soon you download and parse your CSS, so that the render tree can be built and layout can happen. For this version of the website, the code looked something like this: a head tag with a link stylesheet tag pointing to our CSS, then the body and the rest of the page. Let's look at the waterfall. The first request went for the HTML, the second request is for the CSS, and only when the CSS request completes at around 4.7 or 4.8 seconds does the start render happen. So that request is render-blocking: the browser had to wait for it to complete before displaying the first pixel on the screen. How can we avoid this? The answer is inlining critical CSS. The word critical is very important here. Critical means only the CSS required to display the above-the-fold content that the user is seeing. The inlined CSS code looks something like this. We have divided our CSS into two parts: the app CSS, which is global throughout the website, and the search CSS for the page we are looking at. This is the heaviest page of our website; it has a lot of API calls, and a lot happens on the server and on the client end as well. When we inline the CSS, let's look at the waterfall. If you see the second request, the CSS request has been eliminated. You increased your initial index HTML size, but that helped bring your first paint down from 4.7 seconds to around 3.5 to 3.6 seconds. Let's compare the two side by side. The first paint has moved down to 3.8 seconds, and the first meaningful paint has also moved down to 7 seconds compared to what we had earlier. So these were the numbers earlier, and this is what we did with it.
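A minimal sketch of the idea (the class names and rules here are illustrative, not our actual markup):

```html
<!-- Before: an external stylesheet that blocks the first render -->
<head>
  <link rel="stylesheet" href="/assets/app.css">
</head>

<!-- After: only the above-the-fold rules are inlined; the rest of the
     CSS is loaded lazily after the window load event -->
<head>
  <style>
    /* app CSS (global) + search CSS (this page), critical rules only */
    header { height: 48px; background: #fff; }
    .listing-card { min-height: 120px; }
  </style>
</head>
```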
The first JavaScript-enabled interaction time also moved ahead, because your JavaScript now starts downloading much earlier than it did before. Cool. So that's how we improved the first paint of the website. We'll keep the first meaningful paint for the end and continue to the first interaction. To explain the first interaction, I'll explain how we architected our website. Each page has an index HTML and three JavaScript files that are needed to enable the user to interact using JavaScript: vendor.js, app.js and view.js. Vendor.js is a bundle of JavaScript files that we rarely update; it's mostly a collection of libraries that are global throughout the site, like React. App.js is a bundle of the JavaScript files that hold the routing logic and the global functions required throughout the application. View.js is the file that controls the JavaScript interactions for the given view, the page being rendered. To improve the first JavaScript-enabled interaction time, it's very important to download your JavaScript as soon as you can and also execute it as soon as you can. Our application being a single-page application, we wanted all these JavaScript files to download as early as possible but execute in order. The execution order is very important: first vendor.js is executed, then app.js, then view.js. Let's look at the different techniques currently available for downloading JavaScript files. First, we have plain script tags: normal script tags with src pointing to your JS resource. The good part is that they download your JavaScript files and execute them in order, one by one: first vendor.js, then app.js, then view.js. But they are also parser-blocking.
Any time the browser encounters a script tag in your HTML, parsing of the DOM stops; the browser downloads the resource, executes it, and then moves ahead. So plain script tags are not good if you have multiple JS files or dependencies between multiple JS resources. Then there's the defer attribute on script tags. What defer does is make your script tags non-parser-blocking: the browser doesn't wait for the resource to be downloaded and executed before continuing to parse the DOM. Your JavaScript resources are downloaded in parallel, but they are executed only just before the DOMContentLoaded event fires. So if your page has a lot of images or other resources that push this event further out on the timeline, you are, in a way, intentionally delaying the JavaScript-enabled interaction time for the user. Then we have the async attribute on script tags. Async is wonderful: it makes the script tags non-parser-blocking, and it executes each JavaScript resource as soon as it's available in the browser. The problem we had is exactly that: because async tells the browser to execute each resource as soon as it's available, we were not able to maintain the execution order of the JavaScript resources that we needed. It was very important for us that this execution order stayed constant throughout the website, so this didn't work for us. We did a bit of research, and we landed here. So what's this? It's a function that appends script tags with a given source to your document. The good part is that dynamically injected script tags are asynchronous by default: they don't stop the parsing or rendering of the document in any way.
Also, by setting the async attribute to false, we were telling the browser: don't execute these in arrival order, execute them in insertion order. For us, this was exactly what we were looking for: it gave us both non-parser-blocking downloads and a guaranteed execution order. We were not fully convinced it would work for us, so we tried to measure a few things: when does our app.js load, when does our view.js load, and when does the componentDidMount of the view get called? That last one stands in for the JS-enabled interaction. We used the window.performance.mark API, placing a mark at the bottom of each JavaScript file and one at the end of componentDidMount. We ran it a few times and put the data on a graph. On the x-axis you have the runs, and on the y-axis you have time. If you look at the app.js execution time, the defer technique was taking the longest, and then came the inline scripts. Defer was taking the longest because execution is deferred until just before the document loaded event fires, and inline scripts because a few files, like the vendor JS file, come before it. But to our surprise, the execution time for app.js with dynamically injected scripts was lower by 1.3 to 1.4 seconds compared to the other two techniques. This gave us some confidence that this might work. We saw the same behavior for our view.js and for our time to interaction. That's a huge gain, more than 1.5 seconds, in enabling the user to actually start interacting with the page. So that's what we went ahead with. We added a polyfill for IE and other browsers, but around 90% of our audience is on Chrome, and this worked really well in Chrome.
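A minimal sketch of the ordered async loader and the measurement marks (the function names and the `doc` parameter are mine for illustration; the talk only describes the technique):

```javascript
// Inject script tags dynamically. Dynamically injected scripts are
// async by default (non-parser-blocking); setting `async = false`
// restores insertion-order execution while keeping parallel downloads.
function loadScriptsInOrder(urls, doc = document) {
  return urls.map((url) => {
    const script = doc.createElement('script');
    script.src = url;
    script.async = false; // download in parallel, execute in order
    doc.head.appendChild(script);
    return script;
  });
}

// Measurement: at the bottom of each bundle (and at the end of the
// view's componentDidMount) we drop a mark on the performance timeline.
function markExecuted(name) {
  performance.mark(`${name}-executed`);
}

// Usage: enforce vendor -> app -> view execution order.
// loadScriptsInOrder(['/vendor.js', '/app.js', '/view.js']);
```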
From the first interaction time being 6.5 seconds, we brought it down to 4.7 seconds. That's a huge gain in enabling the user to interact with your page. But this approach had a few cons. Dynamically injected script tags are generally considered harmful, because the browser has a thing called the preloader, or pre-scanner. While your document is being parsed, or while the browser's main thread is busy rendering or doing some other work, the preloader peeks ahead in the document and looks for script tags with sources, link tags, and image tags with sources. Whenever it finds such tags, it downloads the resources ahead of when they are actually needed, which gives the browser the capability to download resources efficiently. This is the part we missed: we couldn't get the advantage of the browser preloader, because our script URLs only appear at runtime. But the browser offers some other tags that let you tell it: we will need this resource, download it for us. Those are the pre-browsing resource hints, and there are four or five of them. Let's start with dns-prefetch. We use CDNs for our assets, and we have multiple CDNs: one for assets, another for images, and so on. We definitely know that we are going to make connections to these CDNs to download our assets. By putting a dns-prefetch tag in our head, we tell the browser to resolve the domain up front, so that when the request for the resource goes out, we save that time; we don't waste it on DNS resolution for that domain. Great. Then we have the preconnect tag. What preconnect does is, along with DNS resolution, it also performs the TCP handshake and TLS negotiation. That actually helps you eliminate a lot of time when you have resources coming in from multiple domains. That's what we are currently using in our code as well.
Then we have a very powerful hint, link rel=prerender. If you give the browser something like link rel=prerender with an href pointing to a page, it will actually load the complete page in the background. Say you have a page with only options A and B, and from your data you know that 90% of users click A. If you put this tag in your head and tell the browser to render page A in the background, then when the user clicks A the experience is almost instantaneous: the browser has already rendered the page, and it's available right there for the user. But use this tag very cautiously. If you put a lot of prerender tags in your head, it may choke the browser: the network may be saturated and not free for the contents of the actual page. Moving ahead, we have the prefetch tag. Prefetch is for resources that you definitely know will be required on the page in the future. The implementation is optional for some browsers, and the download priority is very low. Say you have a big image that's required below the fold. You want to tell the browser: start loading this image whenever you get time. You can do that with this tag. Then there's one more powerful tag, preload. The benefit preload has over prefetch is that resources declared with preload are fetched at the highest priority; they are treated as necessary for the current navigation, so the browser tries to download them as soon as possible. Using these tags, we were able to work around the issue of not being able to tell the browser preloader or pre-scanner which resources we'd be using on the page.
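Put together, the hints look something like this (the hostnames and file paths here are illustrative, not our actual CDN names):

```html
<head>
  <!-- Resolve CDN domains early; preconnect also does the TCP handshake
       and TLS negotiation -->
  <link rel="dns-prefetch" href="//images.example-cdn.com">
  <link rel="preconnect" href="https://assets.example-cdn.com">

  <!-- Highest priority: needed for the current navigation -->
  <link rel="preload" href="/assets/vendor.js" as="script">

  <!-- Low priority, optional: likely needed later,
       e.g. a big below-the-fold image -->
  <link rel="prefetch" href="/assets/hero-large.jpg">

  <!-- Use sparingly: renders the whole next page in the background -->
  <link rel="prerender" href="https://example.com/page-a">
</head>
```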
That also gave us an improvement of 0.1 seconds in the first JS-enabled interaction time. So far, we are at this pace: the first paint is at 3.7 seconds, the first meaningful paint is still at 7 seconds, and the first interaction has come down to 4.6 seconds. Now let's talk about the first meaningful paint. To talk about this, let me introduce the two kinds of applications we have: client-side rendered applications and server-side rendered applications. In a client-side rendered application, the client sends a request to the web server, and the web server sends a shell, a basic HTML with a few links to CSS and JS, back to the client. The client processes it, makes the API calls required to fetch the data for the page, and only then is the data visible to the user. So the first meaningful paint in these kinds of applications comes a little later than in server-side rendered applications. A server-side rendered application looks something like this: the client sends a request to your web server; the web server internally makes the API calls, which are faster because there's no client-side network latency or network issues; the server processes the data, builds the DOM string, and sends the full page back, and the user sees the page. Yes, by doing this you increase the size of the first HTML the browser downloads. But when we were developing our mobile website, our main aim was to show the actual content to the user as soon as we could. So we went ahead with a server-side rendered app for our mobile website, and the difference looked something like this. That's a huge win in terms of the content shown to the user. The metrics moved to these numbers: your first paint increased a bit, because your initial document size also increased, but your first meaningful paint came down from 7 seconds to 3.8 seconds. That's an improvement of more than three seconds.
That was huge for us: showing users the content they came for as early as we could. Apart from this, we did one more thing, continuing the philosophy of pushing the response as soon as possible. When we analyzed our pages, a few parts were almost constant for every route: the head, with the CSS and a few link tags, was essentially the same everywhere. So what we did is stream the HTML, streaming the response from the server. The flow looked like this: the request came in, the server sent a small chunk immediately, then it made the API calls and sent back the remaining chunk. I'll show you how it looks in a minute. This is the code we used as soon as the request came in. Ours is a Node and Express app, so this is the first middleware the request hits, and we just send this head chunk back to the browser; it's only about a 4KB chunk. It looks something like this: it has a few preconnect tags for the domains we are going to connect to, and prefetch tags, because at the time preload had only just landed in the latest Chrome. The prefetch tags let us tell the browser to actually start downloading those resources, and you can see the requests for them going out. Then, once the server was done with the API calls and processing the complete data, we sent back the rest of the response. The initial index HTML size increased from 4KB to 13KB. That's how streaming works. In the timeline, you see the first request going out and the first chunk coming in and being processed. As the first four tags are prefetch tags, the requests for those resources are sent, and then the remaining HTML comes in; the actual requests from the script tags at the bottom of our body then go out against the already-fetched resources. So that's how we tried to push the JS files to the user as soon as possible.
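A sketch of the early-flush idea, written as plain Express-style middleware functions (the function names are mine, and the head chunk is heavily abbreviated compared to our real 4KB chunk):

```javascript
// Static head chunk: constant for every route, so it can be flushed
// before any API call is made. The resource hints let the browser
// start downloading JS/CSS while the server is still working.
const HEAD_CHUNK =
  '<!doctype html><html><head>' +
  '<link rel="preconnect" href="https://assets.example-cdn.com">' +
  '<link rel="prefetch" href="/assets/vendor.js">' +
  '</head>';

// First middleware: flush the head immediately, then let the route
// handler finish the page once the data is ready.
function earlyFlush(req, res, next) {
  res.write(HEAD_CHUNK); // browser starts acting on the hints now
  next();
}

// Route handler: runs after the (slow) API calls complete and
// streams the remainder of the document.
function searchRoute(req, res) {
  const data = '<ul><li>listing markup from API results</li></ul>';
  res.write(`<body>${data}</body></html>`);
  res.end();
}
```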
That helped us a lot in improving our first interaction time: it went down by another 0.4 seconds. So that's how we optimized each and every part of what we were tracking. Thank you. I can show you the current version: it's around 3.6 seconds now, and the JS interaction has also moved down to around 4 seconds. If you run a WebPageTest, you can see that.
Q: With the multiple resource hints you discussed for improving performance, have you also considered domain sharding as one of the key factors?
A: Yes, definitely. We have enabled HTTP/2 on all our CDNs, which helps us push resources over a single connection. For our main assets, we use assets0.housingcdn.com; for assets that are less important to us, we use assets1; and for our images and everything else, we have is1, is2, is3. So yes, that definitely helps.
Q: Are all of those canonical names or actually different servers?
A: They are subdomains of the CDN that we're using.
Q: Your full stack is completely in JavaScript? You're using Express in the back end for REST APIs as well?
A: For making the API calls, yes. But the APIs themselves are not written in Express; we have different services in the back end, in Ruby on Rails, Go, and others.
Q: And what's the database you're using?
A: It's mainly Postgres, and we use Elasticsearch for keeping our responses fast.
Q: I'm asking because when you talk about first paint, the back end also plays a role, right? Because that is what gives you the data back. So that's why I'm trying to understand.
Q: So you have Ruby APIs interacting with Postgres?
A: Yes, with a layer of Memcached and Elasticsearch, so that part is pretty well taken care of. And the API calls from the web server stay within the same VPC, so there's no network latency; they're really very fast.
Q: How did you manage to extract your critical CSS from the bulk of your CSS?
A: If you open Chrome DevTools and look at the network requests of Housing's mobile site, we have something called a lazy view CSS that kicks in after the window load event has fired. The way we've separated the chunks is that the app CSS only has rules for the few classes we use globally: say, for our header, which is constant on all pages, or just to show the first menu of our side menu. We've even kept the rest of the side menu CSS in the lazy view CSS. The hypothesis is that the user takes a few moments to go through the content on the screen, and by that time we'll have downloaded the lazy view CSS. If not, the interaction may feel unresponsive the first time.
Q: There are a few online tools for this, like critical CSS extractors, and there is a Chrome extension as well. How reliable are they?
A: We haven't tried tools like that, but in our experience from doing multiple revamps and rebuilding the mobile website, it's recommended that you write your CSS keeping all these things in mind from the very start, and that you minimize it as much as you can.
Q: I wanted to ask a bit more about the HTML streaming you mentioned. Can you please elaborate on that?
A: As Abhinav also mentioned yesterday, we're using almost the same stack: we have NGINX as the reverse proxy.
There's a flag in NGINX, proxy_buffering, that you set to off, and then, as the response is being produced in your Express app, you keep writing small chunks to the response, and the stream automatically carries them back to the browser. That's how we're doing it.
Q: Interesting tags you specified, but I think they are Chrome-specific, right? What do you do about other browsers you need to support, like Mozilla? Is the experience any different there?
A: We only had to do something different for UC Browser, because it doesn't support dynamically injected script tags; there, the script tags are plain script tags at the bottom of the page. For all other browsers, we used a polyfill. I'll share my slides; there's a link in them describing the polyfill we're using. That's mostly for IE; all other browsers support this behavior by default, so there wasn't actually much effort in making it work across browsers.
Q: Another streaming question: how do you handle the HTTP status for streaming responses? When you send the static chunk first, it's always going to be a 200 OK, right? But your API call may fail.
A: The connection stays open until you end the response. You can send the HTTP status only once, and that's handled when you do response.end; Express handles it internally. I'm not sure I can completely answer that; it may take some research, but Express does it automatically.
Q: Thanks for the talk. Since you already separated out vendor and app, what was the need to also separate out your view JS?
A: We have used React Router with dynamic chunks.
If you don't separate out your views into different chunks, everything just comes in at the top, at your router declaration. Your app bundle becomes very big, and all the JS that isn't even required to enable interaction on the given page also gets downloaded. That's not a good practice; that's why. Thank you.
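The route-level splitting described in that answer can be sketched like this, with a plain route-to-loader map standing in for React Router's dynamic chunks (all names here are illustrative):

```javascript
// Each route maps to a loader that pulls in only that view's chunk.
// In a real bundler setup these would be dynamic imports that webpack
// (or similar) splits into separate files, e.g.
//   search: () => import('./views/search')
const routes = {
  search: () => Promise.resolve({ view: 'search' }),
  details: () => Promise.resolve({ view: 'details' }),
};

const loadedChunks = [];

// Load only the chunk for the current route; the JS for other views
// is never downloaded, keeping the initial payload small.
function loadView(route) {
  return routes[route]().then((mod) => {
    loadedChunks.push(mod.view);
    return mod;
  });
}
```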