Good afternoon, everyone. I'm Tom Greenaway, and I'm a developer advocate based in Google Sydney. And I'm Martin Splitt. I work out of the Zurich office, and I'm a Webmaster Trends Analyst, which is just a fancy name for being a developer advocate for search and the web ecosystem. Today, Martin and I are going to tackle this topic: the best practices for ensuring a modern JavaScript-powered website is indexable by search. But what do we really mean by this sentence? Well, by best practices, we mean what every developer should know: the techniques, knowledge, tools, and processes. And by modern JavaScript-powered website (I know it's a bit of a mouthful), we mean websites which use modern JavaScript frameworks for their front end and probably render their HTML in JavaScript on the client side. They're typically single-page apps, but this applies to any website that uses front-end JavaScript. And they might be, say, powered with Ajax or WebSockets for content. And lastly, what do we mean by indexable by search? Well, by indexable, I mean the content can be understood. And by search, we mean Google Search. But these rules apply to other web crawlers, too, such as, say, other search engines or social media services. Right, like Facebook, Twitter, and all the other wonderful ones. Cool. Cool, so now that we know what we want to address here, how are we going to split this up? How are we going to go through this? First things first, I think it makes a lot of sense to quickly go over how Google Search actually does the indexing and figures out what content there is on the web. Then we're going to look into something that, as a web developer, I really, really wanted all of these years. And we have that now: the tools to help you all, and us, debug the things that we are seeing. And last but not least, we don't want to just debug things after they happen. We want to make sure that we're getting ahead of that. And basically, we're going to talk about a few best practices for indexable content. Cool. So with that out of the way, it's good that I have you here, because I have got something to talk to you about. OK, that sounds a bit intense. What do you mean? Don't worry, it's not that intense. It's just I have, how do I put this? I have a friend who's called Marvin. Let's call him Marvin. And they are building a single-page web app. And they have done that, and they have followed most of the things that you do these days, like PWA and all that kind of stuff. But they have issues getting users to find their content online. I see. And unfortunately, that's not an isolated issue. Right, right. So you ran a Twitter poll, and other people are encountering this as well. Yeah, I got a bunch of responses. And it's like, so how can I check if my stuff is findable in search, or hey, my stuff's not showing up, but I don't know why? And I think that needs a little bit of a dive and an explanation. Right? I guess that's a good idea to do. Well, there are a lot of tools available for debugging on Google Search. But perhaps before we get into those, how about we just go over how Google Search sees and indexes the web? I think that's a fair point. That sounds good. Well, here's a basic diagram of how websites were traditionally indexed. Googlebot, our search crawler, found pages, downloaded them, processed that content, put it into an index, and then performed more crawling based on the links it found.
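To make that traditional crawl-process-index loop concrete, here is a minimal conceptual sketch in JavaScript. It is purely illustrative, not how Googlebot is actually implemented: the in-memory queue, the `seen` set, the regex-based link extraction, and the seed URL are all simplifications.

```js
// Conceptual sketch only: crawl -> download -> process/index -> follow links.
// Not Googlebot's real implementation. Requires Node 18+ for built-in fetch.
const index = new Map();                 // our stand-in for "the index"
const queue = ['https://example.com/'];  // seed URL (placeholder)
const seen = new Set(queue);

async function crawl(limit = 50) {
  while (queue.length > 0 && index.size < limit) {
    const url = queue.shift();
    const html = await (await fetch(url)).text(); // download the page
    index.set(url, html);                         // "process" and index it
    // Perform more crawling based on the links found (naive regex extraction).
    for (const [, href] of html.matchAll(/href="(https?:\/\/[^"]+)"/g)) {
      if (!seen.has(href)) {
        seen.add(href);
        queue.push(href);
      }
    }
  }
}

crawl().then(() => console.log(`Indexed ${index.size} pages`));
```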
Yeah, OK, but what happens in the process step? That seems to be where the magic is, right? Well, when we fetched a web page from a URL, and it was a traditional website, that web page was complete when it arrived. And we call this the rendering of the page: the construction of the HTML. So when you say rendering, you don't mean stuff like putting the pixels onto the screen or dealing with DOM transitions and animations and stuff. It's just where the HTML gets constructed, like server-side rendering versus client-side or hybrid rendering? Exactly. Traditionally, websites were rendered entirely on the server, and any JavaScript that was used was probably just for cosmetic purposes. Right, OK, cool. But now that we have figured out that this is about dealing with the constructed HTML, it's no longer that way, right? We have changed our web architecture quite substantially. So what happens there today? In the processing step? Yeah. Yeah, that's a good question. Basically, there's now a renderer inside the process step. So more specifically, we have a version of Chrome that opens the content of the page and runs the JavaScript. And then it spits out the final HTML. But we also have a queue as well, which is quite important. And that kind of leads into this next point, which John Mueller and I revealed at I/O earlier this year, which is that, in a nutshell, the rendering of JavaScript-powered websites in Google Search is actually deferred until Googlebot has the resources available to process that content. Right, deferred. So what kind of timeline are we talking about? What's the delay? Well, it could take minutes, or maybe an hour, or maybe even days or up to a week before the render is actually completed. A week? Yeah, I know. What? OK, that's, wow. It's a shock, I know. But you have to understand that the web is really big, right? It's quite huge. In fact, we've found over 130 trillion documents on the web so far. OK, that's a mouthful. And this number is two years old, and I guess the web's growing. OK, I understand that. Cool. But is there anything that we could do to help the crawlers a little bit? I mean, if I remember correctly, when I attended the session at I/O, you said something about dynamic rendering. Is that something that would come in here? Yeah, exactly. So dynamic rendering is a technique that allows us to sort of short-circuit the rendering pipeline by delivering a server-side-rendered version of your normally client-side-rendered website, by running that client-side JavaScript on the server. For example, a headless browser controlled by something like Puppeteer could be used. Oh, right. Yeah, that's pretty cool. So how does that work in detail? Well, here you can see how a server identifies that the client requesting the page is a user's browser. And then it serves a payload of HTML and JavaScript that gets rendered on the client, right? So that's basic stuff. But when a crawler like Googlebot makes a request, the server sends a different payload. Instead of sending the HTML and JavaScript directly to the crawler, we send what is normally sent to the browser to the dynamic rendering service, and we run the payload through that service. The dynamic renderer then spits out a completely statically rendered HTML payload for the crawlers. Ha, that's pretty smart, OK.
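A minimal sketch of that request flow, assuming Express and Puppeteer. The bot pattern, port, and `app.html` file are illustrative placeholders, and a production setup would cache renders rather than launch a fresh browser per request.

```js
const express = require('express');
const puppeteer = require('puppeteer');

const app = express();
const PORT = 8080;
// Illustrative bot list; match whichever crawler user agents you care about.
const BOT_UA = /googlebot|bingbot|twitterbot|facebookexternalhit/i;

app.get('/', async (req, res, next) => {
  if (!BOT_UA.test(req.get('user-agent') || '')) {
    // Regular browsers get the normal HTML + JavaScript payload,
    // rendered on the client as usual.
    return res.sendFile(`${__dirname}/app.html`);
  }
  try {
    // Crawlers: run the client-side JavaScript in headless Chrome and
    // return the completely statically rendered HTML instead.
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(`http://localhost:${PORT}/app.html`, { waitUntil: 'networkidle0' });
    const html = await page.content();
    await browser.close();
    res.send(html);
  } catch (err) {
    next(err);
  }
});

// Serves app.html and its assets, including for the headless render above.
app.use(express.static(__dirname));
app.listen(PORT);
```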
And to be clear, that dynamic renderer could be an external service, or it could be running on the same web server infrastructure. All right, yeah, that makes sense. I guess for this kind of stuff, you can use tools such as, I guess, Puppeteer is one that you mentioned already. But you could also probably use Rendertron, which is a higher-level abstraction. Puppeteer is basically an npm module that you install, and it remote-controls, well, programmatically controls, a Chrome instance that runs headlessly, which is great. But I like something more high-level. And I think Rendertron steps in there, where you basically just have a Rendertron server running that uses Puppeteer to steer that headless Chrome. And you give it a URL to render, and you get the rendered static HTML back. That's pretty cool. OK, it's fantastic. I guess you can also deploy that pretty easily. I think there's this thing called Google Cloud Platform. That's probably pretty easy to deploy to. But I guess you can also deploy it pretty much anywhere else, right? Do you have an example? I do have an example, actually. So there's an npm module that is called the Rendertron middleware. So if you're using, let's say, for instance, Express.js, you can use that as middleware. What you're doing here is basically, as a first step, you require it. You need to get that into your project. And then you basically configure it to do the right thing for you. In this case, we want to specifically jump the rendering queue for Googlebot. Rendertron by default doesn't do pre-rendering for Googlebot, because Googlebot does run JavaScript. But we want to get the advantage here anyway. So we can just use the pre-configured, pre-built list of user agents it renders for and add Googlebot in there. And then once we have that configuration ready and have imported it, we can go on to actually use it in our application middleware stack. So we can basically point it to the running server somewhere and say, for all these user agent patterns, pre-render. By the way, now that I have got you here, because you never respond to emails in a timely fashion, which is fine, neither do I. No offense. And I mean, Chrome Dev Summit is a big event. So this rendering does cost a bit of resources. So I'm wondering, what does Rendertron really do to figure out what's Googlebot? And how can I verify that it really is Googlebot, and not just someone pretending to be Googlebot? Yeah, well, the easiest way is user agent sniffing for the Googlebot string. Here's an example with the mobile user agent for Googlebot. But you might want to do this for other services as well that you want to serve pre-rendered content to, like social media services. And for Googlebot, you can additionally do a reverse DNS lookup to confirm that the request really is coming from Google. And like I said, this is the mobile user agent, so you can detect the desktop user agent as well. And that URL will give you a list of all the different user agents. Right, that's nice. Which reminds me, actually, I should sync with John Mueller and check if there are new tools in Search Console for this stuff. Or maybe you can tell me. I was going to say, why do you bring John into this? Come on. Sorry. I'm here. I'm, like, literally a meter away from you. Or I don't know how many feet or inches that is. No idea.
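The steps Martin walks through map onto the rendertron-middleware package roughly like this; a sketch assuming Express, with a placeholder Rendertron instance URL. Note how Googlebot gets added to the pre-built bot list, since the middleware leaves it out by default.

```js
const express = require('express');
const rendertron = require('rendertron-middleware');

// Step 1: take the pre-built list of bot user agents and add Googlebot,
// which rendertron-middleware excludes by default because Googlebot
// can run JavaScript itself.
const BOTS = rendertron.botUserAgents.concat('googlebot');
const BOT_UA_PATTERN = new RegExp(BOTS.join('|'), 'i');

const app = express();

// Step 2: for requests matching those user agent patterns, proxy the URL
// through the running Rendertron server and return pre-rendered HTML.
app.use(rendertron.makeMiddleware({
  proxyUrl: 'https://my-rendertron-instance.example/render', // placeholder URL
  userAgentPattern: BOT_UA_PATTERN,
}));

// Everyone else gets the normal client-side rendered app.
app.use(express.static('dist'));
app.listen(8080);
```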
But basically, we have a bunch of stuff for you, and I'd like to walk you through that. So you know the Google Mobile-Friendly Test already? Yeah. So this is kind of useful. It shows you if your page is mobile-friendly, and it gives you a screenshot of what Googlebot is seeing. And it's pretty easy to use. You just paste your URL in there. But it does more than just that. Because it also gives you this, which is what I always wanted to have: when Googlebot does not render what you expect it to render, you get the JavaScript log messages that you would otherwise get from Chrome DevTools. So you can really debug the JavaScript. You can really debug this. And you know, here's my favorite. I mean, we had this "is it undefined?" question earlier on. Apparently, it is undefined. And undefined is not a function, which is unfortunate. But that happens. Also, do you know about the new URL Inspection tool that we've got in Search Console? I don't think so. Can you remind me? Yeah, sure. So if you have a verified property in Search Console, you can basically paste any of the URLs belonging to that property into the URL Inspection tool. And you get when we crawled it and whether it's on Google or not. And you have a bunch of information. And you can run a live test as well. So this blog post here is, drumroll, not in the index. But that's fine, whatever. We also have something else. It's actually a pre-announcement that we're going to make at Chrome Dev Summit now. So can we get a little bit of a drumroll? So we'll get a live code editing feature in Search Console. OK, what does that do? Fair enough. I think, yeah, OK. So imagine you're building a website. Let's say there's an after-party today. So you build the website for the Chrome Dev Summit, and you want to check if your structured data works to get highlighted in search results. You want to check that as quickly as possible. Yep, that makes sense. You want to be able to iterate quickly, and you don't want to have to wait for deployments of your site. Yeah, exactly. I want a development cycle that makes more sense. So what you can do is paste that into this wonderful tool. And here we have an example. We are using JavaScript to create a script tag that contains the structured data. And we have all our wonderful structured data for the event here. And then we can click on the little Test Code button. And what it gives us is this. And we're like, yay, our event is available. And there's a code editor over here. And what we see over there is that it's missing the performer for the after-party. Oh, wait, that's a shame. I think it's meant to be, like, the Chrome Dino. It's the Chrome Dino. So we should add this performer. So what we can do is go straight back into the code editor and click the button again. And it live-updates as we retest our code. So we can do all of this in the browser, in a single tool. And I think that is pretty fantastic, really. That's pretty good stuff. OK, but yeah, that is definitely a neat way of testing and trying our code out in the wild quickly. That's something that Search Console generally tries to do, right? So you have this really nice flow. So let's say someone in your company or agency or wherever has access to Search Console. I don't normally have access to Search Console as a developer, because I have so many other things to worry about, really. And then someone finds an issue through one of the reports. So how do they get this information to me?
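A sketch of what the demo's approach might look like: using JavaScript to create a JSON-LD script tag for the event, with the performer included. The event details and names here are invented for illustration; only the schema.org `Event` type, the `performer` property, and the `application/ld+json` script tag are standard mechanics.

```js
// Illustrative structured data for an event; all details are made up.
const structuredData = {
  '@context': 'https://schema.org',
  '@type': 'Event',
  name: 'Chrome Dev Summit After Party',
  startDate: '2018-11-12T19:00:00-08:00',
  location: {
    '@type': 'Place',
    name: 'Somewhere in San Francisco', // placeholder venue
  },
  // The bit the Rich Results Test flagged as missing in the demo:
  performer: {
    '@type': 'PerformingGroup',
    name: 'The Chrome Dino',
  },
};

// Create the script tag that crawlers and the Rich Results Test will read.
const script = document.createElement('script');
script.type = 'application/ld+json';
script.textContent = JSON.stringify(structuredData);
document.head.appendChild(script);
```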
Well, one way of doing so is basically they can just go and see this issue from the reports page, where there's a bunch of samples. In this case, the content is wider than the screen. And in the second stage of this, they can basically just say, all right, our developer might be an external developer, and we don't want to give them access to all the data. So we share this particular issue with them, and they get a link. They don't have to sign in or anything. They can just use the link that is shared here to see what the issue is and get access to the documentation that explains what the problem is and how to fix it. And then, last but not least, when I as a developer go, like, I fixed this, I have this under control, well, we know that that's often not true. So what Search Console offers is this Validate Fix button. So if I'm like, Tom, I've got this, I fixed this, I can press this button and go, like, gotcha, in 10 minutes. Yeah, right. OK. So it really establishes a flow. And it does. It is a really nice workflow, and it works across departments, which I think is pretty fantastic as well. Yeah. That's nice. But actually, there is another addition to our dev tools. Right. You were talking to me about that. Yeah, exactly. I'm sure everyone obviously knows about Lighthouse. They've been to the forum space and they've seen the awesome statue we've got there. It's fantastic. Yeah, what if I told you that there are actually SEO audits inside of Lighthouse? In fact, we've got a few more coming soon. So basically, this can automatically detect things like whether your HTTP response status codes are valid, and meta tags as well, like whether you've got correct title and description tags, or whether you've got hreflang set up correctly, if you're using that for localization. And also descriptive link text, even for anchors. "Click here." Exactly. You want to avoid "click here", because it doesn't really communicate what the thing you're linking to actually is. "Number five is going to surprise you." And then robots.txt validity. And the new features we're adding are automatic detection of the size of tap targets, and the margin around tap targets, to make sure that they're nice for users. And also structured data testing as well, which we were just talking about. Yeah, we're going to get more of that. That's fantastic. That's really good to see. Cool. All right, so I think from the tools side, we have the Lighthouse audits for SEO, we have Search Console, and we have the Mobile-Friendly Test and the Rich Results Test with editing features. I think we're pretty good on that front. But do you have any recommendations in terms of best practices that I should tell my friend Marvin? Yeah. Except, my friend. I'm serious. Like, this is a friend of mine, it's totally not me. Any best practices that we should recommend to them? Yeah, yeah. Let's go through a few. OK, cool. Well, firstly, it's important to know, remember how I said that Googlebot is running Chrome nowadays? That is fantastic. Finally, we have a modern browser. Well, actually, wait a second. It is Chrome, but it's not actually the latest version. It actually runs Chrome 41. Right, 41. It's not even 42, the answer to life, the universe, and everything. It's just 41. Not quite there yet, but we're working on it. But seriously, though, since it's Chrome 41, and Chrome 41 was released in 2015, it doesn't support all of the latest features of modern browsers. So, for example, it doesn't actually support ES6. So the latest language features aren't available.
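For reference, a sketch of running just the Lighthouse SEO category from Node, assuming the lighthouse and chrome-launcher npm packages; the CLI equivalent is roughly `lighthouse <url> --only-categories=seo`.

```js
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

async function seoAudit(url) {
  // Launch a headless Chrome instance for Lighthouse to drive.
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  const result = await lighthouse(url, {
    port: chrome.port,        // connect to the Chrome we just launched
    onlyCategories: ['seo'],  // skip performance, accessibility, PWA, etc.
  });
  await chrome.kill();
  // Scores are 0..1; the individual audits list what passed and failed.
  console.log(`SEO score for ${url}: ${result.lhr.categories.seo.score * 100}`);
}

seoAudit('https://example.com'); // placeholder URL
```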
And while it has web components, it's actually version zero of the spec. And another thing to note is that it's actually stateless, which I'll explain in a little bit. Ah, OK. Yeah, it's an interesting one. I'll break it down. For example, with this code, how many ES6 features can you spot? A few. That's a good answer. Yeah, OK. More than zero. Yeah, exactly. Now, we might forget that some of these are relatively recent features, like enhanced object literals, or const declarations, or template literals with backticks and variable substitution. Yeah, you use them every day, right? Yeah, exactly. So one way to deal with this, if you need a solution, is to use something like Babel, which allows you to transpile ES6 code down to ES5 automatically. And you can easily compile a set of files, or a directory, and compile it into a single file for serving. And using presets, you can also specify a minimum browser version to use as a baseline. So you can ensure that ES6 code goes to browsers that can support it, and browsers that can't get the ES5-transpiled code. Right, that makes sense. And now, while Chrome 41 does have web components, it's actually an older version of the spec than you're probably used to. So after Chrome 41 shipped, several features, such as custom elements and shadow DOM, had some changes made to their specs. So depending on exactly which version 0 features are used, it might be very simple to migrate, or it might get more complicated. But the most important thing is to be aware that there are differences. Lastly, this probably shouldn't come as too much of a surprise, but Googlebot basically doesn't really have any memory. And what I mean by that is that every time it accesses a web page or a site origin, it always acts like it's the first time it's ever encountered that website. To achieve that, we turn off a bunch of things. So we don't have service workers running, so there's no service worker cache, and we don't have local storage or session storage and so forth. Makes sense. If you click on a search result, it's like you're coming to that page for the first time. So we want to make sure, yeah, what is the first-time user experience when you encounter it? Makes sense. So, Martin, do you have any suggestions for how we can substitute for some of these things? So I think if you look at things like the web components and a few of these features, like IntersectionObserver and all these things, I guess polyfills are a good way, once you've already done your homework and done the progressive enhancement, or graceful degradation at least. But the problem with polyfills, I feel, at least, is that there's a bit of a risk of shipping dead code to people, giving them a bunch of stuff over the network which, depending on where they are and what plan they are on, might actually be pretty expensive and time-consuming. So you want to reduce that. And actually, I really like this library called polyfill.io. So polyfill.io basically sniffs the user agent, not the user, of the browser requesting it. So if the crawler comes by, it goes, oh, this is Googlebot running Chrome 41, so I'll give it a bunch of stuff that a normal user on a more recent Chrome or Firefox or Edge or whatever doesn't need. So basically, it tries to give you just the right amount of code that you need to make this work. So that's pretty fantastic, I think. Yeah, that's kind of cool. Yeah.
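As a sketch of the Babel setup described above: a `babel.config.js` using @babel/preset-env with Chrome 41 as the explicit baseline. With targets set, preset-env only applies the transforms the target browsers actually need.

```js
// babel.config.js: transpile modern JavaScript down to what Chrome 41
// (Googlebot's rendering engine at the time of this talk) understands.
module.exports = {
  presets: [
    ['@babel/preset-env', {
      targets: { chrome: '41' }, // minimum browser version as the baseline
    }],
  ],
};
```

A directory can then be compiled with, for example, `npx babel src --out-dir dist`, or bundled into a single file for serving.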
Is there any place where I can find out more about these feature gaps, where features are missing in Chrome 41 that are there in a modern one? Yeah, definitely. If you check out caniuse.com, it's a great resource for this sort of thing, because you can check features across various browsers. You can also specify specific browser versions. You can say, hey, what's different in Chrome 41 specifically? Cool. That's pretty nice. And also, the golden rule of any kind of indexability and building websites for search crawlers and that kind of thing is to just make sure you test really frequently. Yeah, fair enough. Test, test, test. Fair enough. Cool. So going back to dynamic rendering for just a second, and kind of the tools discussion. So if I test my stuff and I figure out, like, oh no, this feature is really tricky to work around and I don't want to change my code, I guess I can use dynamic rendering for that. But I guess there are trade-offs, and there are situations where I shouldn't be doing that. So what are the sites that should, generally speaking, look into dynamic rendering? Well, because the rendering queue can introduce some delay, if your site has lots of frequently changing content, like a news publisher or something like that, and you've got maybe breaking news and articles that are coming out and changing very frequently, you might want to look into using dynamic rendering to overcome those delays. And if your pages use features that aren't available in Chrome 41, and it's not possible to work around those limitations, maybe in the short term, then using dynamic rendering is a useful workaround until either Googlebot catches up or you have the time to adapt your own implementation. Right. That makes sense. And also, while Googlebot supports JavaScript, other crawlers might not. So, for example, if your site gets a lot of social media interactions, those crawlers tend to not run JavaScript. So if I share a link on social media and I want a nice preview card to be created, the crawler probably wants to try and access the image and the title and the meta description or something like that. So if you're using client-side rendering for those things, you might just get the raw, unpopulated templates in the content, without an image in the preview, which would be bad. So in order to get better representations there, you can also serve those crawlers a dynamically rendered version of the page. Oh, that sounds pretty good. But ultimately, dynamic rendering is a powerful technique, but it's still a workaround. So, Martin, do you have any idea if we've got plans to improve this on the Google side? Yeah, way to put me on the spot. So I don't like to make predictions on that kind of stuff. But definitely, the way that our infrastructure is set up, there is a bit of a gap between when we actually do the crawling, and when we execute the JavaScript and do the rendering and the indexing bits. So we're trying to bring these closer together, which is an interesting technical challenge. And at the same time, we're trying to catch up with Chrome. But we don't want to just catch up, because then we're going to have the same freaking conversation in a few years, like, oh yeah, Chrome is running version 70, and everyone's like, oh, when does Googlebot get into three-digit land? So we're basically working on a process that we're hopefully going to start very soon.
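Picking up the social media point from above: one way to extend the earlier bot pattern so that crawlers which don't run JavaScript also get the dynamically rendered version with the real title, image, and description. The user agent substrings are real crawler identifiers, but the list is illustrative, not exhaustive.

```js
// Illustrative, non-exhaustive list of crawlers that typically
// don't run JavaScript.
const SOCIAL_BOTS = [
  'facebookexternalhit', // Facebook link previews
  'twitterbot',          // Twitter cards
  'linkedinbot',         // LinkedIn previews
  'slackbot',            // Slack link unfurling
];

// Combine with Googlebot and reuse with, e.g., the Rendertron middleware
// sketched earlier:
//   makeMiddleware({ proxyUrl, userAgentPattern: PRERENDER_UA_PATTERN })
const PRERENDER_UA_PATTERN = new RegExp(
  ['googlebot', ...SOCIAL_BOTS].join('|'),
  'i'
);
```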
So I make no promises on when, but we are working on figuring out a process to stay up-to-date with Chrome, so that Googlebot runs with the Chrome release schedule and we are basically giving you an evergreen Googlebot, yeah? That would be fantastic if we could do that, yeah. And also the shortened rendering delay. Yeah, exactly. That goes hand-in-hand, really, because we have to touch the rendering infrastructure anyway. So we might as well do both things, but I think we might get one update quicker than we get the two things together. But we'll see, yeah. Cool, all right. So thank you so much for all the talk about how the indexing works and its bits and pieces. That was pretty fantastic. I learned a bunch of new stuff. Yeah, and I hope this helps with Marvin as well. Marvin is going to love what I'm going to tell him. I'm sure. OK, yeah. Well, thanks for showing me all those Search Console tools as well. Oh, yeah, they're a lot of fun, yeah. And if developers need more support and help, like Marvin, where can they get it? Yeah, so I'd try to get Marvin to go to our documentation, because the documentation is expanding quite quickly, and we have a bunch of cool documentation coming up and good documentation already there. We have Webmaster Hangouts, so you can pop by and talk to us over a Hangout and ask us questions there as well. They are recorded, so if you want to go back to one that has happened where there was an interesting question, you can find it there as well. We have what's called the Webmaster Forum, where a bunch of fantastic people called product experts are also helping if anything comes up with search. And last but not least, if you've been to the forum space before, we have a Search Console booth where you can try out Search Console and get stickers and stuff. So definitely check that out tomorrow. So yeah, that's a bunch of resources. Awesome. Yeah. Cool. Well, yeah. Thanks, everyone. Thank you very much for staying with us.