Hello, and welcome to the JavaScript SEO Office Hours. Today is June 17th, and I see that we have a few YouTube questions. Besides Dave, no one else has joined the Hangout recording so far. Oh, now as I speak, here's Pedro joining, or not joining us, I don't know. Yeah, here we go. Hi, Pedro, you're on air now. I just started the recording, I'll edit this part out. No worries. All right, how are you doing? Just came to say hi. I'm doing good. Oh, OK, that's good then. Sweet. So I'll officially start the recording now. I mean, I started it, but I'll just retake the intro.

Hello, and welcome to the JavaScript SEO Office Hours for June 17th, 2020. Great to have you watching this video, joining this recording, or asking questions on YouTube. We have a few YouTube questions today that I'll go through. Then I'll give the audience an opportunity to ask their questions, and then you'll have the chance to submit questions for future JavaScript SEO Office Hours on YouTube. Sweet.

So the first question is a React.js question. They get two different sets of homepage SEO tags on the page Google crawled. I'm guessing they mean meta tags or something, I'm not sure. One set comes from index.html, and the other one from the Helmet component. Well, the Helmet component is probably part of what gets rendered into the HTML. All right, let's see. How is it possible? Oh, right, they have checked with a tool, some sort of tool I have not heard about before. So they used the tool, and the tool gives a grade of zero and says content is not available. Then that's not a good tool, I would guess. So I would say, don't use tools that don't reflect the real thing. I would always recommend using the tools that we provide, because they show you the rendered HTML as it comes out of our actual indexing and rendering pipeline. So if we see the tags in the testing tools, then that should be fine.

Now, they posted the website that they have. I'm not going to pull it up on screen, but I had a look, and they do have the meta description and a few other meta tags twice. I would assume that you have put them in the actual index.html file, but then React, as it renders the page, creates its own set of these tags. So I would suggest that you remove them from index.html and rely on React Helmet to render these tags specifically in the components that get rendered. If you're not sure what happens there, talk to your developer, but very likely you want to remove the ones in index.html, because they are not specific to the content that is on the page if you are using client-side rendering with React.
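(As an illustration of the setup described above, here is a minimal sketch assuming the react-helmet package; the component name, title, description, and URL are placeholders, not taken from the asker's site. The page-specific tags live in the rendered component, and index.html keeps only generic markup so the rendered HTML doesn't end up with two competing sets of tags.)

```jsx
// Minimal sketch: page-specific tags rendered by React Helmet.
// Component name, title, description, and URL are hypothetical.
import React from 'react';
import { Helmet } from 'react-helmet';

function HomePage() {
  return (
    <>
      <Helmet>
        {/* These replace the duplicate tags that would otherwise sit in index.html */}
        <title>Example Store – Home</title>
        <meta name="description" content="Page-specific description for the homepage." />
        <link rel="canonical" href="https://www.example.com/" />
      </Helmet>
      <h1>Welcome</h1>
    </>
  );
}

export default HomePage;
```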
Then we have a second question: does the CSS and JavaScript formatting of a website affect SEO? I'm not sure what you mean by the CSS and JavaScript formatting, but generally speaking, use the testing tools and look at the rendered HTML. What we see there is what we actually care about. So if you feel like I'm not answering the actual question, try to give me more details for the next Hangout so that I can answer it better.

Then we have a navigation-related question, and I think that is the second-to-last question I have today; after that the audience gets their turn to ask questions. A large website has a three-level flyout menu. Three-level flyout? Wow, OK, fair enough. Second-level links are provided in the code as a href links. Good. Third-level links are loaded from a JSON file when hovering with the mouse. Aha. Right. Do I understand it correctly that the third-level links, which are packed in the JSON file, are not provided and will be ignored and not crawled if they're not linked elsewhere? Yes. Anything that isn't in the HTML as we render it — and you can check the rendered HTML that is produced by all the testing tools, for instance the Mobile-Friendly Test, the Rich Results Test, and the URL Inspection tool in Google Search Console — anything that isn't there as a link, we're not going to see. Especially if there's a user gesture involved, it's not going to happen.
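(For illustration, a rough sketch of the pattern described in the question; the class names and JSON endpoint are made up. Because Googlebot doesn't hover, links that only appear after this handler runs never make it into the rendered HTML.)

```js
// Hypothetical flyout menu: third-level links fetched only on hover.
// Googlebot does not trigger mouseover, so these links are never rendered for it.
document.querySelectorAll('.second-level-item').forEach((item) => {
  item.addEventListener('mouseover', async () => {
    const response = await fetch('/nav/third-level.json'); // placeholder endpoint
    const links = await response.json();                    // e.g. [{ url, label }, ...]
    item.querySelector('.third-level').innerHTML = links
      .map(({ url, label }) => `<a href="${url}">${label}</a>`)
      .join('');
  });
});
// A crawlable alternative is to render the same <a href="..."> elements in the
// initial or server-side rendered HTML and use JS/CSS only to show and hide them.
```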
Last YouTube question: Google is crawling URLs on the site where the only references in the rendered code are associated with a form. So there's a form that has a form action, and that's where these URLs show up. Google crawls these URLs and gets a 404 error. Why would that happen? Without a specific sample site to look into, many options come to mind. They might be in the sitemap. It might be that some other website out there is linking to these things. It is possible that somehow we run JavaScript that submits these forms to something. It can also be that we think, oh, maybe there is interesting content behind this form — I don't think we do that, but I don't know, so I'd rather not speculate. But basically, if it's a 404 URL, then that doesn't really hurt you unless you are running into very specific problems with your site. If you're not having a problem, you shouldn't really worry about it much. If you want us to not crawl these URLs, you can also block them via robots.txt; that's an option, and then we won't crawl them anymore. But generally speaking, if it's a 404, I don't see that as a problem. There's nothing there for us to index, and besides maybe crawl budget, if your site has millions of pages, that shouldn't be a problem.

Awesome, sweet. With that, I hope that was helpful for those who posted questions on YouTube. If not, feel free to post follow-up questions in the next thread. And now I'll open it up to the audience. Any questions from the audience that you would like to ask?

Hi, Xiao. Hey, Martin, how are you? I have a follow-up question. I think one of the YouTube questions touched on that a little bit, about CSS and JavaScript. I remember several weeks ago I asked you whether blocking CSS and JavaScript is a problem or not, and you said no, it's not considered cloaking. In some cases we let Google crawl them but return a robots noindex, and sometimes we don't let Google crawl them at all. The textual content itself doesn't change; they only change the user experience and some JavaScript functionality. Is it beneficial? I mean, if you don't let Google crawl them, it's not considered cloaking. But if we let Google crawl and index them, is that beneficial in helping Google better understand the page, and beneficial for SEO?

So if you're giving us a noindex, that doesn't mean that we can't crawl them. We would still crawl them; we would just not put them in the index. But I don't think we would put them in the index in the first place, so that's already a little bit of an edge case. When you block us from running JavaScript and CSS, then you are running into a situation that can potentially be problematic. For one, if there is content that somehow gets generated by running some JavaScript, then we are not going to see that content. It can also be that, as you say, you have user experience improvements, such as lazy loading that isn't using native lazy loading but some JavaScript library that does it correctly — then we might not see the image URLs unless you use the noscript workaround. So I would not block Googlebot from fetching and crawling the JavaScript and CSS files.
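(A rough sketch of the noscript workaround mentioned here, assuming a custom JavaScript lazy loader rather than the native loading="lazy" attribute; the paths and class names are made up. The point is that the image URL stays in the HTML even if the script never runs.)

```js
// Corresponding markup (placeholder paths), with a <noscript> fallback so the
// image URL is present in the HTML even when the script doesn't execute:
//
//   <img data-src="/images/hero.jpg" alt="Hero image" class="lazy">
//   <noscript><img src="/images/hero.jpg" alt="Hero image"></noscript>

const lazyObserver = new IntersectionObserver((entries) => {
  for (const entry of entries) {
    if (entry.isIntersecting) {
      const img = entry.target;
      img.src = img.dataset.src;      // swap in the real URL once the image is visible
      lazyObserver.unobserve(img);
    }
  }
});

document.querySelectorAll('img.lazy').forEach((img) => lazyObserver.observe(img));
```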
From what you said, Google doesn't index the JavaScript or CSS files. So then how does Google know what's happening on the page? So I think what might be the problem here is a misunderstanding of what crawling, rendering, and indexing are — these are three distinct things. Crawling just means we make an HTTP request and fetch the result back. If you use robots.txt and disallow crawling, that means we can't make this HTTP request, and that is a problem, because then we can't execute the JavaScript — we can't even download it. That's crawling. Then we render, which means we take the crawled JavaScript and run it in a browser, and whatever content that produces on the page that includes the JavaScript can be put in the index. And that's the third stage, where we say: here is content that we potentially want to show to a user, so we put it in our database, so to speak. Normally, a JavaScript file does not have interesting content for us to show to a user, because it's not really a website, it's just an asset of another website. So we would index the website that uses the JavaScript, but not the JavaScript file itself. We wouldn't put application.js in the index, because that's not something we would ever want to show to a user. We would crawl it, we would render it, but we would not index it.

So a robots noindex HTTP header on JS or CSS is meaningless in the first place, because Google doesn't index them anyway? Pretty much, yeah. There are certain situations where somehow, sometimes, we get it wrong and put something like that in the index, and then you can remove it from the index using that header, but it's very, very rare that that happens. So does Google index the HTML file itself in the Google index? The HTML file, yes; the JavaScript file, no. So when Google indexes a page, is it indexing the rendered DOM or the HTML? Well, the rendered DOM is pretty much what gets indexed — and actually not even that; it is even more complicated. Basically, what happens when we index is that we take the URL that we are getting from wherever — a link, a sitemap, whatever — and then we index a bunch of information about the content on the page. That information comes from the rendered DOM, but it isn't exactly the rendered DOM. I see.

Another thing I found kind of interesting is that some pages are blocked by robots.txt, but some URLs that redirect to those pages are not blocked, and Google sometimes still ranks the URLs that redirect to the blocked page. So does Google know that the URL redirects to that page, and can Google see the page that is blocked but redirected to? Wait — so you have a page A that redirects to page B, and page A is blocked by robots.txt, or? B is blocked by robots.txt, and Google ranks page A very highly, which would indicate that Google actually knows that page A redirects to page B and considers the content of page B, because they can see it? I would not expect that to be the case. But it is possible that if page A gets enough positive signals from elsewhere — if it's linked from a lot of places and generally seems to be useful for whatever reason, I don't know where these signals come from — then it might look like a good candidate to show, and we could show it even though it is just a redirect to a page that we can't crawl. Actually, we sometimes index pages that we can't crawl; even though we don't know what is on there, we might get the information about what might be on there from the context in which they are linked. So in this case, would you say that we shouldn't block page B in the first place if page B is important? If you care about page B, you shouldn't block it, yes. OK.

Going back to the JavaScript thing — sorry, I just want to finish that up. You said that if there is a content problem, it's a problem. But say there is no textual change coming from the JS or CSS, it's just user experience. I am under the impression that Google takes user experience into account, especially for more interactive pages, and in that case JavaScript and CSS would be relevant to SEO. So if we block JS and CSS and that results in an inferior user experience, will that hurt our SEO? Yeah. OK, thank you. You're welcome. Awesome, good questions. I like good questions. Do we have another question from the audience?

I'd like to ask one if that's OK. Sure. Is there any way to reset the PerformanceObserver between routes? So if you've got an SPA site, or something with client-side rendering, it begins collecting on the first page you hit, but then you go to a different route, a different page, and the numbers are kind of the sum of the two because it's technically one page. Is there some kind of method to reset it between the two, so you're getting the proper CLS, LCP, et cetera, for each page? I don't think you get an opportunity to reset it. That's a really, really interesting question, and I wonder how we would go about that, because technically, in a single-page application, the browser might not know that there is effectively a navigation happening. So I don't think there is a way to measure that — I don't think you can even use any of the other APIs or metrics for it. I mean, the scruffiest thing I've done is just timestamp it, take it from there, and reassess. Yeah, that would be an option, yeah. I think you can also use — I think it's called the User Timing API — to build your own, but I'm not sure how well that would work. Yeah, that's like performance.mark, which allows you to set marks on the timeline, so you could maybe use that. And yeah, there's no ready-made, easy solution for this that I can see.
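(As a rough sketch of the timestamp / User Timing idea discussed here — not an official reset API — one could mark each soft navigation and attribute layout shifts to whichever route was active at the time. The route-change hook and variable names below are hypothetical.)

```js
// Hypothetical per-route CLS bookkeeping for a single-page application.
let currentRoute = location.pathname;
let clsForRoute = 0;

new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (!entry.hadRecentInput) {
      clsForRoute += entry.value;                // attribute the shift to the active route
    }
  }
}).observe({ type: 'layout-shift', buffered: true });

// Call this from your router (e.g. a history.pushState wrapper or router event).
function onRouteChange(nextRoute) {
  performance.mark(`route-change:${nextRoute}`); // timestamp the soft navigation
  console.log(`CLS for ${currentRoute}:`, clsForRoute);
  currentRoute = nextRoute;
  clsForRoute = 0;                               // "reset" our own counter, not the observer
}
```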
Excellent. Do you know how that rolls back into CrUX data and what gets sent through there? Not sure, not sure. I'm not familiar with how CrUX gathers its information, its telemetry. Good question, don't know. I'm wondering if there's anywhere you could bring this up. I'm not sure if the standard is finalized — I think the standard is still pretty much in flux for PerformanceObserver — so this might be a question that would be good to take to the standards body, which is probably literally just a GitHub discussion. Yeah, the spec is on GitHub. And if you have questions about the spec, then literally opening an issue on GitHub is very likely a good way to get a feeling for what they think about this, because I'm not sure if anyone has ever thought about it. So I'll post the link to the GitHub repository where that spec is being worked on. Good question, I like it. Thank you very much, Dave.

All right, I think we have time for one or two more questions. Anyone in the audience? Now is your opportunity to ask. No more questions? Then let me reload the YouTube thread so as not to miss a question — sometimes we get a few late birds. Oh yeah, there are late birds; I think there are now nine comments, and before it was fewer.

We're updating our website to a responsive design and possibly incorporating headless e-commerce. What are the main considerations that should be taken into account from a JavaScript SEO perspective so that the Google crawler can still see your content? You can test that in all the testing tools — look at the rendered HTML. I sound a little bit like a broken record, but that's more or less what you should always do, and a surprising number of people don't do it. So definitely check the rendered HTML.

Oh, okay — someone says they can't join the Hangout, but someone else just joined. I'm not sure. I hope this isn't a recurring thing, because I heard that Hangouts apparently sometimes kicks people out above six participants, but we were at eight at some point and now we are seven. I hope this isn't a larger problem. Sorry, Clint, that you couldn't join, and I hope I answered your question reasonably well: the biggest concern, in the end, is making sure that your content shows up in the rendered HTML.

I have an a href link to navigate on my website. Good. When you click on the link, the onclick code prevents the default action, so the href is actually ignored and the onclick updates the content on the current page. Is that considered cloaking? No, unless the content is very different from what you would expect to get. And yes, we would still consider this to be a link — unless, again, you're misleading the user and doing weird stuff, having an onclick handler that overrides what the href would do is not a problem.

Okay, anyone in the audience for a last question? Last opportunity to ask. No questions? Okay, there are no further questions. Thank you very much for joining the Hangout, and thank you very much for posting all your questions on YouTube. I hope this was helpful. Let me know if you have further questions. I will post a new thread on the YouTube channel for next week's JavaScript SEO Office Hours. You can always ping me on Twitter as well. Have a great time, stay safe, take care. Thanks a lot for watching. Bye.