All right, welcome, everyone, to today's Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a Webmaster Trends Analyst here at Google in Switzerland, and part of what we do are these Webmaster Office Hours Hangouts. Looks like we have a bunch of people here already, and lots of questions that were submitted. As always, if any of you want to get started with a question, feel free to jump on in now.

I do, if you don't mind.

All right, go for it.

OK, so Google has mentioned a few times already that the top ranking factors would be good content, good backlinks, and user experience. What would you say the factors are for Google My Business and Google Maps?

I don't know how that's handled on the Google My Business side. In general, I'm also not a fan of naming top ranking factors, because there are lots of different things that can always play a role in different ways, and it changes over time. It can change very quickly. For example, if something is very important and in the news at the moment, then we do ranking very differently than if we're looking at something more as a reference topic. So it's something where I think by focusing purely on the ranking factors, it's easy to lose track of the ultimate goal, which is to provide something that answers someone's questions. And I suspect with Google Maps, the types of questions that go there are very different, where people might want to find a local store that's currently open, or phone numbers, or addresses, or routes to get there. And that might have completely different ranking factors. So I don't really have any smart ranking factors for Google My Business, sorry.

Do you think that asking your customers for recommendations could be something good to increase your positions in the search results?

I really don't know. It's something where, on our side, we separate the Google My Business side from the web search side, and we essentially see them as completely separate things. So the ranking factors there can be very different, and I really don't have any insight into how they handle it on their side.

Cool. Thank you.

Hi, John.

Hi.

I have a couple of questions, actually. One of our clients has an affiliation with a Canadian magazine. Now, they want to publish some of the magazine's content, not all of it, on their website. My client is based in Australia, and the magazine is based in Canada. So we have the rights; they gave us permission to publish their content on our website. But if we want to publish their content on our website, what is the best way to do it, to avoid any penalty from Google? Because this is almost duplicate content.

I think it kind of depends on what you want to achieve by doing that. If you're publishing it on your website just so that users of your website will be able to get to it as well, then one approach might be to put a rel canonical to the canonical source of this content. That way, your version wouldn't get indexed, their website would be a bit stronger, and the content would still be on your website.

What about if we use an hreflang tag instead of the canonical tag?

That's another approach. If you can clearly say that this is a translated version, or this is the English-for-Australia version and this is the English-for-US version, then that's something you can do as well.
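For illustration, here is a minimal sketch of the two options being discussed, using hypothetical URLs: a cross-site rel canonical pointing at the original magazine article, versus hreflang annotations that treat the two copies as regional alternates of each other.

    <!-- Option 1: on the Australian copy, point the canonical at the
         original Canadian article (hypothetical URLs) -->
    <link rel="canonical" href="https://magazine.example.ca/articles/some-article" />

    <!-- Option 2: on BOTH pages, declare them as regional alternates
         with hreflang (each page also references itself) -->
    <link rel="alternate" hreflang="en-ca" href="https://magazine.example.ca/articles/some-article" />
    <link rel="alternate" hreflang="en-au" href="https://client.example.com.au/articles/some-article" />

With option 1, only the original version would be indexed; with option 2, both versions get indexed and have to stand on their own, which is the trade-off discussed next.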
The difference between the canonical and something like hreflang is that with the canonical, you're making that canonical source stronger by combining those signals. And with hreflang, these two pages have to stand on their own. So if it's a competitive area and you've split your content into two pages, then you have two pages that you need to work on to support in the search results. It's not just one page.

Is there any chance of getting a penalty from Google for this duplicate content, if we use this approach?

No, no.

OK. My next question is, if, for example, we launch a new website and we redirect from the old URLs to the new URLs, will the page rank of the old URL pass through to the new URL, or will the new page have totally fresh page rank?

If you redirect from one URL to another, then we try to forward as many of those signals as possible. It's always a bit tricky in that some things might get stuck with the old URL, but for the most part, we really try to take everything and forward it with the redirect.

And what about if the old URL has a lot of backlinks? For example, that client I'm talking about has a lot of content, and a lot of people link to his blog posts. What about those backlinks? Will the new URL get the benefit of all those backlinks if I do the redirect?

Yeah.

And the next thing is the canonical. If I don't do the redirect, should I use the canonical? Will it work in the same way?

Similar, yeah. It essentially does the same thing, in that we try to pick one page and we make that one the primary one. With redirects, with the rel canonical, with things like duplicate content in general, we take a number of these factors together, and we use them to pick which of these URLs should be the canonical one. And that's the one that gets the signals. So if we see that redirects go to one page, that's a good sign for us. If we see the rel canonical is also pointing to that one, that's a good sign. If we see that the internal links point to the other one, for example, then it's something where we have to make a judgment call and say, well, the redirect goes here, the internal links go here, and we have to guess which one they really want to be the canonical. It's not a matter of them ranking differently in the end. It's just, well, we choose this one or we choose that one. So the clearer you can make all of these signals point to the one that you want, the more likely it will be the one we pick.

Thank you guys, Maria and Martin. Thank you.

All right, let's look at some of the questions that were submitted, a bunch of things here already.

So the first one is something I need to look into a little bit more in detail. It's someone who's been getting an extensive number of requests from Googlebot that end up being 404, and the requests look like they follow a pattern, but it's really hard to tell. I took a quick look at this on our side, and it looks like this is something that's coming in from the Google Shopping side. So perhaps there's a shopping feed associated with this website that's triggering us to try to crawl all of these URLs, and maybe that feed has an issue or a problem that we're not interpreting properly. So I need to wait to hear more from the team about that. But in general, these kinds of requests wouldn't be problematic. Even if they lead to 404 pages, we would just drop those 404 pages, and we would still crawl the website otherwise normally. So it's not that the website would be ranking worse in search.
It's essentially just that you're stuck with all of these requests in your log files that you probably don't really need. So hopefully, I can get that cleaned up.

In the last hangout, you mentioned it looks iffy if you add a noindex tag to new pages. On our site, along with our own products, we also offer affiliate products through a live API feed. We're not able to add any value to these pages, so we add a noindex tag to the individual product pages, since the same product, with the same text, is published on numerous other sites. Based on what you're saying, could this be a problem for us?

So I'm not completely sure which part of one of the previous hangouts you're referring to with regards to noindex. In general, you can put the noindex tag on anything that you don't want to have indexed. So if these are affiliate products and you know that they're not providing the value that you want to have found in search, then put a noindex tag on them. That's perfectly fine. One tricky part with the noindex is just that if you switch between noindex and index, it takes a bit of time for us to actually pick up those changes. So that's the thing I would watch out for. But in a case like this, where you have products or pages that you don't want to have indexed, put the noindex tag on there. That's what it's for.

Regarding URL structure, what would be better for rankings: having the last path segment right after the TLD, or some subdirectory? Something like domain.com/category/keywords versus just domain.com/keywords. Which would be better?

From our point of view, both of those are fine. That's something that works either way. Sometimes having a separate category makes it easier for you to track things. So it's not so much that from the Google search side we have a strong preference. But if it's easier for you to track and monitor your pages, then that makes it easier for you to recognize issues and to resolve them faster. So that's what I would tend towards, rather than purely looking at what search engines want there. And from our point of view, both of these setups are correct. I know there are numerous ranking studies out there that look at the top rankings and say, well, shorter URLs rank better. From our point of view, there is nothing hard-coded saying that shorter URLs would rank better. So use a structure that makes sense for you.

When working on large e-commerce sites and analyzing the log files, what would you recommend looking at first? Redirect loops, 404s, mobile versus desktop, time spent downloading?

I don't know. I think those are all good things to look at. With a large e-commerce site, the first thing I would look at is whether there's any way to optimize the crawling in general. And usually that's visible by looking at the URLs that are actually crawled. A lot of times, especially with an e-commerce site, you'll have different URL parameters attached in various ways, different sorting and filtering URLs. And usually when you look at that and dig in for patterns, you can see, well, I can save maybe half of the crawl by fixing one or two small problems with our URL structure. And that's something where a small change can have a fairly big impact. Because if you can reduce the number of URLs that need to be crawled for your website by such a big amount, then that can make it a lot easier for us to actually stay on top of things and crawl your other content much faster, much more frequently.
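As a rough illustration of that kind of fix, a few robots.txt rules can often cut out whole classes of sorting and filtering URLs. This is a minimal sketch, assuming hypothetical sort=, filter=, and sessionid= parameters:

    # Hypothetical example: stop crawling of sort/filter URL variations
    # so crawling concentrates on the canonical category and product pages.
    User-agent: *
    Disallow: /*?*sort=
    Disallow: /*?*filter=
    Disallow: /*&sessionid=

The exact patterns depend entirely on the site's URL structure, which is why digging through the logs for patterns comes first.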
So that's what I would look at there first. I think redirect loops are another thing that can have a big impact on your site. But usually that's something that users would see too. So if there were lots of redirect loops within your website, you would probably get complaints from users and wouldn't need to dig into your crawl logs to find out about it. But I know there are also a lot of really cool tools out there that analyze log files. I think it's really fascinating to see how Googlebot and other search engines crawl a website, and it can really give you a lot of information on where you can focus your energy, to put a little bit of work in and have a really big impact when it comes to crawling.

If someone disavows all backlinks of a website completely with the disavow tool, and then a year later they resolve or clean that up, does Google reconsider those links which were in the disavow file?

Yes, absolutely. When we look at the disavow file, we look at the current version. When we crawl the rest of the web and we see that there are URLs in there that are disavowed, we'll drop those during that crawl. If the next time around they're not in the disavow file anymore, then we'll pick those up and use them normally again. So that's something where, if you fix these issues, it will absolutely be reflected.

Can you confirm whether en-EU is now officially or unofficially supported in hreflang?

Oh, my gosh. So in the hreflang tags, you can specify the language, and then a dash, and then the country code. And some sites are somewhat creative, or take an easy approach to that, and that includes our sites as well, in that often they'll have a CMS behind the website that has, on the one hand, the normal country codes that they use within the CMS, and on the other hand, some custom country codes that they also use. This is something that I believe on Google's side we do as well, or at least sometimes still do, where maybe we'll have a Spanish version for Latin America, and in that internal system it's something like LA for Latin America, while the rest of the country codes are perfectly normal. But for Latin America, we just group them all together because, I don't know, maybe that was easier at some point. So when we generate the hreflang files or the link elements for these, in the past we included something like es-LA, which from our point of view is actually Laos. So it's not really Latin America; it wouldn't really work. And we also have other codes that are not valid at all. And this is something where I wouldn't say that the webmaster is doing it wrong. It's just, well, the previous system was set up in that way, and the hreflang picked that up. So I don't really fault sites for running into this problem. I can see where it comes from. But in practice, what happens is we would flag these wrong country codes as issues in the hreflang report. So if you look at the hreflang report, you'll see, oh, you're using country code, I don't know, Asia, for example, to try to cover all of Asia, which probably doesn't make much sense. But maybe you're doing something like that. So we would flag that as a wrong country code in the hreflang report. But what I noticed is that if you include EU as a country code, we tend not to flag it as a wrong country code. So the question is, if Google doesn't flag it as a wrong country code, does that mean it actually works?
Because maybe it would make things easier for sites that target English-speaking users within Europe. And looking into this a little bit deeper, it looks like we don't flag it as an error. But on the other hand, we don't map any users to that kind of country code, because it isn't really a country. So we would pick that up and put it in our hreflang database, essentially, but no users map to EU, because they map to the individual countries. Therefore, it's not actually used. So that's where I think some of the confusion came from here: on the one hand, we don't flag it as an error; on the other hand, it's really hard to tell what is happening. So I wanted to get to the bottom of that, to see what was actually happening there, rather than just blindly saying, well, that's a wrong country code, you should use something correct. So, long story short: you can specify this, but it has absolutely no function. If you want to target individual countries, you really need to do it the way that it's documented.

I received a notification that my HTTP website is now on mobile-first indexing. Does that mean the HTTPS version of the same site is automatically also on mobile-first indexing?

No, not necessarily. We could be picking these up separately. I believe, in general, mobile-first indexing is something that we do specifically on a per-site level, which includes the protocol in this case, and also www versus non-www. For the most part, if we know that a website is on HTTPS and it's one of those sites that we switched over, you probably will have received that message as well. I know for my sites that I have verified in multiple ways, I received that message for pretty much all of the variations. So it's something that we send out individually.

Does that mean I have to do anything special to get mobile-first indexing for the other version as well?

No, you don't. Over time, as our algorithms re-evaluate the web and see that your HTTPS site is ready for mobile-first indexing too, we'll switch that over and notify you about that as well.

A question about the indexing of product category pages of online shops. Sometimes Google uses information about the number of articles, like "articles 1 through 10 of 30," in a search snippet. Why does Google consider this valuable information for the user? Wouldn't the snippet be more valuable and contain more information if Google just let the webmaster use all of the characters in the snippet?

I don't know. I'd probably need to look at individual cases and discuss that with the team. But one thing to keep in mind is that we don't have an explicit limit on the number of characters that we use for the snippet. Sometimes we use more; sometimes we use less. Sometimes we use more of the structured data and show that as rich results information in the search results page as well. All of this can vary over time. It can vary by query, by user, by language. So it's rarely the case that we would say, well, this snippet is exactly like this, and we will always include this part here. Actually, I don't think that's ever really the case. But if you have specific examples where you see that, for the query you're doing, the snippet that's provided is really unhelpful for a user, then I would love to see those, so that I can take them to the team and try to find ways that we can improve our algorithms.
Let's see: how are sites prioritized in the second wave of indexing, when JavaScript-based pages come to be rendered? A few people have mentioned that perceived importance might be something that could move a site up or down in the rendering queue. Is this the right way of thinking about it? And if so, what plays into prioritizing sites and pages for rendering?

So maybe just to take a small step back, what this is referring to is: if a page requires JavaScript to be rendered for indexing, then what happens in our systems is, first, we crawl the HTML version and we index the HTML version. And then we realize we need to do the JavaScript part here as well. We need to render the page normally, like in a browser. So we'll send it off to our rendering system, and at a later stage, the rendering system will render that page and send it back to indexing as well. So that's the second stage of indexing, where first we look at the HTML, then we do the rendering, and then we bring that back into the index. And the timeframe between crawling and indexing the HTML and getting the rendered content can vary a little bit. I don't think we have any explicit parameters where we would say we treat these pages as more important and needing rendering faster than others, apart from the way that we do it for general crawling and indexing, in that some pages we think we need to get into our index fairly quickly, and so we'll try to get those into the index fairly quickly. So I don't think we have any explicit, separate factors that would affect how we prioritize the time between crawling and rendering.

We saw a massive link building attack by a competitor using Xrumer and other black hat tools. Their main target was one of our pages. They built a lot of links all over the web with senseless and spun content, but they linked with one of our keywords as the anchor text. So we lost a lot of traffic. We built a disavow file, but we don't see the traffic coming back. What can we do to tell Google that this is not something that we did?

So, first of all, I took a look at the page and the site in general, to see what was actually happening there, and whether there was actually a problem. And I don't see any evidence that these links were ever really taken seriously by our algorithms. So on that side, I suspect focusing on those links alone is not really the strategy you'd need to improve things. Also, for the most part, we've probably ignored those links already anyway, especially since you mention things like Xrumer and the general black hat tools that are used to drop links and create spun content. These are all things that we've seen hundreds of times in the past, and our algorithms are pretty good at recognizing that and just ignoring it. So that's something where I wouldn't panic when you see something like this. I think putting it in a disavow file is a perfectly reasonable approach here, because that way you know that Google's systems are not going to take it into account, even if they were to get tricked by this black hat spun content stuff. So I think the disavow file is a good move. It certainly helps to ease the worries that these links might be taken into account. But I would not focus purely on the disavow file or on these links, but rather think about what you can do in general for your website, to try to find a way to significantly improve it.
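For reference, the disavow file mentioned here is just a plain text list of URLs or domain: entries, one per line. A minimal sketch with placeholder domains:

    # Links from pages identified as part of the spam attack
    # (placeholder domains; one URL or domain per line)
    domain:spammy-links.example.com
    domain:spun-content.example.net
    http://another.example.org/spammy-page.html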
And obviously, that's a little bit harder, because significantly improving a website is not as easy as just running a link checker and disavowing those links.

We saw a huge drop in SEO visibility on google.ch in February, and huge improvements on google.be for similar rankings. It's a Swiss company in B2B. No major changes were made. What could be causing this kind of drop in one country and rise in another?

It's really hard to say without looking at the site itself. So if you want to send me the site through Google Plus, for example, I'm happy to take a look at it. In general, though, it can happen that a site becomes more relevant in one country and less relevant in another. I think that's completely normal. It's also always the case that just because a website is very popular in one country, that doesn't automatically make it popular in other countries. So I would treat that a little bit cautiously and not assume that there's something really weird happening there. This might just be a normal move with regards to the visibility of a website, where we think, well, maybe this is targeting more German users, so we'll show it to more German users rather than more Swiss users.

I tested a noindex directive in the robots.txt, and it's not working. I submitted the URLs in Search Console, and they got indexed. I know it's not a standard method, but did Google drop support for this, or are there any updates?

So, like you mentioned, it's not officially supported, so changes there can happen at any time. I'm not aware of us making any changes there, and I think if we did change anything significant in that regard, then lots of other sites that have been using the noindex robots.txt directive would probably see significant changes too. But just in general, I would not rely on something that is not documented, and clearly not officially supported, for significant crawling or indexing changes within your website. So if you want to try this directive out and say, well, this is a really neat way to solve this one particular problem that I have, then I would always make sure that you have a backup, for the case that this directive ends up not working anymore, or maybe it doesn't even work now, I don't really know for sure. But in the case that this unsupported directive doesn't work, you should at least have a backup that does pretty much the same thing as what you wanted the other method to do. So that's what I would recommend doing there.

How do you handle split testing when the A/B test is generated with JavaScript only? In this case, blocking the JavaScript seems like the best way not to confuse search engines.

In general, with A/B testing, we recommend that you treat Google as any other user that comes into the test. So ideally, we would also see these A/B tests. Some people do A/B tests by country, or by different browsers, or create a hash based on other factors and then split users into different groups. That's something that I think makes sense in a case like this, because that way you're not pushing users from one version to the other accidentally. And with Googlebot, essentially what would happen in that case is we would just see one of those versions, which is perfectly fine, because ideally those versions would be equivalent with regards to search anyway. So that's generally not a problem. We just recommend doing this kind of A/B testing for a limited period of time.
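As a sketch of that kind of deterministic bucketing (the hash function and the 50/50 split here are just illustrative assumptions, not anything Google prescribes):

    // Assign a visitor to a stable A/B bucket from an identifier such as a
    // cookie value, so repeat visits (and Googlebot) always see the same version.
    function abBucket(visitorId) {
      let hash = 0;
      for (const ch of visitorId) {
        hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit rolling hash
      }
      return hash % 100 < 50 ? 'A' : 'B'; // illustrative 50/50 split
    }

    console.log(abBucket('visitor-12345')); // same input always yields the same bucket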
If you're doing a bunch of different tests one after another, that's fine. Just don't set up a situation where you have an A/B test that's running for years and years, and there are just these two versions, and half of the users go here and half of the users go there, and you can't really tell ahead of time, as a user, which one of these versions you'll end up seeing. So I'd avoid that situation, but otherwise, letting Googlebot see the A/B test is a good idea. If you want to block the JavaScript because you think that the A/B test would cause problems by being rendered, that's also a reasonable approach.

What's the best practice if a large-scale website decides to remove a section of thousands of pages?

So I think there are two aspects here. On the one hand, if this is a section of a website, if you can say this is clearly a whole subdirectory that needs to be gone, then you could use the URL removal tool for this. You could use a directory-level removal and say everything under this subdirectory should be removed quickly. That's one way to deal with it. That usually results in those URLs dropping out of our search results within less than a day, something like that. So that's a really fast way to do it. In the more general case, where you just have a lot of different URLs that you need to remove, and they don't fall into any particular category or directory structure, or you have other URLs in those directories that you want to keep, you essentially just make them 404, or put a noindex on the pages, and then tell us that you've changed those pages recently. Telling us that you've changed those pages is easiest done with a sitemap file. So you could, for example, set up a sitemap file, put all of these URLs in there, and set the last modification date to the date when you put the noindex there or made them 404. With that, we know about these URLs, we know that they recently changed, and our systems can prioritize crawling them. And if we crawl those URLs and we see that they're no longer valid pages, we'll drop them out of our index. This isn't something that happens from one day to the next; it essentially takes time for us to re-crawl everything and to prioritize re-crawling all of those removed pages. But especially on a large website where you don't have a clear pattern, that's the approach that you'd need to take.
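A minimal sketch of that kind of sitemap file, with hypothetical URLs and the lastmod set to the date the pages were removed:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- Hypothetical removed pages; lastmod = date they became 404/noindex -->
      <url>
        <loc>https://www.example.com/old-section/page-1</loc>
        <lastmod>2018-04-01</lastmod>
      </url>
      <url>
        <loc>https://www.example.com/old-section/page-2</loc>
        <lastmod>2018-04-01</lastmod>
      </url>
    </urlset>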
A hypothetical site sells to the UK, Ireland, the US, and Australia. The product is identical; however, the site owner really wants to make sure that users only see their local price. (As a Swiss user, I always find this frustrating, because usually it ends up being that Swiss people pay the high price and everyone else gets a low price. But anyway, back to the question.) In the past, you mentioned that if the currency and price are the only difference, splitting the site might not be the best solution. To me, it seems more like a sledgehammer, combined with the need to IP-redirect to keep users within their pricing. I can see a world of pain ahead.

So yeah, I think there are those two approaches, generally. On the one hand, if you have separate landing pages for each country, with their own pricing on them, those landing pages would get indexed, and with the hreflang we'll try to show the right version, but it's possible that a user ends up on the wrong version as well. And because of the way that we do crawling, you wouldn't be able to redirect all users that currently map to the US over to the US version if they look at, say, the UK version or the Irish version. The reason for that is that we primarily crawl from the US. So if you always redirect US users who go to the Irish version over to the US version, then we would never be able to index those local versions at all. So the redirect is something that I would tend to avoid. Having separate versions, like I mentioned, means that they have to be indexed separately, which also means that they have to be able to stand on their own. So if you take one strong page and you split it up into these five different variants for different languages or countries, then you're diluting the value of that page a bit, which makes it a bit harder for them to rank. So as much as possible, I'd try to keep that together. However, if you keep that together on one page, then even though you might be showing different currencies and numbers to users, the problem is, again, that Googlebot crawls from the US, so we'll probably only see the US pricing there. Maybe that's OK with you. What could happen is that users see the US pricing in the snippet, they click on the result, and they see their, I don't know, UK pound prices instead. Maybe that's acceptable for you, maybe not. It's something where you have to make a call almost from a business point of view: which of these compromises do I want to get into? With regards to rich snippets, rich results where we can pull out the prices and show them more prominently, there we would also only support one currency. So we would probably just show the US version, since we're crawling from the US and would see the US version on the page. So there, again, if you're using the markup to highlight the price, then we would probably pull that price out and show it to everyone even more visibly. So somewhere in between, there's perhaps a compromise that works for you. What I've also seen is that some sites set the pricing with JavaScript and then block that JavaScript from being crawled completely. That might also be an approach. That way, we would be able to crawl the normal part of the page, all of the content on the page; we would just never see the price. So if someone is explicitly looking for the price, we wouldn't be able to find it, and if you want the price rich result on your page, we wouldn't be able to pull that in. But that might let you get away with having a single URL that automatically adjusts the price based on the user's location.

There are multiple ways to deploy hreflang. Which option is better, and why? What's your recommendation between subdomains and subdirectories for multilingual sites?

Oh, boy, the subdomain versus subdirectory question. Luckily, this one is limited to multilingual sites. From our point of view, both of those work. Some people use subdomains; some people use subdirectories. When it comes to hreflang in particular, where you have content in multiple languages, you can also use parameters in the URL. Anything that differentiates the individual URLs would work for hreflang. When it comes to geo-targeting, where you target specific countries, there we really need a clear section of a site. So we need something like a clear subdomain or a clear subdirectory to actually do geo-targeting.
But hreflang, the language-targeting side, can be set up in any way that you want. Usually, I recommend thinking about what else you want to achieve by having localized content, and how you want to track and monitor it. Sometimes it's easier to have localized content in a subdirectory or on a subdomain, because then you can look at analytics and just filter by subdomain, for example, or filter by subdirectory, and you have that information right away. Whereas if you have it somewhere in the parameters, or somewhere at the end of the URL, it's a lot harder for you to drill down and see which of your language versions are actually most popular, which ones you need to work on, those kinds of things.

What causes a site to be temporarily unreachable in Search Console? What causes a site's crawl stats to flatline in Search Console?

So "temporarily unreachable" generally means that we're seeing a 503 result code, which is usually something that the server or the network side sends to us to say that this website is currently not available, and we should try again a little bit later. On our side, what happens when we see that is we'll slow crawling. If we see that for the robots.txt file, we'll stop crawling completely, because we don't know what we can crawl and what we can't crawl. If we see that for other URLs, we'll slow crawling a little bit, because we're worried that maybe our crawling is causing this 503 error. Over time, when that 503 error goes away, we'll speed up crawling again. All of that is done fairly automatically. If you're seeing that your crawl stats are flatlining in Search Console at the same time that you're seeing these errors, then I suspect we're still seeing a lot of 503 errors for that website, and we feel limited with regards to what we can crawl. So what I would do there is take those URLs, and the rough dates that we show in Search Console, go to your hosting provider, and figure out what is actually happening with these requests. Why would Googlebot see a temporarily unavailable or temporarily unreachable error? Is there perhaps some kind of DDoS protection switched on that's a little bit too aggressive, blocking Googlebot along with the real DDoS attacks? All of those things are aspects that you might want to look at together with your hosting provider. And especially when it comes to these types of result codes, we don't really see what is happening on your side, because we just see the end result: we send you a request, and you don't send us an answer, or you send us an error code. We don't see what the thinking of the website, and the network in general, was when sending that result code back. And sometimes it's as simple as some Googlebot IP addresses being blacklisted automatically by some kind of trigger-happy protection system, which, from our point of view, I can understand, because maybe we do crawl a lot. If you see that we crawl too much, what I would recommend doing is reducing the crawl rate that you allow within Search Console, just to make it clearer to us that, hey, you're crawling way too much of my website, and I'd prefer it if you backed off a little bit.

Are noindex and 404 pages kept in the scheduler for retrying, for as long as they're stored in Google?

I don't think you could say it quite like that. What usually happens is, let's see.
So assume a page used to exist, and now it turns into a noindex or 404 page. What generally happens is we keep that URL in our systems, and we say, well, this is a 404 or noindex page, we don't have any content, we can't show it in the search results. But from time to time, we'll retry it. So it's not that it's waiting in the scheduler to be retried, but rather, from time to time, our systems will think, well, maybe we should double-check these old pages and see if there's really nothing useful there, and we'll retry them again. And after some time, we might drop those pages completely and say, well, we haven't seen anything useful here for a really, really long time, maybe 10 years, maybe longer; perhaps we can essentially garbage-collect this URL in our systems and not even have to worry about it anymore. But those are things that are internal to Google, which you usually don't need to worry about. Sometimes we'll crawl 404 pages that have been 404 for a really long time. Sometimes we'll see a link to a page that has never existed, and we'll crawl it, and we'll see, oh, there's a 404, and maybe we'll retry it if we see more links to it. But in general, this wouldn't negatively affect the rest of your website. So having 404 pages doesn't push the other pages out of the index or reduce their visibility in search.

After I set "no crawl" in the URL parameters settings, selecting some query parameters, will it affect only the new URLs, or the old URLs as well?

So if you change the URL parameter handling settings, it only affects the new crawls. The new ones, as we move forward from there, will take that into account. Also, the URL parameter handling settings are not a strong directive in the sense of a robots.txt file that tells us never, ever crawl these URLs. Rather, we'll still sample those URLs from time to time, just to make sure that you didn't get confused by the UI, which, admittedly, is kind of confusing, and that these are really URLs that shouldn't be crawled at all.

How will the mobile-first rollout change the search results?

Ideally, not so much. Apart from the mobile content being used for the snippet, and for indexing, and for ranking the pages, for a normal user looking at the search results, it shouldn't really change that much. I think the most visible effect would be that if they're using a mobile phone to search, then the snippet that we show in the search results is much more likely to match the content that they would actually see on the page with a mobile device. And mobile devices are obviously everywhere nowadays.

In the new Search Console, if "submitted" and "indexed" had both been clearly mentioned, it would have been much more convenient for us.

I don't quite know what you mean there. But I know the Search Console team is reviewing all of the feedback that they're getting on the new Search Console. So if there's something specific on a specific report where you're thinking, this is confusing, or it would be helpful to have this information here as well, I would definitely submit feedback in Search Console.

What signals and hints would Google recognize if a website serves locale-specific content, so, different content on the same URL based on the user's location?

At the moment, we do this in an extremely limited way. For the most part, like I mentioned, we crawl from the US. That's where most of our data centers are. That's where most of the Googlebot IP addresses map to.
And I think that mapping of the Googlebot IP addresses to locations is somewhat artificial anyway. But in general, most of these IP addresses, in most IP location databases, map back to the US. There are very rare cases where we do crawl some content in some countries with a local IP address. That's particularly the case for countries where we've seen that US IP addresses tend to be blocked much more frequently. I believe South Korea is maybe one of those. But it's really rare. So it's not the case that we would take one URL and crawl it from multiple countries to see what kind of content is there. Rather, we would take that one URL and crawl it with, maybe, only IP addresses from that country, or only IP addresses from the US. It's not like we would crawl one URL in multiple ways and store that content separately; it's essentially just picking that one location that we crawl from. So if you are using the same URLs and serving different content based on the user's location, then most likely we would not pick that up, and we would not be able to index that content. I'd really strongly recommend that, instead, you use something like hreflang to let us know about the individual versions, let us crawl the individual versions, and then we can show the version best fitting the users when they're searching. So that's what I would recommend doing there. The same thing applies to language. It's very common, or used to be common, for sites to install fancy plugins that try to recognize the language of the user's browser and swap all of the content for localized versions, and that has the same effect for us. We would crawl from the US, so we would only see the US content, the English content. If the content is also available in French for users with a French browser, then we would never know about that, we would never be able to index the French version, and we would never be able to bubble this site up in the French search results. So ideally, really make sure that you have one URL for one piece of content. If you have that content in different languages or for different countries, make sure each version has a separate URL.

Let's see, an issue with Google de-indexing, I think from a forum thread. I took a quick look at this yesterday, or the day before, and it looked like things were settling down again normally. So I believe that should be OK now.

What's the difference between hybrid rendering and dynamic rendering? Are hybrid rendering and an isomorphic application the same?

OK, lots of big, complicated words. So let me see where to start. The general problem that both of these are trying to address is that you have a JavaScript-based website that shows content through JavaScript, and you have crawlers, or users, trying to access the page who don't support JavaScript, or don't support it that well, and you'd like to serve them a static version. With hybrid rendering, which I think is probably one of the approaches that will be more relevant in the future, what happens is that the first time any user comes to a page, they get a static HTML version, because that's something that can be cached, something that can be served very quickly, and something that requires minimal CPU resources on the user's side to actually load. That works for Google as well, because we would crawl the page and see a static HTML version.
We can index that immediately. And as a second step, when the user is on this page, if they click on anything, it switches over to the JavaScript version of the website. So that combines the best of both worlds, in that search engines have the content immediately, users see the content immediately, and yet users still have the ability to go into this interactive mode where everything is handled with JavaScript.

Dynamic rendering takes a little bit of, I'd say, an easier approach, in that doing the switch between a static version and the dynamic JavaScript version is very tricky, and it is very tricky to do well. So the approach with dynamic rendering is to dynamically recognize the user agent that is coming in. If it's a crawler like Googlebot, or some social media service, it serves a static version to that crawler; whereas if it's a user, it just shows the normal JavaScript version. That lets you avoid the difficulty of having to switch between a static version and a JavaScript version while the user is on the page. Instead, you only serve the static version to one set of users, like search engines, and you only serve the dynamic version to normal browser users that can support it. So those are the differences there.

Hybrid rendering and isomorphic are, I believe, the same. I believe it's called isomorphic JavaScript, and that's the name that the folks from the React framework gave it. On the Angular side, I believe it's Angular Universal. These are essentially just different names for trying to implement the hybrid rendering setup. I'm pretty sure there are other names as well.

Let's see. Yeah, I think that was pretty much the question: what the difference is, and what would happen there. From our point of view, both of these approaches work well for search, because with both of them, we see a static version that we can index immediately. So we don't have the extra time that's needed to reprocess and render the page to actually get to the content. So I think both of these are OK. Which of these approaches you take depends a little bit on the capabilities of the setup that you're using. I believe dynamic rendering is often a lot easier and a lot less prone to errors, because you don't have to worry about this mixed state of someone having a static page and then shifting over to the dynamic version. But if you can do that in a clean way, then that's probably a really nice way to get the win of the initial static serving for all users.
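A minimal sketch of the dynamic rendering idea, assuming a Node/Express server; renderStatic() and serveJsApp() are hypothetical helpers, one returning pre-rendered HTML and the other the normal JavaScript application shell:

    // Serve pre-rendered HTML to known crawlers, the JS app to everyone else.
    const express = require('express');
    const app = express();

    const BOT_PATTERN = /googlebot|bingbot|twitterbot|facebookexternalhit/i;

    app.get('*', async (req, res) => {
      const userAgent = req.get('User-Agent') || '';
      if (BOT_PATTERN.test(userAgent)) {
        res.send(await renderStatic(req.path)); // static HTML for crawlers (hypothetical helper)
      } else {
        res.send(serveJsApp()); // normal JavaScript version for browsers (hypothetical helper)
      }
    });

    app.listen(3000);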
Let's see, a few questions left. Let me see if I can run through them briefly.

I'd like to use the video tag on our web pages as a design element. How can I put it on my page, but not signal to Google that it should be a video play page?

So there are two things I believe you can do there, or three things. One is that you could block the video file with robots.txt, so that we can't crawl and index the video file. That's probably a pretty easy way to do it. Another is that you can do the same for the thumbnail images that you have specified there, which also tells us we can't use this as a video page. And the third thing is to use the video sitemap extension to tell us about the video tag you have on your page, but to explicitly tell us that this video is expired and we shouldn't index it. So there's an expiration date that you can specify in a video sitemap that tells us this video is no longer available or no longer relevant. Those are the three approaches I would take there. I see video tags on web pages more and more; I think it's a great way to get animations into a page as a design element. So I would definitely look into these options to see which one would work for your specific case.
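A rough sketch of that third option, a video sitemap entry with an expiration date in the past, using hypothetical URLs:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
      <url>
        <loc>https://www.example.com/page-with-decorative-video</loc>
        <video:video>
          <video:thumbnail_loc>https://www.example.com/thumbs/decor.jpg</video:thumbnail_loc>
          <video:title>Decorative background animation</video:title>
          <video:description>Design element only, not a video page.</video:description>
          <video:content_loc>https://www.example.com/media/decor.mp4</video:content_loc>
          <!-- An expiration date in the past signals the video shouldn't be indexed -->
          <video:expiration_date>2018-01-01T00:00:00+00:00</video:expiration_date>
        </video:video>
      </url>
    </urlset>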
Then there's a lot of information about a website that had a problem, which I believe you also posted in the forum. I passed that on to the team that's working on this here. I haven't heard anything back from them yet, but I imagine they'll find a solution, or something that we can pass back to you, and maybe post in the forum thread as well. I also saw the top contributors in the forum have escalated that to us too. So it's reached us in a number of ways.

I noticed that Google updated all of the international help pages.

Yes, we added a bunch more information there to make it a little bit clearer, hopefully. If there are still parts that are confusing, feel free to let us know. We can update those as well.

We have a large server which, unfortunately, has become too expensive for us based on a downturn in traffic. Through word of mouth, a number of sites are willing to share our server, but they're far less reputable than us. Will letting them share our server, to help defray our costs, affect our SEO?

Using a shared server is perfectly fine. Shared hosting is very common on the web. That's absolutely no problem. So I would not worry about that at all.

OK, wow, we made it to the end, I think. Let me just see. Nothing more coming. Oh, here it is. Yeah, I think that's it. Cool. All right. What else is left from you, Mihai?

I have a question, John. Actually, I think it's a Search Console bug. I'm not sure if that's the case. Let me try to share the screen too. OK, so this is my robots.txt file, if you can see it right now.

Kind of, yeah.

And let's say I have this line, which is a disallow, and it uses a colon: something, colon, something. And this is the live one, just to be clear. So let's say I have a URL like test.txt, and there's a parameter that might use a colon instead of the equals sign. So if I test it, it says blocked, which is fine. That's the rule. If I use the encoded version, which some platforms may use when creating those URLs, and I test it, it says blocked as well, which kind of makes sense. But if I go to Fetch as Google and I fetch the encoded version, it doesn't actually get blocked, while the colon version, the non-encoded version, does get blocked. So it seems that the robots.txt tester respects that rule for both the normal and the encoded version, but Fetch as Google does not. And it seems that Googlebot, in fact, does not treat the encoded version the same as the non-encoded version when it comes to crawling.

I have no idea. I'll need to double-check on that. I think one of the tricky parts there is that, depending on which part of the URL you're looking at, doing that kind of encoding is either valid or not valid. I'm not sure about the specifics, though. So I need to double-check with the Googlebot team on that.

So I just showed you this test because I can't actually show the client's site. But the client's site did have that rule. And all of their parameters... it's an e-commerce website, so it has a lot of filter pages. All categories have filters. And instead of the equals sign, they use the colon sign. And they thought they had blocked all filters, which is what they wanted to do with the colon rule. But then they got the notice that mobile-first indexing had been enabled for their site, and the crawl stats went up from 5,000 pages to 700,000 pages, and they noticed a lot of those URLs were being indexed. So this is how we reverse-engineered it and got to the tester, and saw that they are indeed not being blocked, because we also see in the access logs that Googlebot does access them.

Yeah, I don't know. So I need to double-check whether, technically, it's valid to use encoding in the URL parameters like that. I believe there's a difference between, obviously, the domain name, the normal file name part of the URL, and the URL parameters, everything after the question mark, with regards to how the character sets are handled. But I don't know offhand if that would affect something like the colon. That seems like something that we should at least have consistent across those tools. So I'll double-check on that.

And what I've noticed is that if you add a disallow for the encoded version of the URL and you test it in the tester tool, it doesn't work. If you use the exact same %3A in the disallow rule, and you test a URL with that %3A in it, it doesn't show as being blocked.

Yeah, I don't know. I suspect it comes down to where we're allowed to do the encoding and decoding part. But I don't know offhand how those three parts of the URL are actually parsed separately there. Sounds like a fun edge case. Sounds like you had to dig in to figure that one out. I think that general rise in crawl activity when we switch a site to mobile-first indexing is to be expected. That's something that we do in general with sites when we switch them over, because we want to make sure that, as quickly as possible, we have the full mobile state in our index, rather than a mixed state of some desktop, some mobile. So for the most part, you should see that kind of jump in crawling. But seeing us crawl and index the wrong pages seems like something that we should make at least a little bit consistent with the testing tool, to make it easier to figure out what is allowed and what is not allowed.

Right. They only have around 10,000 to 15,000 URLs, I mean product URLs, categories, and things like that. So when we saw 700,000 URLs being crawled per day, it was kind of scary. And also, we didn't see that in the access logs; we only saw maybe 15,000 to 20,000. But the crawl stats show 700,000. Is there a reason why we might see that there but not in the access logs?

I don't know. That should be pretty consistent.

Yeah. Weird. OK. Maybe we're not logging things correctly. I don't know.

I'd love to just put the blame on you, but I do want to double-check what is actually happening first.

Yeah. Well, what we do know is that Googlebot is crawling those URLs that we thought were being blocked by that rule. So it's not treating the encoded version as equivalent.

No. I think in general, if you have these kinds of key and value pairs in the URL parameters, I would strongly recommend doing it with the standard setup, because that's something that we can pick up and extract in our systems automatically, where you can do the URL parameter handling stuff, and we can understand, well, these parameters can be interchangeable in location, and things like that. So I'd strongly recommend not coming up with a custom scheme for that.
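To make the edge case concrete, a sketch of the rule and the URLs in question, with placeholder parameter names, as reconstructed from the conversation above:

    # robots.txt rule using a literal colon in the parameter (placeholder pattern)
    User-agent: *
    Disallow: /*?filter:

    # URL with a literal colon   -> blocked by both the tester and Googlebot
    https://www.example.com/test.txt?filter:red

    # Same URL with the colon percent-encoded as %3A
    #                            -> the tester reports it as blocked,
    #                               but Fetch as Google still fetched it
    https://www.example.com/test.txt?filter%3Ared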
But still, we should be able to catch this a little bit better.

Well, unfortunately, it's an enterprise e-commerce platform, so changing things around is...

I know. Yeah, those are always fun.

I'll send you some more details.

Cool. All right. Fantastic. Cool. All right, so let's take a break here. It's been great having you all here. Lots of good questions, and lots of good questions submitted as well. I wish you all a great day, even if it's Friday the 13th, and a good weekend, too.

You too. Thanks, John. Ciao. Thank you. Bye-bye.