All right. Welcome, everyone, to today's Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a webmaster trends analyst at Google in Switzerland. And part of what we do are these Office Hours Hangouts, where people can jump in and ask their questions around their website and web search, and we'll try to come up with an answer. A bunch of stuff was already submitted on YouTube, but if any of you want to get started with a first question, you're welcome to jump on in.

John, I have a couple of questions regarding internal linking.

OK.

So one would be, let's say you have an architecture of categories that you use in your main navigation, and it works one way on the desktop side. So for example, you might have some dropdowns, some mega menus, things like that, where you hover with your mouse and the dropdown shows, but you can also click on the link and go to that top category if you'd like. However, on mobile, since you're kind of limited in both interface and functionality, maybe that top category is still there with an HTML link, but when you tap on it, it just lowers a dropdown or something like that. So my question is whether Google treats the fact that, well, there's a link there in the navigation, but when users tap on it, it doesn't actually go to that page, it just launches a dropdown, even though there's an HTML link. So the behavior of the link is not actually acting like a link. Does Google treat those in any way?

I guess it depends on how that is modified on mobile. Because if we render that page and essentially that HTML link is swapped out with a JavaScript event, and that HTML link is no longer there on the rendered page, then we might not see that internal link. But if the link element is still there, if it's just something like a span on top of the link element that catches the click and does the JavaScript fanciness, that's perfectly fine.
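One way to picture this: as long as the `<a href>` element survives in the rendered HTML, a parser that simply extracts anchors will still find the link, regardless of any span overlay that intercepts the tap. A minimal sketch (not Google's actual rendering pipeline, and the markup is hypothetical):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> element found in the markup."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Hypothetical mobile markup: a span catches the tap and opens the
# dropdown via JavaScript, but the HTML link element itself remains.
mobile_nav = """
<nav>
  <span class="dropdown-toggle"></span>
  <a href="/category/women">Women</a>
</nav>
"""

parser = LinkExtractor()
parser.feed(mobile_nav)
print(parser.links)  # ['/category/women'] -- the link is still discoverable
```

If the mobile template instead replaced the `<a>` element entirely with a span plus a click handler, the extracted list would be empty, which matches the distinction described above.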
OK, so there's no worry that it's not taken into account or anything like that in terms of crawling. OK, a second question is kind of related to how there's been this information that when Google sees multiple links going to the same page on a given page, it only takes into account the anchor text for the first link it sees on the page and kind of ignores the rest. I'm not sure if that's exactly true. So for example, if you have a link in the menu, but then you write an article where you reference the same page with a different anchor text, does Google understand that and see, well, it's in the main content, so even if you have it in the menu, I'll take a look at that anchor text in the main content because it seems more important? Does that play a role?

I don't think we have that behavior defined like that. So it can go either way. And it can be that we take the multiple links that we find and we combine the signals from that. So that misconception, that if you have multiple links to the same page on one page, then you need to make sure that the most keyword-rich anchor text is the first one on the page, that's not something that's taking place. So from that point of view, it's not that you need to artificially tweak the order of the links on a page. But it's also not the case that we have one defined way that we always treat things when we find multiple links on a page.

OK, I was mainly asking for e-commerce websites. They usually have a big menu with categories and things like that. They don't have a lot of space to use very keyword-rich anchors in the menu, so you just have "women" or "men" or anything like that. But in the content, let's say you have a blog, and in those articles you reference those pages with more relevant anchor text. So I was wondering if Google could also take those anchors into account.
No, from my point of view, we could take those into account. It's just not clearly defined that we will do it exactly one-to-one. I think for the most part, with normal websites, there are so many different links going to every page with internal linking that it's not critical which anchor text you use from this one page to this one other page, because we have so much other information from the rest of the site.

OK, last one. So let's assume that certain pages are linked from multiple sections that are site-wide. Like you have one in the main navigation, but you also have one in the footer, let's say to the contact page or the FAQ page or anything like that. And since PageRank usually works in a way where the number of links on a page matters, and the number of times you link to a certain page also matters, is that something webmasters should take into account? Like maybe don't have four links to the privacy policy page on every page of your site, because that kind of takes away from the rest of the links you might have in your navigation. Or is that too little of a factor to bother with?

Yeah, I don't think that would play a role. So I mean, I'm tempted to see if I can play with something like that, maybe on our blog or something, where we put links to, I don't know, maybe a privacy policy 10 times instead of one time. But my feeling is that wouldn't change anything at all.

I'm only asking since certain tools, like Screaming Frog for example, calculate a link score in a very basic way, similar to how PageRank works. So you might see something like the FAQ page or the privacy policy page having a very, very high link score over some of the pages that you might want to have better internally linked.
So I'm just wondering, I know it's a very basic calculation that they do versus what Google does, but I was just wondering if it's something that you could try to tweak if you're in there, or not really bother because Google kind of doesn't really care about that.

Yeah, I think for the most part, it's wasted time to do that. I think it's important that these tools show this kind of score internally, because sometimes you do lose track of pages, and you think this one thing is really important, but actually you only link to it once on your whole website. And then getting that highlighted is really useful. But for the most part, I wouldn't really worry about those details. Also, a follow-up question from there is sometimes, should I nofollow the links to my privacy policy page because I don't want it to rank? And we have a lot of practice with privacy policy pages and similar pages, where we understand they're linked from everywhere within the website, but they're not the most important piece of content on the website. So we kind of understand those kinds of relationships. But I think having some kind of score, having a way to look at how a crawler would look at a website in a naive way, I think is really useful. So I wouldn't discount those tools just because they don't map one-to-one to what Google does.

Got you. OK, thanks. I'm done.

Sure. Cool. Anyone else before we get started?

Hi, John. I have one.

OK.

So yeah, that's me. So I asked you last hangout a question about one of our websites kind of having duplicating home pages, where one is from Japan, one is from Brazil, but they're kind of folded together and Google thinks they are duplicates. So I just want to follow up on that part. I mean, is there something that we can do from our end that can make it, you could say, visible to Google that these are different pages, or how should we go about that part?
I don't remember the details there, but was that something like you have the same English content on the page for Japan as well as on a page for another country?

No, they are different. Both pages are different.

OK. I don't know. I probably need to take a look at the examples again. If you want to drop the links into the chat, I can pick that up afterwards.

Sure, thank you so much.

Maybe add a comment as well, then I know exactly what to watch out for.

All right, cool. Thanks.

And links in the chat don't pass any PageRank. All right. OK, someone else had another question as well.

Hey, John, how are you?

Hi.

OK, I have two very, very small questions for you. One is regarding this Search Console Insights that Google recently launched on Google Search Console. It seems that the data that Google shows is based on Google Search Console plus Google Analytics. So I just wanted to confirm, if your website doesn't have a Google Analytics account, will those insights still be there, or do you need to have a Google Analytics account for that?

My understanding is you need to have it tied in with Google Analytics. I haven't been following all of the details with Search Console Insights, but my understanding is that the idea is to provide a mix of the data from Analytics and Search Console in a way that it's a little bit easier to understand for people who don't spend their whole life in Google Analytics or in Search Console. So that's something where we really need to have both of those data sources so that we can show that simplified data there.

OK. The second question is regarding nofollow. It is believed that Google doesn't pass PageRank through a nofollow link. But recently, Google decided it may crawl a link even if it's tagged as nofollow. So if Google decides to crawl that URL, will it pass PageRank to that URL after crawling? Or if Google thinks that this URL is good enough to be crawled and indexed, will it pass the PageRank or not?
I don't think you can simplify it that much. There are multiple aspects with regards to nofollow there. On the one hand, like you mentioned, we decided to start trying to treat it as a hint rather than a clear directive. And that can result in us following links that have a nofollow to discover new URLs. So especially if there's something that we haven't seen before and we see there is a nofollow link there, then we might go off and try to crawl that page. And if we think it's worthwhile, maybe we will index that page. So that's kind of the one aspect. But the aspect with regards to passing PageRank and passing signals, that's something that's totally independent of that. That's something where we also need to take into account a lot more than just, is this a new page or not? So just because we crawled something that has a nofollow link doesn't mean that we're going to start passing all kinds of linking signals to that page. It can, but it's not necessarily the case.

Trying to follow up on that, the last I heard from Gary was that it's currently just a policy change around the nofollow, but nothing has really changed practically yet. Has that changed?

That's possible. I don't know. Gary would be following up there more. But I mean, this is essentially the direction that we could be heading, where we have those nofollow links, and we could theoretically use them to pass signals. I believe we are using them for discovery already now. But I don't know, Gary would know more.

All right, thank you. Maybe I should drop him a tweet and find out.

OK, let's jump into the questions. We have the top question from Barry. Can you comment more on the August 10th indexing system failure, and maybe what happened this past Saturday? See your favorite SEO blog for more details.

How do you know what my favorite SEO blog is, Barry? It's like, are you reading my email? I have a little mail where I'm still on the computer. OK. OK, well, it's just that. Bye.
I don't have any more information. So my feeling is, with the August 10th issue, that's something where it got resolved reasonably quickly. And I don't know if the team would have more information to share on that publicly. And with regards to the one on the 15th, I don't know the details of what happened there. But that also seemed like something where you kind of didn't really see what exactly was changing, just that everyone was complaining and then suddenly it was gone. So maybe that was just something really short.

I think technically it was only short because I didn't get back online until Saturday night. But I think it happened Saturday morning. OK. And then there were some screenshots from Glenn Gabe and other people that show some pretty significant changes. And then things went back sometime at night, at least Eastern time. Are you aware of anything that went wrong? Not specifically telling me what went wrong, but did something go wrong?

I don't know. There are lots of systems at Google. And it's not that I can't comment; I really don't have any additional information on that. Sometimes when something quirky happens for a really short period of time, they just go off and fix it without letting everyone know. I think the earlier issue seemed to have been a bigger one, and that's kind of also why it took a little bit longer to get everything redone. And we ended up tweeting about it briefly.

OK. Thank you very much.

Sometimes we just don't have a lot of internal details to share. OK. Now, a question about pagination, or five questions. What's the importance of pagination? Are there issues with pagination? What about infinite scroll? Should paginated pages be indexed? What is the best practice for pagination? So, lots of questions. I don't know. Let me see if I can run through them briefly.
So essentially, when we talk about pagination, we mean you have one thing that is really large and it doesn't fit on one page. So you split it across multiple pages. And that could be maybe one long article where you say, well, it's worthwhile splitting this. It could be a list of individual products, maybe in a category, where you say, well, I have 5,000 products. I can't put them all on one page. I will put them on five pages or 1,000 pages or whatever. So that's kind of where pagination comes in. And from our point of view, it's important that we can recognize this kind of pagination and essentially index those individual pages so that we can pick up all the content or all the links to individual items that you have on the paginated pages. Usually, with pagination, one of the questions that comes up is, well, this creates a lot of new URLs because you have to go through all of these pages to get all of the content. And yes, it does generate a lot of URLs. And we have to index a lot of different pages. But if we want that content in our index, if we want to understand those internal links to your other pieces of content, we kind of have to do that. Infinite Scroll is a way of doing pagination without the user having to click Next. The important part there is that you do Infinite Scroll in a way that works for search with regards to crawling and rendering in particular. And we have some guidelines on how to do that. So in particular, one recommendation is to have separate URLs for each page so that you can still go to the individual pages to have links to those individual pages. And for the user, if they scroll down to the bottom, then it's fine to load the next page kind of thing. Should all paginated pages be indexed? Yes, kind of like I mentioned before, if there's something on there that you want to have known by Google, which could be a link to a different product, which could be a part of a longer piece of content, then it has to be indexed. 
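The separate-URLs recommendation for infinite scroll could be sketched like this: each batch of results also exists at its own paged URL, with plain HTML links between neighboring pages that a crawler can follow without executing any scroll handler. The URL scheme and function names here are hypothetical:

```python
# Sketch of a paged URL scheme for an infinite-scroll listing.
# Page 1 lives at the base URL; later pages get a ?page=N parameter.
def page_url(base, page):
    return base if page == 1 else f"{base}?page={page}"

def pagination_links(base, page, last_page):
    """Plain <a href> links a crawler can follow; no JavaScript needed."""
    links = []
    if page > 1:
        links.append(f'<a href="{page_url(base, page - 1)}">Previous</a>')
    if page < last_page:
        links.append(f'<a href="{page_url(base, page + 1)}">Next</a>')
    return links

# Page 2 of 5 links back to page 1 (the base URL) and forward to page 3.
print(pagination_links("/category/shoes", 2, 5))
```

For users, the page can still lazy-load the next batch on scroll; the point is that the same batches are also reachable through these ordinary links.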
And the best practice is kind of following up on the other steps there. We have to know about these paginated pages, so you have to link to them, usually with a next and a previous link. And with normal HTML links, we can pick that up fairly easily. You don't have to do anything special for this kind of pagination. So just link from one page to the next, and link to the previous page as well, and then we can crawl through all of those paginated pages. So I kind of ran through those quickly. And I know there is a lot more depending on what kind of setup you might have. So this is something where we will probably have a bit more information, in particular for e-commerce sites, to make it a little bit easier to understand what exactly should be indexed with pagination. In general, we find that most sites implement pagination in a way that just works. So we've kind of stepped back from saying we need to define exactly what people need to do for pagination. But rather, if you understand these pages need to be indexable, then you understand they need to be linked. You can test them with a local crawler. And usually, that just works out.

Google is indexing pages with parameters. Is this considered duplicate content? Should I use the URL parameter tool in Search Console as a method to fix it?

So in general, pages with parameters aren't necessarily bad. It's something where in the past, maybe going back, I don't know, 15 or 20 years, like a really long time, search engines were kind of reluctant to index URLs with question marks in them, because it's easy to create a lot of URLs that way. But that has long since changed, and URLs with parameters in them are perfectly fine. Sometimes they can lead to duplicate content. And in most cases, we figure that out ourselves. We recognize this URL with this parameter is the same content as a different URL with a different parameter. And from our point of view, that essentially just works out.
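A toy sketch of that kind of parameter deduplication, assuming hypothetical tracking parameters like `sessionid` and `ref` that don't change the content: drop the irrelevant parameters and sort the rest, so two parameterized URLs that serve the same content collapse to one canonical form. This is an illustration, not how Google's canonicalization actually works:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption for this sketch: these parameters never change the content.
IGNORED_PARAMS = {"sessionid", "ref"}

def canonical_form(url):
    """Normalize a URL by dropping irrelevant parameters and sorting the rest."""
    parts = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(parts.query)
              if k not in IGNORED_PARAMS]
    params.sort()  # parameter order doesn't change the page served
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), ""))

a = canonical_form("https://example.com/shoes?color=red&sessionid=123")
b = canonical_form("https://example.com/shoes?ref=nav&color=red")
print(a == b)  # both URLs collapse to the same canonical form
```

Running a crawl of your own site through a normalizer like this is one way to estimate how much parameter-driven duplication your internal linking creates.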
If you find that within your website you have a significant amount of duplicate content, and a significant number of pages, then you can assume that crawling is really hard within your website. So say, for example, you have, I don't know, 100 million pages, and you recognize that with the parameters that you're linking to within your website, you're creating 10 times as many URLs as you have pages, then those numbers are really, really big. And that's something where it definitely makes sense to go off into the parameter handling tool to resolve that, and also to think about your internal navigation, where these parameters are being picked up, and to improve that. So that's kind of, I guess, the extreme situation. Most sites are somewhere in between. Some have just a few pages with parameters. And at that scale, I really wouldn't worry about those.

Any comments on Google blocking new publishers since December 2019?

So I saw a bunch of your tweets as well. I don't have any insight on that. It's definitely not the case that we're blocking any new websites from appearing in search. There have been lots of new websites since then, and they do appear normally in search. But it is something where you mentioned a bunch of examples where you thought things weren't working as well as they should, specifically for Google News. And I forwarded that on to the Google News team to double check, to see if there's anything that we could be doing better there.

One of my websites, I think, was related to the News Publisher Center. If you submit it there, it's not getting picked up. I'm not even sure.

It sounds like it's all around the new Google News Publisher Center. So I don't know if that gives you something more to go on. I don't know. I don't really have any insight into the news side. But since crawling and indexing is kind of combined when it comes to news, if we can pick it up for search, then we should be able to pick it up for news as well.
I think most sites don't do anything special to be picked up in news. Right, yeah. No, I'm just, yeah. I don't know if there's anything specifically buggy with the News Publisher Center, like the Google Search Console for News Publishers. I'm not sure. It's always possible that there's something buggy. But it would surprise me if it were completely broken since December 2019. It seems like a pretty long time. Yes. One of my websites got backlinks from a domain that embedded my Twitter page feed. And the backlinks are coming from that tweet link. I don't think embedded feed content links are followed by Googlebot. Please enlighten me. I don't know how these tweets are embedded. But in general, when we see things that are embedded with JavaScript or with an iframe on our website, it is possible for us to say that this part of the content could be seen as a part of that page. And it is possible that we will pick up links in tweets like that. That said, my understanding is that Twitter pretty much uses nofollow links everywhere. So essentially, those would be nofollow links pointing to your website there where, for the most part, we wouldn't be passing any particular signals there. So from that point of view, I don't see anything particularly positive or negative with regards to SEO if you get a link from one of your tweets that was embedded on another person's website. Usually, it's a good sign that people like the tweets that you're writing, and maybe are OK with the content that you're providing on your website, assuming they're writing about it in a positive way. But essentially, that's independent of any kind of SEO effect from those individual links. The thing to also keep in mind is that, in particular, in Search Console, the links that we show there are just all links or a sample of all of the links that we know from the web for your website. So it's not that we would only be showing you links there that pass any signals or that have any special weight. 
It's very possible to show things there that are maybe even disavowed, maybe that have a nofollow, anything like that, too. How does Google know about the category of a new URL being added later? So the URL wasn't mentioned in the sitemap. I don't know how you mean the category of a new URL. But assuming it's just generally, how does Google know about this new URL that I added to my website? And I didn't put it in my sitemap file. We use lots of things to pick up new URLs, but we don't make them up. So essentially, somewhere, there must have been a link to this page within your website that could be within your own content that could be within a sitemap file, within an RSS feed, maybe someone else, maybe a tweet, anywhere, essentially. And if we spot a URL and we think, oh, this might be something that's useful, we might go off and look at that page and see if we can index it. And if we can index it and we think it's something useful, then maybe we will index it. So that's essentially the story of where the URLs come from. It's not that we have any kind of magic backdoor to your server and can look at what you're doing on your server or anything like that. We really kind of have to follow those links. And sometimes the links that we find are a bit surprising to people in that they don't realize that maybe a link is included in their RSS feed, even if they don't manually include it in a sitemap file, or maybe they don't realize that someone else has tweeted a link, or maybe you shared a link by email and the email address you shared it to is actually a public mailing list. And suddenly that link is in a public mailing list where we could also pick that up. Our website and our blog have an English and a Spanish version. In Search Console, we have a property for each language where we upload the sitemap separately. 
While the pages related to our services are translations and therefore linked through hreflang, the English blog and the Spanish blog are clearly not linked, as they have different contents. And we try to attract traffic by offering different contents and a different experience for each language. In the blogs, we implemented AMP versions for the articles. Here are some questions. Is it necessary to define a canonical URL on the articles, since we implemented the AMP version there?

Yes. So for connected AMP pages, I assume this is the case where you have one traditional page and the AMP page and you link them together. With that kind of a setup, you need to use the rel canonical. That's kind of the definition of these pages. Even if you have standalone AMP pages, where you just have AMP pages and no traditional content, you need to have a rel canonical on that page pointing to itself. So that's independent of any translations, independent of any hreflang. For the AMP setup, you need to have the rel canonical.

Is this configuration correct?

It's essentially fine to have it set up like that. So if you have some pages that are localized and you have the hreflang links between them, and other pages that are just in different languages but not localized versions of each other, then you don't have the hreflang there. That's perfectly fine. hreflang is a per-page annotation. You don't have to do it on all pages. You can do it just on the home page, or just on an about-us page. You can pick and choose however you want.

We have some concern that URLs that don't have an AMP page can't be fast enough. Can this generate problems for the ranking of the blog articles?

So we do use speed as a ranking factor, but it's a fairly small factor. And you can make really fast pages that are not AMP. And you can also make slow pages that are using AMP.
So just because one part of your site uses AMP and another part doesn't, it doesn't necessarily mean that the part without AMP is slower or in any way treated less favorably compared to the rest of your content. So I wouldn't primarily worry about this difference between AMP and not AMP, but rather think about speed overall. And you can test speed on your site as well, using the various speed testing tools that are out there. So instead of just purely AMP or not AMP, I would look at the speed overall.

Are the AMP versions used to measure and rate the URL speed performance of our articles?

So in particular with the new Core Web Vitals that are coming to search, we don't have a date yet for when that will be used as a ranking factor. We'll let you know at least six months ahead of time. But with that, we look at the page that users actually see. So if users see the AMP version of a page, and that's essentially the main version they see, maybe on mobile, then we will use that when it comes to our speed calculations. On the other hand, if people see the traditional HTML version, and they'd have to click a link to go to the AMP version or something crazy like that, then we would use the page that we showed the users, which in that case would be the traditional HTML version.

We used to have pagination on our main blog, but we realized that those pages were indexed in Search Console by mistake, the same pages with different URLs for pagination. We removed the pagination, and it looks like the issue is solved. Do we need to de-index those mistaken URLs?

So, wow, more pagination questions. No, you generally don't need to do that. Usually what happens when you have pagination activated like that on kind of a blog setup is that it's just paginating through different previews of articles on your site, and it's just a parameter that's added to the page. And if you disable that, then essentially going to those URLs just shows the home page again.
So if we re-crawl and reprocess those URLs over time, we will see the home page, and everything will be fine. If, on the other hand, it results in those URLs returning 404, then we will re-crawl and reprocess them and drop those URLs out of the index as well, which is also fine. So either way, it's not that you necessarily need to do anything to remove those URLs from the index.

Can monetizing your website with an ad partner potentially hurt your rankings in Google? Because after signing up with these partners, what I realize is they like to show more ads on your website. So I'm just worried that it will cause a drop in my ranking.

Well, I guess in general, if you add monetization to your web pages, then that monetization has to be visible somehow, which can result in ads being shown on your pages. So that's, on the one hand, kind of to be expected. On the other hand, with regards to Google rankings, we do have some things that we watch out for. So in particular, the above-the-fold content is something where we want to see some actual content, not just an ad. Depending on the way that you have monetization set up, you might need to watch out for that. Then there's the Better Ads Standard, which is something that, particularly on the Chrome side, they look at for some pages, where if they realize that a site is significantly not compliant with the Better Ads Standard, then Chrome might decide not to show ads on that site at all. So that's something to look at. That's not necessarily related to SEO, but it kind of falls into the same category of types of issues. And I think that's pretty much it. I think, in general, when it comes to monetization on your site, it's important that you do that in a way which is long-term sustainable, so that you don't drive users away.
Because if you drive users away from your website, then they're not going to be out there recommending your website to other people, which is something that we might indirectly pick up on in search, with regards to kind of SEO things. So if you do decide to work with monetization on your website, which, from our point of view, is perfectly fine, it's like, you have to pay for your website somehow. You have to pay for your time and your work somehow. So it's not that monetization is bad. But if you do decide to implement some kind of monetization, make sure that it's implemented in a way that you can stand behind, where you can say, well, this is really the way I want my website to be presented in search, and the way I want my website to be presented to new users when they come and visit from the search results.

Second question there: cross-posting my article on another forum with a canonical URL, will that downgrade my ranking? My blog is all about programming, and I do like sharing my posts with a canonical URL on other developer forums. So is that going to hurt my ranking? Because the other forums are very authoritative and rank very well on Google.

So in general, there are, I think, two situations that can occur here. And it's something where our systems can't guarantee one outcome or the other. It's possible that we will index both of these pages individually. If we look at these pages overall and it looks to our systems like these are significantly different pages, then we may index them individually. And we may end up showing them in the search results separately, which could mean that the website that is syndicating your content is ranking above you. So that's something that theoretically can happen. The other alternative is that we recognize that these pages are significantly the same. And in that case, we will try to pick one canonical URL.
And the rel canonical does help us to understand which of these pages you prefer to have chosen as the canonical. And in that case, we will concentrate all of the signals on that one canonical URL, and we will use that one for indexing and ranking. The thing to keep in mind is that the rel canonical is just one of the signals that we use for canonicalization. So there are also things like internal and external links, sitemap files, the linking within a website, all of those things which play a role in determining which URL we should choose as the canonical URL. So it's not always guaranteed that we will pick your URL as the canonical. But those are the two situations that can occur. And like I said, it's not guaranteed that it'll happen one way or the other. So if you do choose to syndicate your content like this, I think that's just something to keep in mind, in that it's possible that your information will be shared more broadly, and your information will be findable on other people's websites, and that maybe we will show the other website as the one in the search results, ranking a little bit higher. So that's something where you kind of have to think about: is it important that my page is visible in search? And if that's the case, then maybe make sure that you're just publishing content on your own website. Or is it important to you that my information is available in search? And in that case, maybe it's fine to publish it on multiple different websites. So those are kind of strategic decisions that you can make there. And I don't think there's one answer that works for all websites.

If you have security reports in Google Search Console, after some submissions, Google will start to review them after a couple of weeks. Will this affect other crawl bots, like desktop, mobile, et cetera?

So I'm not quite sure what you mean there. So it's really kind of hard to say.
Usually, if there are security issues, for example, reported in Search Console, that would mean, for example, that maybe your website is hacked. Maybe there is malware that was found on some of your pages. Maybe there was some phishing that someone hosted within your website. Then those issues wouldn't affect how we crawl the rest of your website. It can affect how we show it in search, in that if we understand your website is hacked, then maybe we need to be more careful with what we show your pages for. So that's something that can play a role there. But I don't think it would affect how we would crawl your website overall. But regardless of any effect on crawling or indexing, if your website does have security issues, I would recommend resolving those as quickly as possible and trying to figure out where they came from. So if your website got hacked, then don't just fix that hack, but rather think about how were hackers able to actually get into my website and add that hacked content. And what can I do to prevent that in the future? Do backlinks from guest posts have any ranking value? Or are we wasting our time for the sake of ranking and not for traffic? I don't know. It feels like this topic comes up every couple of weeks. We've kind of had our stance on this for a number of years now. And essentially, the idea is if you're doing guest posts just for those links, then for the most part, I would assume that those links have no value at all. So that's essentially our stance there. And there are lots of kind of fine details there. But we've talked about them so often. There's lots of information out there to check into as well. I work on a UGC website with more than 30 million pages of content. The main content of the pages does not change much, but auxiliary content, like reviews and comments, is added frequently. We don't track very well when each piece of content is updated.
Our sitemaps are updated daily with the currently valid URLs, but we set the lastmod for them to the current day. We also set priority and change frequency to static values. We suspect we're being limited by crawl budget. Could our sitemap structure be negatively affecting how Google is crawling our website? So yeah, I think there are multiple things that come together here. In general, if you always have the same date in your sitemap file for all of your URLs, we're going to be ignoring that date. So if you're doing something like you mentioned here, where you're taking 30 million pages and saying all of them changed today, then we're probably going to be looking at your sitemap and saying, well, we're just going to look for new URLs within the sitemap file. We're not going to look at the date, because the date doesn't give us any more information. It's not that we can go off and recrawl your whole website every day. So that's kind of the main thing here. The priority and change frequency settings are also settings that we generally ignore, because we found that they don't provide a lot of extra information. If we have a date that we can use, then we don't need to know how frequently a page might be changing. With regards to setting a date as a change date in a sitemap file, we recommend using the date of a change that is really significant for your pages. So if just a number changes on your web page, that's not usually a sign that this page needs to be recrawled and re-indexed. But if something significant changes on those pages, then that's something where I'd say it's worthwhile picking that up and using that as a new change date. So that's kind of the recommendation with regards to change dates in general. If you don't do that, if you have it set up like you have here, we will essentially ignore the last modification date. And we will just use the sitemap file to try to recognize new URLs on the website.
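The recommendation here, using a lastmod that reflects the last significant change rather than stamping every URL with today's date, might be sketched like this. The URL, dates, and helper function are invented for illustration, not from any real sitemap tooling:

```python
from datetime import date
from xml.sax.saxutils import escape

def sitemap_entry(url, last_significant_change):
    """Emit one <url> element whose lastmod reflects a real content change."""
    return (
        "  <url>\n"
        f"    <loc>{escape(url)}</loc>\n"
        f"    <lastmod>{last_significant_change.isoformat()}</lastmod>\n"
        "  </url>"
    )

# A page whose main content last changed weeks ago keeps that older date,
# even if minor counters on the page tick over daily.
entry = sitemap_entry("https://example.com/widgets", date(2020, 6, 15))
print(entry)
```

Stamping every entry with `date.today()` instead would make the lastmod uninformative, which is exactly the situation described above where the date ends up being ignored.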
But we don't kind of penalize a site for this kind of sitemap file. It's not that we would crawl less frequently or we would crawl worse if you have a bad sitemap file like this. We would just crawl naturally, like we would any other website. And usually what that means is we will just go off and crawl pages on your website and try to refresh them in cycles that we think make sense. And we will just use the sitemap file to recognize new things on your website. This doesn't affect crawl budget at all. So crawl budget is specifically more of a kind of foundational technical thing, which is based, on the one hand, on the demand that we have on our side with regards to crawling and indexing: how many pages do we think we need to recrawl from this website every day? That's kind of the one thing. We would like to do this much. And on the other hand, we kind of have the limits, which can be the limit that you set in Search Console. It's also based on the server capacity: how quickly do we think your server can respond, how many requests can we send it every day, those kinds of things. And those are all independent of the sitemap file. So just because you have a bad sitemap file doesn't necessarily mean we will crawl less frequently. We'll just crawl a little bit more organically, rather than directed based on your sitemap file. Is canonicalization a best practice, or is not having duplicate content at all a best practice? I think a little bit of both. So not having duplicate content makes it easier for us to crawl and index kind of the primary content that you want on your site. But it's kind of impractical for all websites to not have any duplicate content at all. It's essentially a normal part of the web. So because it's such a common thing on the web, using the rel canonical, using proper canonicalization, just makes it a lot easier for us to focus on the good parts of your website.
So reducing the duplicate content on your site is good. Using rel canonical is also good. Wow, lots of questions left. Let me just double-check to see. There's one other crawl rate question. Can you suggest an optimal crawl limit for Googlebot? If I set a limit of 30 requests per second, is that enough? That's totally up to you on your website. So we don't have any specific limits that we recommend. Usually, with regards to the Search Console setting, we suggest that you just leave it at the default and let Google decide. Because the Search Console setting is more of an upper-bound limit. It's not that you're saying Google will crawl this much, but rather that Google can crawl at most this much. So for most websites, you don't need to set an upper limit. We will figure that out ourselves. If you do notice that we're killing your server and causing you a high bandwidth bill or something like that, then setting that upper limit is definitely a good idea. Wow, still a bunch of questions left, but time is running kind of low. Maybe I'll switch to any questions from you all. Is there anything left from your side? I just wanted to quickly confirm: this person is saying that not a single new website has been indexed or ranked since December 19, 2019. And I asked on Twitter, have any SEOs had any new websites indexed or ranked since December 2019? And I'm getting a bunch of yeses. So there you go. OK, OK. I mean, maybe it's not a general issue, but it's still worthwhile looking at these things. Sometimes there are weird quirks. John, can I ask another one on internal linking really quick? So assuming, let's say you've changed something like the CMS of your website, and you have to use a 301 redirect, and maybe because of some issues like the design or anything like that, you kind of have to link using those 301 redirects. You cannot link to the final version.
Your navigation and most of your internal linking still goes to the old URL, which then 301-redirects to the new one. For users, you kind of add that extra step of a 301 redirect. Does that affect Google in any way? I know that Google kind of can pass over it. So, I mean, it's not great, but we work around that. So essentially what happens there is we try to understand what the canonical is. So we'll follow that redirect, and usually we'll pick the destination of the redirect as the canonical URL, and then we will treat that link as being between the source page and the canonical destination. So just because there's a redirect in between doesn't mean that there's any value loss. It's still a link between the source and the canonical destination. And just to be sure, that 301 step, those 200 or 300 milliseconds that the user gets in addition, that's not happening on Google's side, because you kind of pick the URL separately. You don't follow it like a user. Yeah, yeah. So the one exception, I think, is if you have more than five redirect steps in between. Then that's something where we'll have to recrawl that in the second round. But otherwise, we will follow those steps. We will index the destination page. Probably we'll pick the destination page as the canonical. And if we pick the destination page as the canonical, then users, when they come from search, go directly to that canonical. They don't even follow those redirects. So even from a speed point of view, users would be going directly to that canonical URL. And from that point of view, it's more that you're adding a little bit of inefficiency, and you're keeping these things around, where if you crawl the website on your own, then suddenly you have all of this cruft that has kind of collected over the years. But it's not that it'll cause any problems with regards to search. OK.
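The five-hop behavior described here can be illustrated with a small sketch: follow a redirect chain hop by hop, but give up past five hops, the way a crawler might defer a longer chain to a later crawl cycle. This is a simplified in-memory model with made-up URLs, not how Googlebot is actually implemented:

```python
MAX_HOPS = 5  # hops followed in one pass, per the discussion above

def resolve(url, redirects, max_hops=MAX_HOPS):
    """Follow a redirect chain; return None if it exceeds max_hops."""
    hops = 0
    while url in redirects:
        if hops == max_hops:
            return None  # chain too long: pick it up again in a later round
        url = redirects[url]
        hops += 1
    return url  # the destination, a candidate for the canonical URL

# Old CMS URLs 301-redirecting to their new locations.
chain = {
    "https://example.com/old": "https://example.com/newer",
    "https://example.com/newer": "https://example.com/final",
}
print(resolve("https://example.com/old", chain))  # https://example.com/final
```

A two-hop chain like this resolves fine, and the destination is what would likely be picked as the canonical; a chain of six or more hops returns None and would need another pass.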
And one last thing: we have a weird case that's also related to pagination, where the link to the next page is one URL and it goes to the next page. But within a few seconds, the link changes. I think it's using the history API, the pushState or replaceState. I don't remember exactly which it was. How does Google interpret that? Does Google see this URL change even though there's no refresh of any kind going on? Or what exactly is happening on Google's side? And I'm not 100% sure. Yeah. So we do watch out for this kind of thing when the history API is used to change the URL, because we do understand that on some JavaScript-based sites, you click on a link, and then it uses the history API to change the URL. And in that case, we should treat that as a link. And similarly, on some sites, you load one URL and it uses the history API to change the URL, and that should be treated like a redirect. So that's something where we try to figure out what it is that you're trying to do there, assuming that we can process the JavaScript that is doing this. So it could be that we would treat that as a redirect to the URL that you're changing it to, which I don't know if that makes sense in your particular case, or if that simplifies the URL. So that's it. It removes some of the extra parameters that are really just for the CMS to understand that you want the next page, so mostly just the page parameter is left. Yeah. Probably you would see that if you use Inspect URL and you saw that we chose the simpler URL as the canonical. So that's something where you can check both of those URLs manually with Inspect URL. If we've seen the simpler URL, perhaps we have chosen that as the canonical for that kind of setup. So if you check the more complicated URL, and you see we picked the simpler one as the canonical, then that's definitely the case. Cool. Thanks. I'll try that. Cool. Hi, John, if I may. Sorry. I don't know if you remember me; we spoke a couple of weeks ago.
And I'm just chasing up, really. Sorry to do this. It's about the outbound-link unnatural linking reconsideration saga, which we're going through. And I just wanted to see if you managed to take a look for us. Which site was that? Oh, I think you posted it as well. The career site, right? Yeah, that's correct. Yeah. I passed that on to the search spam folks, but I didn't hear anything back. So I don't know. Is it still pending the reconsideration request? That was a long time ago, right? The reconsideration request. Initially from the 7th of May, the initial notification. So from the last status we received, which was the processed notification, it's been 14 weeks. OK. That seems too long. OK, I'll ping them again to see if they can double-check what is happening there. All right, brilliant. What's your recommendation for how long we should wait, anyway? Because it's been 14 weeks since we received the processed notification. And we've followed up a few times since. I think it's been around four weeks now since our last follow-up, but we spoke about this on the last one, so I won't bring it up again. But what's your recommendation for the wait period at the moment? I don't know. I don't know what the current queue there is. I mean, your site, I think it was in English, right? So it's not. Yeah, it's in English. Yeah. Sometimes it's a bit different across different languages, where someone from one of the local teams has to double-check. But it feels like, especially with English, we should be fairly up to date. I'm kind of curious to see if those extra submissions that you did after the main one kind of block things on our side, but even that shouldn't be making it take that long. All right, thanks for that. No, it's just because you said last time that you feared it could be stuck. That's why we're still wondering what to do. So yeah, if you could do something about it, just to get some response, that would be brilliant. All right, will do. Cool.
OK, we're kind of at time. I'm sure there are still more questions, but feel free to maybe drop them into the next session on Friday. And hopefully I'll see some of you all there, or at least in the next Tuesday one, depending on the time zone, whatever works better for you. All right, thank you all for joining. And I wish you all a great week in the meantime. Thank you. Bye, everyone. Thank you, bye.