to me. All right. Welcome, everyone, to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a webmaster trends analyst here at Google in Switzerland. And part of what we do are these Office Hours Hangouts, together with webmasters and publishers from around the world, with regards to different kinds of search-related questions around their websites. As always, a bunch of things were submitted already. But if any of you want to get started with a first question, feel free to jump on in.

Hi, John. Can I jump in? Sure. OK. I actually submitted my question on Google Plus. But I just thought that maybe talking through it would be better. So it was called to our attention that our website might not be entirely in compliance with GDPR, because we are not opting users into certain features, like making their profiles public, but rather just do that by default. And our website has quite a strong social component to it. So there are quite a lot of users, quite a lot of user profiles, and so on. So in order to get more compliant with the GDPR, we were thinking that we will exclude certain pages from search, we will add them to the robots.txt, we will do all different things. But one of the things that we were thinking is whether there is any way to tell Google to just delete particular pages, to forget particular pages, to exclude these pages from the index. And I noticed that there is this Google Indexing API, a very recent addition, but apparently it just works only with job pages. So I was wondering whether Google Search has any way of allowing webmasters to just delete pages, and especially in a bulk way, in a batch way, not just one page after another, but submitting a bunch of pages.

We don't have any kind of removals API that's available publicly for bulk submissions like that. So the normal thing that I would recommend doing is, if it's a clear part of your website, like a subdirectory or something, or a specific file name that everything starts with, then that's something you could use the URL removal tool for. So in Search Console, you can tell us everything that starts with user-profile.php, drop that from the search results. I think that's a really quick first step if you need to act quickly. Obviously, it's not granular. It's everything that starts with this URL. Past that, if you don't want to do them individually, it's essentially a matter of putting a noindex on these pages, a robots noindex. And if you want to get those updated a little bit faster, you can tell us with a sitemap file and say, this page was changed recently, with this change date. And we will go and look at that page. And if there is a noindex on that page, then we will drop it out of our index. So that's kind of the faster way of doing that. But past that, something like, I don't know, a specific list of files that you can submit to us that we will take out immediately, that's not something that we publicly have available.

Right. So just a follow-up question to that. When you say noindex, do you mean a meta tag? And is it a quicker way, or a better way, to tell Google not to index this page than to include this page in a robots.txt file?

Yes. Yes. So the difference between robots.txt and the noindex meta tag, which is a robots meta tag, so it sounds kind of similar, but the difference is: with robots.txt, you tell us not to crawl this page, and we don't know what the content is.
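To make that distinction concrete, here is a minimal sketch of the two mechanisms being compared; the user-profile.php path is just an illustrative example, not a reference to any specific site.

```
# robots.txt: blocks crawling, but the URL itself can still end up indexed
User-agent: *
Disallow: /user-profile.php
```

```html
<!-- On the page itself: crawling is allowed, but the page should not be indexed -->
<meta name="robots" content="noindex">
```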
And that could mean that we will index the URL without seeing the content. So sometimes you see that in the search results, where we have a URL indexed and it says there's no description available. And that's because it's blocked by robots.txt. And the noindex tag, on the other hand, tells us we can crawl this page, but you don't want it indexed at all. So that's essentially the difference there. And if it's something that you want to remove from the search results, you need to make sure that we can crawl it. So don't block it in robots.txt, and instead use the noindex. Using both at the same time doesn't make sense. Sure.

And since you talked about sitemaps, if there are quite a lot of pages, what we noticed is that if we list lots of URLs in our sitemaps, and by lots, I mean like thousands and tens of thousands and so on, Google is not so eager to just crawl those pages. In Search Console, in the sitemaps section, it would say, OK, I am aware that you submitted those URLs, but I haven't crawled them. So what we're wondering is whether, again, there is something that we can influence. Is there something we can do to convince Google to actually crawl these pages? Or is it just the whim of the crawler?

It's not something that you can force, to have Google crawl those pages, but what helps is the last modification date in the sitemap file. So not just listing the URLs, but using the last modification date to really tell us this page changed on this date. And it's important that you have the last modification date set in a way that's realistic for your website. So sometimes we'll see sitemap files that say the last modification date is always now. And that's kind of like they're saying, well, always crawl me. But our systems, when they look at that and they see the last modification date is set to now for all of these URLs, it basically tells us we shouldn't trust this data. So using a sitemap file with a realistic last modification date tells us, well, here's really when this page changed. And we can kind of trust that, and we can use that to prioritize our crawling. So it doesn't mean we'll automatically go crawl those URLs right away, but we can prioritize them when we do have time to crawl more URLs from your website.

What's one that's realistic? It's a really interesting point, because we kind of switched from a very bad way of updating sitemaps, when we just created a sitemap and basically forgot about it, and it stayed there for like a year without being updated. Then we switched to a method where we are basically updating our sitemaps every day. And you are probably saying that if the date is always now, that is also bad. Is there a middle ground? Because, since our content is in a large way user-generated and in a large way automatic, it's rather hard to be sure when the content was actually updated, but we want our pages to be fresh and to be relevant. Is there any good heuristic of how frequently the sitemap should update?

I would definitely update the sitemap file whenever you have a chance. So that's something like, if you can do that daily, I think that's great. Oh, daily is OK. Yeah, yeah, updating the sitemap file is fine. But the data in the sitemap file should be kind of reasonable. So if you have a forum, for example, with forum threads, then maybe just use the date of the last post in that thread. And then that works for us. Or if you have user profiles and you know this user profile was changed recently, then use that date.
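As an illustration, a sitemap entry with a realistic last modification date might look something like this; the URL and date here are just placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/forum/thread-12345</loc>
    <!-- the date of the last post in the thread, not simply "now" -->
    <lastmod>2018-08-10</lastmod>
  </url>
</urlset>
```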
If they switch the privacy setting, then use that date. If they didn't make any changes for a while, then use the old date. Or you can even not use a date if you want. OK, thanks. Thanks a lot. Sure.

Oh, I can't hear you. Somehow I see your mute button going on and off, but I can't hear anything. Otherwise, feel free to type something into the chat and I can get to that in between. All right, let me run through some of the other questions. When your microphone works, feel free to jump on in again.

Let's see. Dynamic rendering, when a part of the page uses JavaScript. When should I use dynamic rendering, essentially? So it seems like a simple question, but it's kind of tricky. I think, first of all, you need to determine a few things which are not always trivial, where it probably makes sense to get help from your developer or to dig into the details of how JavaScript works. First of all, you need to kind of double-check what kind of content is created with JavaScript. So in particular, is JavaScript required to show the content that's important for your page, or is JavaScript essentially just an enhancement of the page? So for example, if you add extra functionality, like fancy buttons, fancy transitions between different states using JavaScript, then maybe your primary content is available in the normal HTML, and you don't really need to worry about JavaScript. Another really common situation is you have your primary content in HTML, and you use JavaScript to pull in advertising. And again, your primary content is there, and JavaScript just does some additional work to kind of make it appear. And that's perfectly fine too.

Then you shift more towards kind of the JavaScript-based website situation, where if you turn JavaScript off in your browser, then you don't see any of the content. Maybe you see kind of a structure of the page, but none of the content. Or maybe you see nothing at all. Then that's the situation where the page that you're looking at requires JavaScript in order to show the critical content on the page. And at that stage, you're in the situation where you need to think about what you need to do to make this page work for search engines. The first thing to keep in mind is that Googlebot can execute a lot of JavaScript. We're pretty good at that. And it might be that the JavaScript that you use on your page works perfectly fine in Googlebot. So a simple way to test that is just to search for some of the text and to see if Googlebot is able to bring up those pages. If those pages show up, then probably things are working well. If those pages don't show up in search, then it might be that we have trouble executing the JavaScript there. And there are two approaches you can take here. One is to go down the dynamic rendering route. The other is to figure out what is blocking Googlebot, maybe it's something simple like a JavaScript file blocked by robots.txt, and to try to fix that so that Googlebot can render those pages itself. That's kind of the first step: between dynamic rendering or letting Googlebot do it.

The next part is that, for a large part, when Googlebot has to render a page, it takes a little bit longer for us to pick that up and put it into our index. So first, we index the static HTML. And then, at a second step, a little bit later, we do the rendering to render the page, to look at all the links that are on there with JavaScript, and to process the page like that.
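As a rough sketch of that distinction, here are two simplified, hypothetical snippets. In the first, the important content is already in the static HTML that is indexed in the first step; in the second, the content only exists after the script runs, so it is only picked up in the later rendering step.

```html
<!-- Primary content in the static HTML: indexed in the first step -->
<main>
  <h1>Product name</h1>
  <p>The full product description is right here in the markup.</p>
</main>

<!-- Primary content generated by JavaScript: only visible once the page is rendered -->
<main id="app"></main>
<script>
  document.getElementById('app').innerHTML =
    '<h1>Product name</h1><p>Description injected by JavaScript.</p>';
</script>
```

In both cases the static HTML is indexed first, and anything that only appears after rendering follows a bit later.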
For most websites, that's perfectly fine. If you have a website that changes very quickly or that needs to be indexed as quickly as possible, for example, if you have a news website, then that might be a bit problematic. If Googlebot takes a week, or even a couple of days, just to actually render the page, then probably that's not something that's so optimal in your situation. And that would be another situation where you might go into the dynamic rendering realm. And with dynamic rendering, you do add a lot of complexity, because you have to do the rendering part yourself. So this is not something that I would do as a trivial thing to just turn on and off and try out. But rather, it's something where you really need to think about: where does that play in with my website? Is that required for my website? And if it is important for my website, how do I set that up? How do I monitor that what is happening with dynamic rendering is actually really the content on my page, and not something breaking on my side now, rather than on Google's side? So there are lots of questions there. Some of these are quite technical to figure out. So if you're an SEO and you don't have that much experience with JavaScript, it's really important to work together with your developers. I think that's one of the themes of probably the next year or so that will come up more and more: as an SEO, you need to be able to speak to developers, and you need to be able to explain what the issues are and to listen to them, to also guide them to implementing something that works well for the website and that works well for search as well.

John, can I just ask a follow-up? About the advice you just gave there, I'm not a technical person. So is that for all versions of Java? We've got a partner that's building a site in React Java, and they're having indexing issues. Can you now read that as well as basic HTML? They're going to have a lot of content that's going to be updated with reviews and stuff fairly quickly. But we're still at an early stage. So is React just as good as going the HTML route or not?

Is it just as good? That's always kind of a tricky question. It depends a bit on how you set that up. But I think for a website like yours, where you do have a lot of content, where you have a lot of reviews coming and going, I could imagine that doing some kind of pre-rendering, like with dynamic rendering, would make a lot of sense. It makes it a lot easier for us to crawl through your pages to pick up the content as quickly as possible. So if you're using React or Angular or any other JavaScript framework, then that would be a situation where I would look into dynamic rendering. I think for both React and for Angular, there are pre-built setups already that you can just activate and use that work fairly well. So that's something where I assume that these developers are on top of the game. They'll know about these things. But it's also something that you can point out to them. OK. Thanks.

Hi, John. Hi. Yes, it works. Right, OK, perfect. So I tweeted you the other day. And what we've seen since some of the recent updates, we're a medical company, so we've got about 3,500 doctors in India, and Poland, and Ukraine, and various other countries. So what we found recently is that in the results for a city plus specialization search, some of the ranking review websites are now taking up three or four of the top places in the results.
And it's more feedback than a question, because they're basically all the same content, kind of spun four times. And we're finding this in India, and we're finding it in Poland as well. So it's more some feedback that we're noticing this, and it's actually not making for a good user experience.

I think in general, I wouldn't say that's a bad thing, but it really depends on the specific situation. So what I would do there is, if you can send me some explicit examples where you say, I'm searching for this, and it's a fairly generic query, ideally, and Google is showing me these results, and these are bad because it's all the same, or it's just a shuffling around of the same thing, then that's a lot easier for us to take to the team and say, hey, look, this is clearly wrong, we need to fix this. So if you can send me a couple of those examples, you could do that by Google+, or post them here in the chat if you want, then that's something I can take back to the team. Whereas if you just say, well, review websites are taking up a lot of space, then if I go to the team, they're like, oh, I don't know what that means, because sometimes review websites can be pretty good as well. They can be useful. So as specific as possible. If you have examples, that makes a lot of sense and makes it so that we can really go to the team and say, hey, look at this search query, this is clearly a bad result, we need to figure out a way to improve that.

So I sent it to you the other day on Twitter with some screen grabs and search terms. All right. I'll send it again for you. Cool. OK. I'll try to dig that out. I'm not saying it's a bad result. It's just basically all the same content spun in a different way, because it's taking up the city search plus all the districts. So in a way it's kind of good, but it's all the same data, basically, for the users.

I think those are always kind of tricky situations, where it's like, well, it's not completely wrong, but it's not great. But maybe you have some really, I don't know, clear examples that I can pass on to the team. I'll try to dig that up on Twitter. Twitter is always a bit crazy to monitor. Yeah, too much, yeah, sure. Yeah.

But also relating to the last updates as well, all of our content is written by our three and a half thousand doctors, and we've taken a little bit of a hit, and it's not too major. But we've seen that we've lost out, where really quality content written by our doctors has lost out to something that we would consider unprofessional and kind of non-quality content. And I can send you some examples of that as well. That would be fantastic, yeah. Cool.

All right, back to some of the submitted questions. Is hreflang an indexing signal like canonical? Is there a problem when all hreflang pages of a set have a noindex? It happens when our product is sold out and there is no new stock within six or eight weeks. So we do try to use hreflang as a very small signal when it comes to picking a canonical, because we need to have the hreflang link between canonical pages. So if you tell us this is between these two pages, and we otherwise wouldn't pick those pages as canonical, then that hreflang link kind of disappears. So we use it as a very small signal in canonicalization. However, you really need to kind of back that up with all of your other usual canonicalization signals. So the rel canonical, internal linking, sitemap files, external linking, all of these things ideally need to align.
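For instance, on a pair of language versions that should be connected by hreflang, the canonical and hreflang annotations would all point at the same indexable URLs. This is a minimal sketch with made-up URLs.

```html
<!-- On https://www.example.com/en/product/ -->
<link rel="canonical" href="https://www.example.com/en/product/">
<link rel="alternate" hreflang="en" href="https://www.example.com/en/product/">
<link rel="alternate" hreflang="de" href="https://www.example.com/de/product/">

<!-- On https://www.example.com/de/product/ -->
<link rel="canonical" href="https://www.example.com/de/product/">
<link rel="alternate" hreflang="en" href="https://www.example.com/en/product/">
<link rel="alternate" hreflang="de" href="https://www.example.com/de/product/">
```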
And then we'll more likely pick those URLs as canonicals, and then we'll be able to do the hreflang link connection between them. The other part, with regards to noindex: if a page has a noindex, then we wouldn't pick that as canonical in general. So we wouldn't use that for the hreflang there. What happens in a situation like this, where you have some pages within a set that are noindex and others that are indexable, is we will just focus on the ones that work. So it's not that the website will be seen as lower quality, or that we will drop all of the hreflang links from that website or from that set of pages, but instead we'll just focus on the ones that work and kind of ignore the ones that don't work for us. And usually that works out fine. So if, in your case, a product is sold out and these pages have a noindex and an hreflang on them, we'll see the noindex, we'll drop the page out of our index, and we won't show it in the search results. And if we don't show it in the search results, we don't have to worry about the hreflang connections. So that shouldn't be a problem.

For Google Images, is there a difference between the optimal aspect ratio for mobile versus desktop search? What is the optimal aspect ratio? We don't have an optimal aspect ratio for either of these. I think it really depends on your images and what you're trying to present there. So it's not the case that we would say that an image that has the right aspect ratio will rank higher. We look at a number of things when it comes to Google Images. We have, I think, an image publishing guidelines document that's pretty comprehensive by now and covers a lot of the different aspects that are involved. So I'd definitely take a look at that. Since you mentioned mobile versus desktop, one thing I do want to mention is that as we shift over to mobile-first indexing, where we use a smartphone to index pages, one of the problems we have seen is that some sites that serve different content to smartphones don't use the alt text for their images. And this is a big problem for Google Images, because we use the alt text as a really strong signal to understand the image a little bit better. So that's something where I'd recommend double-checking that your smartphone versions definitely also have an alt text. And again, the aspect ratio is generally not something that I would worry about there.

One place where the aspect ratio does come into play, maybe I should mention this just to be complete, is specific types of structured data, depending on how you want your images to be shown. If, for example, it's part of a recipe, or part of maybe an AMP article that you have on your page, then for some types of structured data and rich results there are suggestions with regards to the optimal aspect ratio that would work well there, and also with regards to the minimum size, so the number of pixels that you have in your image. So if you're aiming for a specific type of rich result in our search results, then I'd double-check the guidelines there and make sure that the images that you're providing match that format, so that we can pick those up and use them optimally. But that's not for Google Images. That's really just for the rich result in the normal web search.

All right. If an unnatural links manual action comes back after two months, and then after two months it comes back for the blog subdomain, should we file two separate reconsideration requests in this case? Obviously, the blog has different pages and many links.
If you have two manual actions on your site, on completely different subdomains, I would submit the reconsideration requests separately. Also, if these are on separate hosts, then make sure that if you're using a disavow file, you update both of those files accordingly, that you take the problematic links that you weren't able to resolve, and for the www version, put those in that disavow file. And if you have a blog subdomain, then do it for that one as well. So the disavow file is really per host name, or per site that you have in Search Console. So I would update that there. You can, of course, use the same disavow file for both, if you just collect links in one big bucket and you go through them like that. That's perfectly fine as well.

Can we return a 503 response code for Googlebot and a 200 for users for one day, to ensure that things are fine in the case of a migration or a change of site structure? Yes, you can do that. I generally wouldn't recommend doing that, though, because then you're kind of masking the problem by hiding it behind a 503. So if this is a situation where, for one day, you need to return a 503 because you have technical issues, or you have some problems where you're migrating the server from one place to another, then that's perfectly fine. We'll come back and try to recrawl those URLs later. We won't treat that as a permanent error. If we see the 503 for a longer period of time, we will see that as a permanent error, and we may start dropping these pages from our index. And a longer period of time is usually somewhere around a couple of days, maybe a week or so. It depends a little bit on the website itself. So if this is really a longer period of time, then we might look at that and say, well, this error is not going away. Maybe we need to take care that users don't actually accidentally go there and land on the same error. So that's kind of what I would watch out for there. With regards to serving Googlebot a 503 and users a 200, that's fine as well. That's something, again, I would do only for a very limited time. And for most cases where you're doing a site migration, I would try to serve Googlebot and users the same thing, because it makes it easier in your log files to figure out: is this migration actually working well? Or are there maybe redirects missing? Are there errors that are showing up for individual URLs? If you're conditionally serving errors automatically to certain user agents, then that makes it a lot harder to track the progress there. But if for whatever reason you need to take down your site for a day, or you need to block things for a day, a 503 is definitely the right way to do that.

Is having multiple sites on the same platform or shared hosting detrimental to SEO? Could this result in a manual action or algorithmic penalty? No, that's perfectly fine. That's something that's extremely common. It's not something where our systems would say you need to do something special to avoid that. For example, if you're using Blogger, then obviously all of these blogs are using the same platform. And even if there are some blogs on there that are spammy or low quality, that doesn't mean that your blog is necessarily spammy or low quality. Similarly with WordPress, there are tons of WordPress sites out there that use the same templates, that host themselves or that are hosted on shared hosting. That doesn't mean that these sites are all problematic, or that they're lower quality, just because they're using a shared infrastructure.
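For the temporary-outage case discussed above, a minimal sketch of what the response could look like is below; the Retry-After header is an optional hint for when to come back, and the values here are just examples.

```
HTTP/1.1 503 Service Unavailable
Retry-After: 86400
Content-Type: text/html

<html><body>Temporarily down for maintenance. Please check back soon.</body></html>
```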
So using a shared platform or shared hosting is certainly not a problem there.

On my Search Console, I see a lot of links that are soft 404 and they're excluded from crawling. However, these are actually real pages. My website has an archive of things that are no longer there. My website catalogs things in the real world. It seems everything in the archive is being excluded from crawling. How can I have Googlebot index these pages? Here's an example.

So I looked at the example. I think this is a pretty cool thing to track and have on a website. I can, however, see how our algorithms might look at that and say, well, this is kind of like an out-of-stock message that you're giving us. So in this case, it says "archived" on top, and "this attraction no longer exists" in large, bold letters. So I could see our algorithms looking at that and saying, well, you're telling us this content no longer exists, so we will follow your lead and just drop it from our index. So that's something where I could see our systems getting a little bit confused there. I think this is something we could probably do a better job of internally. But it's probably also something that could be handled a little bit differently on your side as well. I don't know how much of your content is set up like that, where it no longer exists. But maybe there is a way to frame it so that it doesn't come across as, I don't know, an out-of-stock message, essentially. So what I'll do here is I'll definitely pass that on to the team so that they can take a look at that. It might be that they have a way of excluding these pages specifically from the soft 404 detection there. But I would also try to think about ways that you could highlight the situation with regards to that particular attraction, that particular thing that you're cataloging on your website, so that it doesn't come across as something that's deleted, that's no longer there.

It turned out our website is not compliant with GDPR. I think we looked at this one briefly before.

If Google crawls page A of a website and stores a cache of it, but then the site rolls back the page to a version from 48 hours before, and Google recrawls that page and notices it's an older version, would that send any negative signals to Google? No, that wouldn't send any negative signals to Google. What would happen here is we would just index the other version of the page again. And for us, it's not a case of us seeing one version of a page and then noticing it's going back to a previous version. But rather, what we see is it's serving one version of a page, and then it's serving a different version of the page. So we'll go off and index that. And maybe it'll serve another version of the page a day or so later again, and we'll go off and index that version when we have a chance to crawl that. So it's not the case that we would try to figure out, like, oh, it's going back in time, we need to treat this page differently. It's more that, well, the content is changing, so we'll reflect the changed content in our index. And usually, if that's a matter of shifting between templates, for example, and the primary content is the same, then that's perfectly fine. So that's not something where I'd say you need to do anything particular there. For the most part, however, I would try to avoid this situation. So I'd try to avoid having it such that your page is, for technical reasons, kind of shifting all the time. But rather, try to have it such that your pages are kind of stable.
And if there are elements on the page that are dynamic, like a sidebar, or a footer, or related articles, those kinds of things, that's perfectly fine. But the primary content should ideally be kind of the same, so that when we index this page, we can kind of understand that what users will see will reflect what we've seen for the indexing as well. But I think in your particular case, where you're rolling between different templates because you have a technical issue on your server or something like that, that generally wouldn't be a big problem. I mean, it sounds like you're trying to avoid this anyway, so you're probably on top of this.

We have a news website with a ribbon above the fold, which contains the most recent articles. Since these articles are always being replaced after a few hours, all of our pages are indexed with the ribbon content. As a consequence, our pages end up ranking for queries related to keywords which can't be found in the content of the pages. What would you recommend?

So I see two aspects there. On the one hand, this is something where you could use technical means to prevent this ribbon from being crawled and indexed. So theoretically, if you wanted to, you can use various ways of blocking that content from being seen by Google. You could use JavaScript that's blocked by robots.txt, or an iframe with the iframe content blocked by robots.txt. All of these things are things that you could theoretically do. In practice, you generally don't need to do that. So in general, we should be able to pick up the primary content on your pages properly by recognizing that this ribbon is part of the boilerplate on your page, and we should be able to kind of ignore that. So if we're not showing your normal articles for the primary keywords on those pages, then that sounds more like we're actually having trouble picking up those articles and recognizing the normal content on there. So that's something where, if you're seeing that we're showing the wrong pages in search and you do actually have good pages to match those queries, then I would dig into more of the technical side of why Google isn't able to recognize these pages. On the other hand, if you're doing something like a site query and, in addition to the actual pages, you also see other pages showing up for those queries, then that's not something I would worry about. So again, if for the normal searches for your content we're showing the right pages, then that's perfectly fine. I would not worry about other pages also ranking. If, on the other hand, your normal pages are not ranking at all, and just random other pages are ranking because of this kind of shared dynamic element on the page, then that sounds more like a technical issue, that we can't actually pick up the primary content of your normal pages. So that's what I would try to differentiate there. And again, there are technical ways that you can block this kind of dynamic content from being picked up, but for the most part, you shouldn't need to do that. And if you feel that you do need to do that, then usually there are other, bigger problems that you need to solve instead.

When Google tries to determine a canonical, does Google also take into account the boilerplate with the navigation and all links, or just the content without the boilerplate? I'm asking because we had a big navigation, with more words in the navigation than on some content pages.
After we changed the navigation to a smaller one, we lost many canonicals. We take into account a number of things when we try to figure out whether or not pages are the same with regards to being chosen for canonicalization. So the easiest way, of course, to recognize that they're the same is if they're really, one to one, exactly the same. So that's kind of the simplest approach there. A lot of times there is dynamic content on these pages. And sometimes that's in the menu, in the navigation, or in the sidebar, or in the footer somewhere. And because of that, we do try to focus on the primary content of the page, and we try to use that as a means of recognizing which of these pages should be seen as being equivalent. So that's something where we try to focus on the primary part as much as we can. When you change your site's navigation, when you change the boilerplate significantly, then sometimes it takes a bit of time for us to pick that up and to recognize, oh, actually this thing here on the side is part of the boilerplate and not a part of the primary content. And that's something that sometimes can take a little bit of time. But with regards to kind of the last part of the question, after we changed the navigation, we lost many canonicals: that sounds more like, I don't know, a different problem, where if we're not indexing these pages at all, then that seems more like something that you could dig into separately. I don't think that would be related to the boilerplate or kind of the canonicalization in general.

Recently, due to server coding errors, faulty blank pages were served to Googlebot over one and a half days, and that resulted in a significant number of soft 404s. And consequently, many pages were de-indexed. The technical problem has since been fixed, and the pages are now serving correctly. Will these pages get recrawled and re-indexed eventually? How long does that take?

So yes, these pages will come back. This is something that our systems do for soft 404s, for normal 404s. We go back and recrawl those pages every now and then to make sure that we're not missing anything. Sometimes people complain about this, but I think in your case that's kind of useful, because we understand that sometimes things go to 404 and then they come back again at some later stage. If you have a technical problem, we should be able to deal with that and get those pages back. If you want to get those pages back a little bit faster, one thing you can do is a sitemap file with the last modification date that matches when you made those significant changes on those pages. So in your particular case here, when you started serving normal content again for those URLs: with the last modification date, we can try to prioritize crawling of those URLs and get those back a little bit faster. Another thing that you can do, which is probably even faster, is within the new Search Console, when we recognize issues like a noindex or empty pages, then in the indexing report, I think it's called, we will often flag these as errors. And you can go there, and you can say, I fixed this problem, and have Search Console verify that. And in that case, what Search Console does is it will check a sample of the pages from your website to see if this problem is really resolved. And then, based on the results of that sample, Search Console will go off and quickly try to recrawl as much as possible of the affected URLs and reprocess them as quickly as possible.
So in particular, if you had a technical issue on the site like this, and Search Console recognizes that, then this is probably the fastest way to get those URLs back into the index and back into kind of a clean state. So I'd definitely check that out as well.

Just on that, John, it's Aran. So we've kind of done some noindex, nofollow on some of the older pages that we don't want indexed anymore, to give priority to other pages. How quickly would you feel that those changes would come through, and what is the best approach to kind of change this? Because we're seeing stuff from a long time ago on one of our sites where we've changed the noindex, nofollow, but we're still seeing it in the index, and it's several months after we've changed this. Do you think there's more of a crawl problem there, or is there something else?

I think the hard part here is always that we don't crawl URLs with the same frequency all the time. So some URLs we will crawl daily. Some URLs maybe weekly. Other URLs every couple of months, maybe even once every half year or so. So this is something where we try to find the right balance, so that we don't overload your server. And if you made significant changes on your website across the board, then probably a lot of those changes are picked up fairly quickly, but there will be some leftover ones. So in particular, if you do things like site queries, then there's a chance that you'll see those URLs that get crawled once every half year. They'll still be there after a couple of months. And that's kind of, I think, the normal time for us to reprocess, to recrawl things. So it's not necessarily a sign that something is technically completely broken. But it does mean that if you think that these URLs should really not be indexed at all, then maybe you can kind of back that up and say, well, here's a sitemap file with the last modification date, so that Google goes off and tries to double-check these a little bit faster than otherwise. OK, thanks.

All right. I think I'm kind of getting to the bottom here. Oh, there's something in the chat. That's great. When I first create a page and submit it to Google Index, sometimes I see it in the search results as the tag name, and it says "archives" after the tag name. And it does not really go to the direct page that the information is on. It just goes to the tag archive with that name. I put the tag of what the topic is on the page. You can click it, and it will go to the page, but not directly, since Google is picking up the archive, not the actual page. Is it better to not use a tag so the archive doesn't show? Or will Google find the right page after a while? I've not noticed it before, but I've noticed it lately on some pages.

I think what you're seeing here is kind of related to what we just talked about, in that some pages we recrawl fairly quickly, and some pages take a little bit longer for us to pick up and crawl. So probably what is happening is that these tag pages, these category pages, are ones that we recrawl a little bit more frequently, because maybe they're linked prominently within your website, or we've seen in the past that the content there is changing fairly quickly, so we need to keep up with that. So it might happen that we pick up these tag pages, category pages, fairly quickly, and the pages that are linked from those category pages come in a second step. So first, we update the category page, and we see, oh, there's a link to a new article here.
And then we'll follow that link and go off and crawl your article and try to index that as a second step. So that's, I think, kind of the normal approach to how these things update. What makes sense for a lot of sites as well is to try to link the newer content a little bit more prominently within the website. So instead of just having the category pages being linked prominently, maybe also highlight the new content that you think is important. For example, what a lot of people do is, on their home page, they have maybe something in the sidebar saying recent articles, or maybe important products, or something else to kind of highlight: this is something new, and I'd like to show it to a lot of people. And if you present it in a way that you're saying, well, this is new and important, then Google, when we crawl those pages, will also say, oh, this looks new and important, we will follow this link and try to index that page a little bit faster. So that's kind of what I would recommend doing there. First off, I don't think anything is broken with the way that it is now with your website. But if these new pages are things that you find important, then make sure that you flag them as important to Google as well, so that we can go off and find those and crawl those and index those as quickly as possible, too.

All right, let's see, in the chat. If we canonicalize our German, Austrian, and Swiss e-commerce pages to the German domain, should we stop sending sitemaps for the AT and CH domains? Probably, probably. So for canonicalization, we use the rel canonical, we use internal linking, and we also use sitemaps. So if you're telling us, on the one hand, the Swiss page should be canonicalized to the German page, and on the other hand you're saying, well, here is a Swiss page in a sitemap file, then we're kind of in a conflicting situation, in that we see the rel canonical, but we also see, oh, you're also submitting the Swiss page directly. Do we index both? Should we kind of fold them together? Should we pick the Swiss page, because you're sending that in a sitemap file? Or should we pick the German page, because it has a rel canonical? It's kind of a conflicting situation. So the clearer you can make it, so that we understand exactly what you want to have done in the search results, the more likely we'll be able to follow that.

We have a project built on Magento, but we have some dynamic content brought in via Java. But we know there's latency on this. How would you suggest we can deal with this? We're working on the latency issue. But is there some way that we can separate the JavaScript from the rendering that Google crawls? I think it depends quite a bit on what exactly you're trying to achieve there. So maybe first off, there's a really common confusion between Java and JavaScript. JavaScript is something completely different than Java. So when you're talking with developers? Yeah, I think if you're talking with developers, it makes sense to be as exact as possible there. I don't really care too much about the difference there. The other thing to think about here is what is really the critical content on these pages?
And if the critical content is in the normal static HTML, and you're using JavaScript to add additional enhancements, to add maybe extra flavor, and maybe in a sidebar you have something that's not critical but that's useful for users, then that's something where maybe it doesn't really matter too much if there's a time delay between us picking up the JavaScript part and the rest of the page. So we'll still index the normal static HTML as quickly as we can crawl it. So if there's something critical on there, we'll pick that up right away. If there's something in the JavaScript that generates additional content, it might take a few days for us to generate that and also see that indexed. But it's not that the JavaScript would slow down the indexing of the rest of the page. It's essentially kind of separate there.

OK, because it's around bringing in appointment data from a more legacy kind of system. And for the whole page to render fully, it takes a lot of time. It takes longer than what we want it to. And it's about how we can separate that to make sure that page speed is faster.

Yeah, with regards to speed, there are probably different optimizations that you can do, with regards to caching maybe some of this content, or caching the full page in a pre-rendered way. There are different approaches that you can take there to make that as fast as possible. So that's something where it's probably worth trying out a few things and thinking about what you could do to kind of speed that up in general. But from my point of view, if you're sure that the primary content is being picked up quickly, then that's kind of a nice-to-have rather than a required thing.

All right. Wow, I think we made it through all of the questions. That's amazing. Right on time as well. Is there anything else from your side that I can help with in the meantime? No? OK. Well, that's fantastic too. Cool. Then thank you all for joining in. Thanks for submitting all of these questions. I'll set up the next batch of office hours probably later today or early next week. So if anything else comes up, feel free to drop those in there. Or, in the meantime, feel free to also go to the Webmaster Help Forum and kind of ask the folks there for advice as well.

Oh, wow, I just saw a long question. Let me run through that quickly. Let's see. Two pages: a city page with a user-declared canonical, and one without a city with a Google-selected canonical. Is Google picking the canonical on a first-come, first-served basis, or is page duplication the issue, because the pages are almost identical except for the URL? So I think in a case like this, where the pages are almost identical, it's possible that we would choose to canonicalize these pages together. And again, we use different factors for canonicalization. So it's not first-come, first-served. It's more like, which of these two pages that we think are equivalent should we choose to show in the search results? It's also not the case that we will demote a website for this, or that we will show the website lower in search in general. It's really a matter of a technical thing: we think these pages are the same, so which one of the URLs do we show? And we use the rel canonical on the pages, if you have one. We use redirects, if you use them. We use internal linking: which of these pages is more prominently linked internally? We use external linking. We use sitemap files. And to a small extent, we also use hreflang links between these pages to try to figure out which of the URLs should be canonical.
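For the city-page example above, aligning those signals might look something like this; the URLs are hypothetical. The near-duplicate variant points its rel canonical at the preferred page, and the sitemap lists only the preferred URL.

```html
<!-- On the near-duplicate variant, e.g. https://www.example.com/dentists/?city=berlin -->
<link rel="canonical" href="https://www.example.com/berlin/dentists/">
```

```xml
<!-- The sitemap lists only the preferred, canonical URL -->
<url>
  <loc>https://www.example.com/berlin/dentists/</loc>
</url>
```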
And in an ideal situation, you give us all of these signals as clearly as possible. So you say, this is exactly what I want, and then we'll be able to follow that. If you give us conflicting information, where you say, for example, here's the rel canonical, but internally I link to this other page, then that means we have to make a decision there. And again, it's not the case that these pages would rank lower. It's more a case of, well, we have these two, which one of them should we show? So it's usually not critical that we pick exactly the same canonical as you would. But the clearer you have it, the more likely we'll follow your lead, and the easier it will be for you to do tracking and to kind of monitor what is actually happening there. So that's kind of the direction I would head in: it sounds like you have a clear conception of what you want to have happen, so just make sure that you're telling Google that as clearly as possible as well.

OK, so I have one more question. Is having two meta robots, hello? Am I audible? Yes, barely, yes. Yeah, sorry. So yeah, I just wanted to ask, is having two meta robots tags on every page actually creating any issue for Google or for the indexing? Because I have read somewhere that two meta robots tags, like one that says index and follow, and another meta robots tag with the content set to all, that the two meta robots tags are actually creating a problem. But I haven't seen this thing before, so I just wanted to confirm: is it creating any problem?

No, that should be fine. So if these meta robots tags are not conflicting, that's perfectly fine. With the robots meta tags, we take the most restrictive ones that we find, and we use those. So if you have one that says meta robots index and the other one says meta robots noindex, then we will choose the noindex. But if one says meta robots index and the other says meta robots, I don't know, noarchive or nofollow, then that's something where we say, well, these two work together, that's perfectly fine.

OK, because currently, as I am seeing these updates in Google Webmaster, these updates, I was actually trying to explain these things to one of my team members. So that's why I just wanted to know. Thank you. All right. Cool. Then with that, let's take a break here. Thank you all for coming. And I wish you all a fantastic weekend. Thanks very much. Bye. Bye, everyone.