All right. Welcome, everyone, to today's Webmaster Central Office Hours Hangout. My name is John Mueller. I am a Webmaster Trends Analyst at Google in Switzerland. And part of what we do are these office-hours Hangouts, where people can jump in and ask their questions around their website and web search. Looks like a bunch of questions were submitted already. But as always, if any of you want to get started with a question of your own, feel free to jump on it or not. That's fine, too. Maybe something will come up along the way. Hi. Go for it. Go ahead, Sanjay.

OK, so John, actually I have a question. I am seeing that my site's primary crawler is still desktop. It's still not changed. We have a huge number of pages, around 20 million, and a lot of traffic. So it's a bit concerning for me. How do I solve that?

And what is the problem that you're seeing?

The primary crawler in Search Console is showing as desktop. Indexed as desktop.

Yes. Yeah, so I mean, from a practical point of view, that doesn't mean that there's anything broken. We're indexing the site as it is. I think what I would really look into is whether or not the mobile and the desktop pages are really the same with regards to the content, and that the embedded content that you have on those pages is similar. So some of the things we've sometimes seen: when people check the text, if the text is OK, if the structured data is OK, then sometimes there is still an issue that maybe the headings are not actually visible on mobile. If you don't have any headings on mobile, if they're just styled to be more visible text but not using the heading tags, that might be something that we notice. Another thing that's really common is if you have thumbnails that you use as links to related items on your pages. If you have a larger number of thumbnails on your desktop pages than on your mobile pages, then our systems might think that you're missing some important images.
Probably those thumbnails don't matter so much, but it's something where our systems might say, well, there are 20 images on the desktop and there are only 10 on mobile. Maybe there's something that the designer would want to take care of first. So those are, I guess, the more tricky topics that are kind of hard to spot automatically. There's also the issue of sometimes if we think that a page should be a landing page for an image or a video, and on mobile that image or the video is not so prominent, then we might think that actually this is not such a good video or image landing page anymore. And that's also something that our mobile-first indexing systems will look at and say, well, it looks like this would be problematic for image search or problematic for video search. Therefore, we want to be a bit more cautious here. So those are kind of the things I would take a look at. I don't think it's the case that you urgently need to move, or that you will have any kind of a ranking or search advantage if you move or you get your site moved to mobile-first indexing. But it's good to take care of these things.

I'll be sure to. Thank you.

Sure.

Hey, my question is regarding, so I know there is no magic formula for ranking. However, should we be concerned about Search Console's errors and be obsessed with them, or still focus on the user experience or the content on the site as one of the factors?

So how do you mean, focus on the Search Console?

So we are going through a major redesign on the site. And we probably have 3,000 pages or so discontinued. And we are running through a list of 404 errors and all. And then there is also Core Web Vitals showing up a little bit, where there is a CLS issue and stuff like that. So should we be obsessed with it? Or is it like, yeah, eventually it's going to flush out, and kind of hope that things will settle in terms of ranking?
It's really hard to say offhand, because there are so many different kinds of issues that can be reported on in Search Console. And not all of them are immediately critical issues that you need to take care of right away. So for example, if we can't reach your site, if we can't index your pages, that's pretty critical. However, if it's a matter of, well, your page is not as fast as it could be, then that's something where it's OK. It's good to take care of these things, because they are things that users might see. But it's not the same level of criticality as, we can't index your pages at all. So with regards to that, it's something where you almost need to be able to take a step back and think about: what is the effect of this issue that I'm seeing reported? And is that effect something that is critical for my web presence at the moment? Or is that kind of a nice-to-have? Or is that something that's maybe worthwhile cleaning up, I don't know, when you have a bit of downtime and you don't need to worry about urgent issues anymore? So that's kind of the way that I would look at it there. I don't think there is, by definition, any automatic way to recognize which of these issues are the most critical ones. We try to bubble up the things that we think are critical, but there are so many different things that can go wrong. So it's kind of hard to say what the most critical thing is that you should take care of first.

OK, thanks.

Let me look at some of the questions that were submitted. And as always, if any of you have questions along the way, or comments, or anything to add, feel free to jump on in. The first one is: let's say I have two strong URLs about cheese on my website. One is an e-commerce page where you can buy cheese. The other is a complete guide about cheese. So two different pages talking about the same topic, but both really relevant. What's the best practice for internal linking?
Is it OK to link both pages using the same anchor text, cheese, or should one be linked differently? What are some suggestions?

So essentially, internal linking helps us, on the one hand, to find pages. So that's really important. It also helps us to get a bit of context about that specific page. And we get some of that through the anchor text from the internal linking, and some, of course, from understanding where these pages are linked within your website. So with regards to that, and I'm thinking specifically about the anchor text here, I don't think you need to do anything specific there. If you're already linking to those pages, if you're using a reasonable anchor text, cheese in this case, that sounds perfectly fine. I don't think you need to change the anchor text to be "buy your cheese online" here and "the ultimate guide to all types of cheese" there. It's something you could do if you wanted to, if you think it makes sense for your users. But it's not something where I think you would see a visible effect in search. So I hope that helps in that regard. I think it's always kind of a tricky situation when you have multiple pages on the same topic, or multiple pages for the same keyword, in that people often worry about cannibalization, which essentially means that you have multiple pages ranking for the same term, and maybe you would be ranking better if you just had one page ranking for that term. That's definitely something legitimate to think about. And from my point of view, I generally prefer to have fewer pages rather than more pages. But these are completely different topics, completely different types of pages. So it feels perfectly fine, from my point of view, to have two pages that are kind of different like this. And oftentimes you will have this situation where you have informational pages on a topic and maybe transactional pages on a topic, maybe something in between, maybe some category pages that are also on this topic.
And that kind of setup is completely normal. And that's something that our systems try to work with, where we try to show the relevant page in the search results.

OK. The best experts at a country level review our articles from the health section of our website. We want to be sure that not only users, but also Google, appreciates it. What advice can you give us?

So essentially, when it comes to things like this, where you're saying, well, we do a lot of background work to make sure that the information that you get is really valid, it's correct, it's relevant, then that's something that you essentially need to show to users primarily. And our systems will be able to pick up on that over time. It's not that there's a specific structured data element that you can put on a page and say, well, my page is correct, or my information is correct. It's like, you should trust me, I did a lot of work. But really, who you want to kind of persuade is the users. And if we can tell that there's a lot of background information there, if our systems recognize that this is clear to users as well, then that's something that we will try to reflect in search. So from that point of view, I wouldn't focus so much on structured data or meta tags or anything like that, but really make sure that it's clear for users. And then we should be able to pick that up, too.

I think the next question is kind of related. We have a health section on our news website. We want to implement reviewedBy schema markup on the pages. It looks like this property is only for WebPage descendants, MedicalWebPage, for example. Should we implement reviewedBy in the NewsArticle markup that we have right now, or should we change NewsArticle to MedicalWebPage for all articles in the health rubric?

So I think this kind of goes in the same direction. Essentially, you need to make sure that these things are clear for users, and not primarily focus on what search engines would see with regards to structured data.
In a case like this, if you have news content, then I would mark it up as news content. That's kind of the technically correct way to do that with schema.org markup. I believe there are also some elements in Google Search that specifically watch out for NewsArticle markup to recognize that actually this is a news article, which, from my point of view, would kind of make sense. If there is no way to add reviewedBy schema markup to NewsArticle pages, I don't know, then there's no way to do that. So that's something where, if you have news articles and you can't add another kind of markup specifically for something else, then that's something you can't do, purely from the schema.org setup. It might be that there are alternate variations that you can do for this kind of site. I don't think, when it comes to search, that you would see any difference if you kind of implemented the markup in a hacky way or didn't implement it at all. Most probably, our systems just watch out for those elements that we explicitly have documented in the developer documentation, where we have all of the different types kind of documented that we use in search. That's specifically what we watch out for. And everything else is a little bit more like, well, you're providing a bit more context. So in a case like this, I wouldn't worry too much about being able to add reviewedBy markup if that's technically not something you could do when it comes to NewsArticle pages.

Is it against Google guidelines if you cloak but show almost the same thing? Let's see, for context: due to a data contract, I can't show an exact results page due to scrapers, so they have to be noindex. However, I can show cached results. So if you're looking for a four-bedroom house as a user, you'll technically be served a different page than a search engine.

So I think when talking with the engineers or the quality team about topics like this, they generally say, well, it should be exactly the same.
From a practical point of view, there will almost always be differences with regards to what Google sees and what users see, especially when it comes to content that's very dynamic. So for example, it might be that you have a news page, and when Google crawls it, it has one article on top, and then an hour later, when a user goes to that page, there's a different article on top. Technically, it's slightly different content. From a practical point of view, the reason for the page is essentially just the same. So it's not something where I think our systems would watch out for that and say, oh, this is against the webmaster guidelines. With regards to the web spam team, they would also look at this and say, well, it's essentially the same page. Maybe the data that's shown is slightly different with regards to what is cached and what is served live, but it's essentially the same page. So that generally would be fine. Usually, the place where we run into more problems with this kind of a setup is when you have a technical issue on a part of your site that is only visible to Googlebot. It happens every now and then that something kind of severely goes wrong there. Maybe there's a server error showing, or maybe the cache is super stale and it's from last year and not from this week. Those are the kinds of things where, as a site owner, you might not notice them immediately, because when you look at the page, when you use your monitoring tools to access those pages, you see the live version that a user would see. So often, the issues that come up here are more about, well, if you implement it in a bad way, or something breaks along the way and you don't realize it, then you could be shooting yourself in the foot there. It's not so much that, from kind of the webmaster guidelines side or the web spam team, they would look at this and say, oh, the text is slightly different on this version than on that version, therefore we have to take a manual action.
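One practical way to guard against the failure mode described here, where a broken or stale version is served only to Googlebot and you never see it yourself, is to periodically fetch the page both as a regular user and with a Googlebot user-agent string and compare the visible text of the two responses. A rough sketch of the comparison step, with the HTML snippets standing in for the two fetched responses:

```python
import difflib
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def visible_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

def parity_ratio(googlebot_html, user_html):
    """Similarity (0.0 to 1.0) of the visible text of the two versions."""
    return difflib.SequenceMatcher(
        None, visible_text(googlebot_html), visible_text(user_html)
    ).ratio()

# Identical pages score 1.0; small dynamic differences (live vs. cached
# listings) stay high, while a server error page served only to
# Googlebot drops the score sharply and should trigger an alert.
bot_version = "<html><body><h1>4-bed houses</h1><p>Cached results</p></body></html>"
user_version = "<html><body><h1>4-bed houses</h1><p>Live results</p></body></html>"
print(parity_ratio(bot_version, user_version))
```

In a real monitoring job you would fetch one version with a Googlebot user-agent string and the other with a normal browser user agent, then alert when the ratio falls below a threshold you choose, since some dynamic drift between the two is normal.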
I'm getting this error in Search Console: "Referenced AMP URL is not an AMP." What's causing this error, and how can I fix it?

It's essentially impossible to say without knowing the URLs. This is one of those cases where I would recommend going to the Webmaster Help Forum and getting some help from other folks who have a little bit of experience in Search Console. Especially mention the URLs that you're seeing, maybe take a screenshot as well, so that people can take a look and see: is there really an issue here, or is this something where maybe Search Console has a weird error that's a bit confusing? Maybe it's also something where there's an error on our side. And the folks in the Webmaster Help Forum, they can escalate these issues to us. So that's kind of the direction I would head there. My guess, just purely from the text, is that you have a page that looks like an AMP page, but for whatever technical or theoretical reasons, it's not a valid AMP page. And that's something that happens every now and then. It could be that maybe there's a script tag that's kind of lost on the page. Maybe something briefly went wrong with a plugin that you're using. All of these things can happen, but the folks in the Webmaster Help Forum can generally help you to kind of narrow things down.

Is there a concept in Google of types of pages in a site? In other words, on an e-commerce store, for example, would a category listing page be identified? Would product detail type pages be identified? Is there a concept of this, perhaps, for understanding crawling or canonicalization?

I don't know if we'd have an explicit kind of concept for this. It's very likely that we would have different kinds of signals that we try to compile for different types of pages, to figure out how and when we should show this page appropriately.
But I don't know if we would have something kind of as explicit as this, where we'd say, oh, this is a category page on an e-commerce site, and it lists shoes in the category. Or if it's more something that's kind of more abstracted and more useful just for our algorithms.

Part two is: if there is such a thing, does it pay to keep the basic template of common page types similar to help Google to understand? For instance, products in a div with id="my-product-list", or whatever the structure is, common across category pages?

So sometimes that does help us to understand which pages belong together. So that is something that we try to do when we crawl and index a website: to figure out what pages are kind of a part of the same group. And for the most part, we do that with regards to crawling, to understand, well, these are all product pages. I mean, not specifically that we'd say these are product pages, but kind of, this URL pattern looks like a set of pages that are all very similar. And based on that, we can prioritize our crawling a little bit. So that's kind of the direction that we go there. It's less about kind of the ranking side and understanding exactly what page type it is, and more understanding, well, this set of pages belongs together in one group. And maybe we've seen 90% of the pages in this group are noindex, for example. Then if we see new pages that belong in the same group, with a similar URL pattern, then probably we can say, well, it's very likely that these new pages will also be noindex, even without us crawling them. So we could theoretically deprioritize those pages a little bit with regards to crawling. And similarly, if we recognize that actually this group of pages is really important for this website, if we find new pages that fit into that group, then that's something where we can say, well, maybe we should prioritize these a little bit with regards to crawling and pick them up a little bit faster.
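The URL-pattern bucketing described here can be illustrated with a toy version. The normalization rule below, collapsing purely numeric path segments into a wildcard, is a made-up simplification for illustration, not Google's actual logic:

```python
import re
from collections import defaultdict

def url_pattern(url):
    """Collapse ID-like (numeric) path segments so URLs that differ
    only in identifiers fall into the same bucket."""
    path = url.split("://", 1)[-1]
    segments = [
        "*" if re.fullmatch(r"\d+", seg) else seg
        for seg in path.split("/")
    ]
    return "/".join(segments)

def group_urls(urls):
    """Bucket URLs by their normalized pattern."""
    groups = defaultdict(list)
    for url in urls:
        groups[url_pattern(url)].append(url)
    return dict(groups)

urls = [
    "https://shop.example/product/1234",
    "https://shop.example/product/5678",
    "https://shop.example/category/shoes",
]
for pattern, members in group_urls(urls).items():
    print(pattern, len(members))
# The two product URLs share the pattern shop.example/product/*,
# while the category page forms its own group.
```

Knowing that, say, 90% of the pages already seen in one bucket turned out to be noindex lets a crawler deprioritize new URLs that fall into that same bucket, which is essentially the prioritization being sketched above.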
So that's kind of the direction that we go there. It's less about the HTML exactly on the page, and really more about the general bucket of pages. And I think, for the most part, we organize this by URL patterns, where our systems automatically create a little bit of an understanding of the URL patterns within a website and understand which parameters are relevant where. Part of that you see in Search Console in the parameter handling tool.

I started a blog last year following all your tips on that blog. Within a few months, I was getting 800 daily visitors from Google, and that's huge for me. But sad to say, for some reasons, I was inconsistent on the blog for three months. After my inconsistency on the blog, I was getting zero visitors from Google. But after realizing my mistake, I started again in January and migrated to a new hosting provider. And now I'm writing posts and building links daily. I'm still only getting 10 to 30 visitors from Google. What should I do?

So it's really hard to say what exactly you've been seeing. My general sense is that the differences in the traffic that you're seeing from the search results are not due to your consistency or inconsistency on the blog, but rather that's something where our systems have tried to figure out how relevant this page is in the context of the overall web. And that can change over time. And it can be such that you might be getting 800 visitors a day at one point, and our algorithms update, and we understand, well, maybe this isn't as relevant as we thought it was, and you'd get fewer visitors afterwards. It can also happen that our algorithms get updated and say, well, this is really a fantastic page. Why are we only sending 800 people here? We should be sending thousands of people here. And that can also happen. So that's kind of where my guess is: this is more of an algorithm change thing and less a matter of you being consistent or not.
The other thing that I'm kind of reading between the lines here is: you're writing posts daily and you're building links daily. And it kind of makes me wonder what you consider to be building links, for example. It's very easy to get sucked up into kind of the link building world, where essentially you're sending emails to everyone, or you're dropping comments on other people's blogs, or you're trying to find kind of those sites where you can drop your link and it's not nofollow, and everyone goes there, or automatically goes there, and drops their links there. It's very easy to get sucked into that and to spend a lot of time kind of building links, in a sense, but building a lot of links that our systems essentially ignore. And that could be time that you could be spending to improve your content overall, which is something that our systems will kind of appreciate more over time. So that's something where I think it's important that you find a balance between promoting your work and doing your work, but primarily you should be doing your work really well and not spending that much time promoting it altogether.

Let's see. I think the Search Console URL Inspection tool, we talked about that briefly. What makes posts eligible for Google Discover? According to my Search Console reports, my posts were getting featured for some time, but then it completely stopped.

So Discover, I think, is pretty cool in that it shows your content to people who are not explicitly looking for it, but who we think might have interest in that topic. So overall, I think that's pretty cool. On the other hand, that also means that our systems have to be a little bit more cautious with regards to the content that we promote there, because we realize that people aren't explicitly looking for that content. So we should maybe be a little bit more cautious there than we otherwise would elsewhere in Search, with regards to eligibility in Discover.
I believe the Google News publisher guidelines apply to Discover; I'm not 100% sure. I think that's what we link to from the Discover Help Center page. So that's kind of the baseline eligibility. But as I mentioned, we do try to make sure that the content that we provide through Google Discover, or that we link to, is something that we can really fully stand behind. So my guess is, if for a while your posts were being shown in Discover, and maybe they're not being shown as much anymore, then our systems are a little bit more cautious with regards to whether or not this is really fantastic content that we should be promoting in Discover. That also means that probably you just need to step up a little bit more to kind of reach that target again. It is really hard to say what exactly you should be doing differently, because there's no query assigned to Discover. But it is something where working with your audience, the people who are going to your site already, and figuring out what it is that really interests them, and how you can provide that in a way that gives a lot of value to everyone who looks at that content, that's something you can try to tweak over time.

Is using schema markup to link between related schemas on the same page, or between schemas on pages of the website, basically creating a network just like internal linking? Does it have any benefit for Google's understanding of a website, or any benefit for SEO?

I don't think so. So primarily, when it comes to structured data, we try to use the structured data that we can use for specific search features. We have that explicitly documented in the developer documentation. So that's kind of our primary use case for structured data. Additionally, we do try to look at the structured data that's otherwise on the website to understand things a little bit better. Is there any context that we're perhaps missing here? Is there any additional information that helps us to better understand how this particular page stands within the website?
But at the point where you're kind of referring from one type of not directly supported structured data to other pages with not directly supported structured data, at that point, I think you're probably in the area of, well, our systems see this, but we can't really do anything useful with it. Even if we can understand the relationships between those specific items that are marked up with structured data, it's probably so vague that it's really not something that would have any visible effect on ranking at all.

Do you think that errors on a domain might affect the web positioning of subpages, for instance, in relevance?

It is really hard to say without knowing what kind of errors you're looking at there. So when it comes to technical errors, for the most part, I don't think that would apply. If you have some pages that accidentally 404, I don't think that would apply at all to anything else on the website. The one exception that I can think of is if it looks to us like this domain is not in use at all, then that might be something that would apply to the other subdomains there. So for instance, if the www and the non-www versions of the domain both just don't resolve to a normal web page, then our systems might think, well, maybe this website went down. Maybe it's no longer accessible. And at that point, they might kind of apply that to the other subdomains as well and say, well, it looks like the whole domain is gone. Maybe we don't need to index all of these subdomains either. But it's an extremely rare situation that I ever see something like that. So that's usually not something that people worry about. Because if it looks like your website is down, then that seems like a more general problem than, how does it affect my relevance in the search results?
Other types of errors, like, for example, if you have structured data issues on a website, if you have issues with regards to speed, or with regards to even simple things like, I don't know, grammar or spelling on a website, technically those are errors. But that's not something that would affect the other pages or the other subdomains on the website. So that's generally not something that I would see as being an issue. So if you have a setup with different subdomains, obviously making sure that your website doesn't look like it's down is kind of a good thing. But otherwise, I wouldn't worry too much about how you manage the main domain versus the subdomains.

What could be the cause behind Google activating a website's review snippets enhancement, followed by massive upwards and downwards fluctuations in clicks and impressions, and a deactivation one month later? We've been experiencing such quick switches since last year, after having review snippets activated constantly for years.

I don't know. It's hard to say from just this question. It sounds like there might be some kind of a technical issue involved here. It might also just be that our algorithms are a bit on the edge with regards to how to treat this website. But if you've been having this kind of on-and-off situation, kind of on a weekly or monthly level, then that feels kind of weird, like something that probably shouldn't be happening, where it might be useful to have your domain or the queries where you're seeing this. What you could do is maybe post in the Webmaster Help Forum to see if other people have anything obvious that they can spot there. Otherwise, you can also send me a note on Twitter, and I can take a look at that too.

I'm doing SEO, on- and off-page, for some packers and movers websites, approximately two months. Backlinks and domain authority have increasing right now. I don't know. "Have been increasing," I'm guessing. But pages are not ranking. What should I do to rank all of my pages?
So I think, first of all, we don't use domain authority in search. I think that's kind of a misconception that's out there that comes up every now and then. I think it's really cool to have these kinds of tools out there that help you to understand how your site kind of fits in with regards to the rest of the web, but we don't use domain authority. So it's not something where I would focus on domain authority and say, this is what I need to do, but rather think about your website as a whole and think about what you should be doing overall to kind of improve your website. The same thing applies to backlinks here, as I mentioned before with the other case of someone who was just building backlinks. It's something where it is very easy to get kind of pulled down into this world of building backlinks, where you have to drop your links everywhere to increase the number of links to your website. And that's definitely not what our algorithms are looking for. If you just drop random links on other people's blogs, or drop links in forums or in other places where they might get picked up, that's not something that we would look at and say, oh, this is a sign of a high-quality website that we should be showing more. That's not how it works. So that's something where, if you've been working on this website for approximately two months, then I would recommend maybe spending a bit more time on the website overall, especially if you're active in a very competitive area. So I don't know how packers and movers websites are where you are located, but I've seen that as one of the more competitive areas. And coming in for two months and spending a bit of time to build up a website and drop some links in various places and focus on domain authority, I don't think that would be enough to actually make a dent there.
So that seems more like something where you'd have to come up with a long-term strategy and figure out ways that you can differentiate yourself from all of the other competitors that are out there, and really show the additional value that you provide, and make sure that Google and other search engines are able to recognize that additional value. And we would recognize that not by random backlinks that are dropped, or domain authority, but rather by looking at the whole picture.

All right. Can you share any more information about the possible bug with Disqus comments not being indexed? I can see some pages are definitely having their comments indexed, while others aren't. I've seen this for a long time with Disqus, and I thought Google was selectively indexing comments from certain pages based on quality. Martin Splitt explained the other day that there might be a bug. Since many sites use Disqus, it would be great to know if more Disqus comments are going to be indexed moving forward. Martin said this could happen.

Well, I guess if Martin says it could happen, then that seems like a pretty strong case. I personally haven't looked into this. I don't know exactly what is happening there. I know some sites use Disqus comments extensively. Some sites implement them in a way that they're kind of cached within the static HTML that's served with the website. It makes it, I suspect, a bit hard to understand exactly what is happening there, if there are so many different ways that you can kind of embed these comments within a website. So that, I don't know. It sounds like, from this comment and from, I think, the brief exchange I saw on Twitter, that there is something that we could be doing to improve indexing of these comments. But I don't know how big of an effect that would be, if that's something where we can just index a little bit more, or if it's kind of a really big jump where we can suddenly index a lot more. I don't know. We'll see.
I don't know. Barry, you have Disqus comments on your site, right?

Yes, I do. It seems to index fine, so I don't know.

OK.

But Glenn did a blog post months and months ago saying how some sites Google has no problem indexing these comments on, and some sites they don't. And he has a real long story on that. I'm not sure. I haven't looked into it either, but maybe it's just the way that those publishers have implemented Disqus that it's kind of being blocked. I kind of want to block mine, but I'm not sure. I used to block it. Then you guys got better at indexing, so now I have to find another way to block it.

Yeah. That's always a struggle. It's like, a bunch of people come to us and say, how do I get myself indexed? And other people come to us: how do I get it unindexed?

Yeah. Well, you can't win, right? Always something to do.

That's good. Yeah. I don't know. We'll see what Martin comes up with. I know he's been in touch with the rendering and the indexing teams on that, so it sounds like there is something, but I don't know how big of an issue it is.

It was just a weird response, because Martin was like, yeah, maybe there's an issue on our end. So I mean, he basically said there is an issue on our end. Unless he just took a quick look and was like, maybe there is an issue, but wasn't sure. So if you could follow up with Martin and have him tweet something out that would make for a good headline, that would be great.

OK, something that people can comment on, yeah.

Yeah. I think it's sometimes tricky with a lot of these JavaScript implementations, in that when you test them manually, it looks like things are working, or it can look like things are not working. But our indexing systems try to be a little bit more resilient and catch things that even the offhand testing tools don't notice. So it's sometimes a bit tricky in that regard. But we'll see what comes out.
Yeah, I mean, I wrote about it, and then I put some weird long phrase in my comments area, waited, I don't know, 20 minutes, did a search for that weird long phrase, and you guys found it. So clearly, at least on my site, you seem to be indexing it well and fast. So if you could slow that down, let me know. I mean, clearly, it's a sign that the comments on your site are high quality, and we need to pick those up quickly, right? Yes. OK, we'll see what happens there. Stay tuned. Is it normal for the last crawl of a page to be the first time Googlebot has indexed it? I'm using React plus Next.js with server-side rendering, and I think there might be a problem for Googlebot in reaching my site. When I request a recrawl, the page gets a spike in impressions the first day, and then it falls off. This happens for all pages. Am I missing something from a technical point of view? I don't know if there's anything by default that's wrong in a case like this. It's something where, just looking at the relevance or the visibility of a page in search, that can be kind of normal, in that when you publish something new, then suddenly it's very relevant and new for everyone to be shown in search. And then over time, that kind of drops off. So that could be normal. However, it sounds like you're seeing some very specific technical details there, and that feels like something where maybe it would be worthwhile to post some clear example URLs, and maybe even some screenshots of what exactly you're seeing with regards to the date of the crawling, or the last crawl or the first crawl of these pages. So just purely judging from the names that you're using there, React, Next.js, server-side rendering, that seems totally unproblematic. But it is really hard to say what exactly we're seeing there. What I would recommend doing is maybe posting in the Webmaster Help Forum if you think this is a kind of general crawling and indexing type of confusion.
If you think that it's based on your JavaScript setup, so maybe server-side rendering isn't working well, or we're doing client-side rendering for your pages for whatever reason, then we have a special JavaScript Sites in Search working group, which is kind of a semi-private forum that you can just join, and you can ask your JavaScript-site-related questions there. That's something where we tend not to index that content from the, well, I don't think we index it at all from that forum. So it's a little bit more of a private area where you can discuss things that are specific to JavaScript sites, just because we want to make sure that these kinds of newer technologies are indexable reasonably well, too. I posted this in Friday's webmaster central office hours as well, but maybe we can get to it sooner. We're facing a strange issue with the canonical selected by Googlebot. We have two pages that, from our point of view, have clearly distinct content, but Googlebot sees one as the canonical for the other. And then the two URLs, this affects all such similar pages, so weather for a specific location. We initially believed that this was because Googlebot was not reliably fetching the CSS, and we have since inlined the critical CSS, but we're still facing the issue. Are we doing something clearly wrong that we've missed? Could this be because of a shared path in the URL structure? I don't know. I'd need to take a look at the specific setup. It's kind of hard to say offhand. In general, I wouldn't assume that there's a CSS issue that would cause these pages to be seen as canonical for each other. So I'm guessing, just from the URLs alone, that the content there is significantly different. Purely not having access to the CSS would not make us think that the content is the same. So just because of CSS reasons, I don't think we would conclude that these pages are the same.
What I have seen in some situations, especially with sites, I'm guessing, kind of like this, where you have a city name in the URL itself, is that sometimes we see that you can specify any name as a city name, and we'll see essentially the same content. We won't get a 404 for cities that don't exist, for example. And in a case like that, we might think that actually this part of the path, or the parameter that has a city name, is irrelevant for this page, because we can enter random words there and it leads us to the same content. So that's something that we've sometimes seen there. Just to add on that: no, we do 404 cities that are not available. And what led us to believe that this would be a CSS issue is that when we were doing live testing, in some instances we saw a properly rendered page, and in some instances we didn't. And when we were checking our logs on the server side, we were actually not seeing any requests for the CSS from Googlebot. And so this is what led us to believe that this might be a CSS issue. But anyhow, now the critical CSS is delivered with the initial HTML, so it should load properly. And we still can't figure out how to solve this. Still, the majority of our pages are marked as non-canonical, and so we've lost basically all of our search traffic because of that. OK. And these are all city names? Is it? Yeah, these are all, basically, the idea of such a page is that we give you the weather for a specific location. So all of those are going to be city names, and basically most of them are going to be in the region of Austria. What we did find is a very similar challenge that IKEA posted here, where they're also getting improperly selected canonicals. And they seem to have resolved the issue, because you can still find things, but we've never seen any resolution, anything in the forums. Some people have raised similar issues, but we couldn't figure out anything that solves it. OK.
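As an aside on the hard-404 behavior discussed above, a minimal sketch of the pattern (returning a real 404 for unknown city names, so crawlers can't enter random words and conclude that the city segment of the URL is irrelevant) might look like this. The function name and city list here are hypothetical placeholders, not the site's actual implementation:

```python
# Hypothetical sketch: serve a hard 404 for unknown cities so that
# crawlers don't treat the city segment of the URL as interchangeable.
KNOWN_CITIES = {"vienna", "graz", "linz"}  # placeholder data

def handle_city_weather(city):
    """Return an (HTTP status, body) pair for a path like /de/weather/<city>."""
    slug = city.strip().lower()
    if slug not in KNOWN_CITIES:
        # A real 404 status, not a redirect or a "not found" message served
        # with a 200 status (a soft 404), signals that this URL has no content.
        return 404, "Not found"
    return 200, f"Weather for {slug.title()}"
```

The key design point is the status code: a "city not found" page served with a 200 status looks to crawlers like the same thin content at every unknown URL, which is exactly the soft-404 pattern that can lead to URL segments being treated as irrelevant.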
And you have kind of the country or the language code in the URL as well. Do you have the same issue with other languages, or is it just German? It's actually just for German that we see this issue. And it's weird enough, because we are differentiating between DE, which is our main language, and de-DE for German as used in Germany. And this should be proper, actually, if you read the specifications, but it only affects the pages with the URL DE so far, as far as we can tell. But it's a weird issue. Do you have the same content in other languages, then? Or are these the only kind of pages for Germany that you have there? We've basically prepared ourselves to offer multiple languages, specifically in the Swiss region, for example, and that's why we also have this. But really, we are not utilizing it right now. So there are no different languages being offered for these pages. OK. I'll check with the team to see what might be happening there. It might be that our systems are confused with regards to that URL path and somehow thought that maybe this path is irrelevant. But if that's the case, then that's usually something that we can resolve on our side. But I'll check. The weird thing is the canonical makes no sense. It's like a news page, and the other one is a web app page. So they're very clearly distinct. Yeah. Usually what happens in cases like that, where our systems get confused about this kind of thing, is that we try to understand the URL structure. And at that point it's not so much anymore about what is actually shown on this URL, but rather, our systems have determined this part of the path is irrelevant, so we don't even look at it. And then we say, to simplify things, we will fold it together under this canonical. So my guess is that something like that is happening there. And maybe our systems picked that up at some point when maybe it wasn't as clear within the website how these things were structured.
But this is usually something that we can resolve on our side. So I'll check with the team. Thanks. Sure. All right. We're kind of running low on time. Let me see. One last question in the list, and then we'd have it kind of complete, unless someone submitted something new. What would you suggest as a better solution for users and machines: having a permanent redirect of all 404s via a meta refresh of five seconds to the home page, so that users can find a search box and helpful links, such as popular stores and categories, or having a designated 404 page showing that the page no longer exists, with helpful resources? A real 404 page is always preferred. So if you remove content, serve us a 404 page. We can understand that the content is gone, we can reduce our crawling of those URLs, and it's completely fine. Redirecting people to the home page is confusing for users. It's confusing for us as well, because we see that and say, oh, this looks like a soft 404. So we will generally end up still crawling a little bit more, but still treating it as a 404. So you might as well just give us a clear 404 and make that 404 page user-friendly. All right. I'm sure there are some more questions that got submitted along the way. But maybe we'll switch to live questions from you all. So anything that I can help with? Hi, John. My name is Ruchit, and I posted on the channel also, but I just joined later on. So we have an event website which has around 250 million events. And currently, we are facing a huge issue with crawl budget, especially as we have 250 million events. But most of the events are past. So it doesn't make sense for Google, and for us, too. We get most of our traffic on the 5% to 10% of events which are upcoming events. So we want to de-index all these past events. For that, we added the noindex tag to the pages. But what is happening is that every day, we are seeing that Google is crawling three to four million pages.
And most of them are past event pages. So our upcoming events are taking a longer time to get indexed. And it's like, how do we de-index these pages faster, so we don't have to wait for Google to crawl all of these 200 million event pages and then de-index them? OK. So I think there are two things you can do. On the one hand, there is the unavailable_after meta tag that you can use, which tells us ahead of time when this page will no longer be available, or when it will be irrelevant. So that's something you can use going forward. If you have an event that takes place at some point, you can say unavailable after maybe, I don't know, a month later, or whatever you decide. That helps us to understand that these pages are going to go away fairly soon, and it helps us to focus our crawling on other ones. And the other part is, if you just relatively recently added the noindex to the old events, then that's something we just have to process, and that can take a bit of time. I wouldn't worry so much about it, because usually this kind of refresh crawling that we do is something that is lower priority within our systems. So we do try to prioritize the crawling of new pages that we discover on your website, and if we recognize we still have room to re-crawl old pages, we will re-crawl the old pages as well. From an absolute-numbers point of view, it can look like 10% of our crawling is new pages and 90% of the crawled URLs are really old pages. But that's essentially just because the 10% of crawls for the new pages are what we think are the important parts that we should be picking up for indexing. So I would try to let this settle down over the course of maybe, I don't know, half a year or a year. And I assume that at that point we will be able to focus a lot more on the newer pages, and we'll have recognized that the old ones are really gone in the meantime.
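For reference, the unavailable_after hint mentioned above is one of the robots meta tags (it can also be sent as an X-Robots-Tag HTTP response header). A rough sketch for an event page might look like the following, with a placeholder date; the two tags are alternatives, not meant to appear together:

```html
<!-- Hypothetical event page: hint that this URL should drop out of the
     index after the given date (a placeholder); commonly used date formats
     such as ISO 8601 are accepted. -->
<meta name="robots" content="unavailable_after: 2021-07-01">

<!-- For events that are already past, a plain noindex does the same job,
     but only takes effect once the page is recrawled. -->
<meta name="robots" content="noindex">
```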
But even when we recognize that they're gone, because you have so many of these old pages, we will still occasionally try them. So if you look at the absolute numbers in your server logs later on, you will probably still see that we crawl a lot of old pages. But again, that's not because we think these are important pages, but rather just because we know they exist and we just want to double-check. Another thing you can do, to make sure that it's clear to us which pages we should be crawling, is to have a clear internal site structure in that regard, in that maybe you take the really old events and you move them into an archive section of your website that isn't linked so prominently within your website, and you really prominently focus your internal linking on the new events that are upcoming. So make it as clear as possible when we crawl your website: this is what you want us to focus on, and, oh, I also have all of these other old ones here, and you can find your way there, but it's not the most visible part of the website. Thanks, John. That helps. Sure. John, can we talk about updates? I haven't asked you in about two and a half weeks. OK, I don't know. Could you name what's going on this week? The past couple of weeks, June 10th, June 18th today, lots of little fluctuations. Can we call it something? I have no idea what you're referring to. So I don't think we have a core update at the moment. Right. Nothing was confirmed. Danny didn't tweet anything. You're not aware of any tweaks. There seem to be a lot of fluctuations going on. I know Google's always fluctuating, constant changes, yada, yada, yada, but no idea. I don't think it's, I mean, I don't know. There are always things happening within Google, so it's really hard to say. But I'm not aware of anything specific where it's like, oh, we're doing something crazy here, and people might see this kind of thing. My guess is these are just kind of the normal things that are happening. OK. Last chance. No change.
Do you want to name it? I don't even know what I'm naming, if it's like a good thing or a bad thing, so it's hard. No. I didn't see a lot of stuff on Twitter where people were seeing changes, so I don't know. I guess I'll watch out for your blog and see what you say. I really don't know. OK. Hey, John, can I ask a bit of a follow-up question to what Paul spoke about earlier, about how you can sort of look through URL structures and see if pages belong together? So would that mean that something that has quite a flat URL structure might be at a bit of a disadvantage in that respect? I think we would probably still be able to figure out parts of a URL like that. But if you have an extreme case where you have no folders at all, then it does make it a little bit trickier for us to understand which parts are really relevant or not. So for example, compared to something where you have clear URL parameters, where you have city name equals this and language equals that, it's a lot easier for us to understand, oh, this parameter here varies and changes the content. Whereas if it's a really flat structure where you just have dashes in between all of the parts, then it's really hard for us to understand what happens when individual parts are different or don't exist. With parameters, we can drop them and see if they still serve content or not. Thank you. Cool. OK. So maybe we can take a break here. I have the next one in English lined up on Friday. So if there's anything still on your mind, feel free to drop your questions in there. I'll see if I can figure out something about updates that are happening. I don't know if there's anything explicit we'd be able to announce. But who knows? Always something new and surprising. And we'll see what happens with all of those Disqus comments, if suddenly there is like a giant surge of indexing of comments on the web. We'll see. All right. Thanks, everyone, for joining in. And I hope you found this useful.
And hopefully, see you all again in one of the future Hangouts. Bye, everyone. Thank you. Bye.