All right. Welcome, everyone, to today's Google SEO Office Hours. My name is John Mueller. I'm a Search Advocate at Google here in Switzerland, and part of what we do are these office-hours Hangouts, where people can jump in and ask their questions around their website and web search, and we can try to find some answers. As always, a bunch of stuff was submitted on YouTube, so we can go through some of that. But if any of you want to get started with the first question, you're welcome to jump in.

Hi, John. I have a question regarding e-commerce. A lot of e-commerce sites have refined pages, right? Pages that show a certain set of products based on some predefined logic. In my case, I have a client where those pages are very good for capturing long-tail keywords, because they are refined, for example, computers that at the same time have AMD chips, things like that. When people search for a long-tail keyword like "computer, AMD processor", those pages are a very good place to capture them. However, and here comes my question, those pages are somewhat dynamic, because products are assigned to them automatically, right? And you can't put all of the products into the first render of those kinds of pages. So every time a new product is added to the inventory, or a product is removed from the inventory, the products in the first render of the page change. I'll send a sample here in the comment section. It's not my client, because I'm not allowed to show my client's website, but it's a similar situation. When you click into that link, you can see that there are a lot of products on the page, but when you scroll down to the bottom, there's a "load more" button. So people only see the products before the "load more", right? And whenever we add something to the inventory or remove something, the first-render products change. So Google will constantly be seeing different content on that page, which could confuse Google. How do we solve this problem?

That's essentially fine. That's totally normal. I think with e-commerce, with a very busy site, you have those kinds of shifts all the time. With news websites it's similar, in that you have new articles all the time, and when you look at the home page of a news site, there are always different articles linked there. From our point of view, that's fine. The important part, I think, especially with e-commerce, is that we're able to find the individual product pages themselves. So somewhere along the line, we need to have persistent links to those products. That could be on that page, it could be on page two or page three or page four of that listing, something like that. So that's the important part there. I wouldn't worry that the pages change from load to load, because what will happen from a Search point of view is we will recognize there's specific content for this topic on this page, and we'll try to bring queries to the page that match the general topic. And if computer model one or computer model two is shown there, and they're essentially equivalent because they're in the same category of product, then that doesn't really change much for us.
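One common way to satisfy that "persistent links" point is to back the load-more button with plain paginated URLs, so crawlers that never click buttons can still reach every product. A minimal sketch; all URLs and class names here are hypothetical:

```html
<!-- Category page, e.g. /computers/amd (hypothetical URLs) -->
<ul class="product-grid">
  <!-- The first render of products; which items appear here may
       shift as inventory changes, which is fine per the answer above -->
  <li><a href="/products/amd-laptop-model-1">AMD laptop, model 1</a></li>
  <li><a href="/products/amd-laptop-model-2">AMD laptop, model 2</a></li>
</ul>

<!-- "Load more" can be enhanced with JavaScript for users, but the
     underlying href is still a normal, crawlable page 2 of the listing -->
<a href="/computers/amd?page=2" class="load-more">Load more</a>
```

With a setup like this, the visible products can rotate freely while the product detail pages stay discoverable through stable pagination links.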
So what I heard is that as long as the logic for assigning products is consistent, and the products showing up in the first render match, for example, the title tag of the page, then that is fine. But if the logic is not consistent, and suddenly other products show up, then there would be a problem.

Yeah. So for example, if you have a clothing store and you have a category that's just "blue", and the category "blue" has everything from socks to jackets and everything in between, then it's really hard for us to say this is a landing page for this type of product. We will constantly be confused by a page like that. Whereas if it's a landing page about, I don't know, blue jackets, for example, like category "jackets" and color "blue", then it doesn't really matter which jackets you show there. They all match that intent, and it's pretty clear to the user that they fit into that category.

So as long as the newly added products are still blue jackets, even if different blue jackets show up in the first render and the content changes, that's OK?

Yeah, yeah.

But if at some point a red jacket gets added to that page, there would be a problem?

I think individual cases are absolutely no problem. If it's always something random in there, then it gets hard for us to understand the pattern.

OK, thank you so much. Sorry for taking so long.

Hey, John. Thank you. My question is regarding image search, and why one image might be shown preference over another. Specifically, it's on a product page that uses an image slider to display pictures of the product. And considering that pretty much everything is nearly identical, like alt text, file name, nearby text, dimensions, weight, things of that nature, why might a seemingly random image from within the slider sequence, maybe the 3rd or 4th thumbnail, be shown preference over a featured image, usually the first image that you would see on a product page?

I don't know; it's hard to say. We have various things that go into image search. On the one hand, there are the aspects that you mentioned, like the titles of the page, the image file name, the captions, alt text, things like that. But we do also have some logic that tries to understand whether this is a high-quality image or not. And it's possible, I don't know those images, that our systems are either getting confused by the contents of the images, or that they clearly see one image as significantly higher quality than the other, and maybe we would then give it a little bit more visibility like that. But it's something where I think there are always a number of different factors that play into that. And even for multiple images on the same page, which are kind of in the same category of things, it's possible that we show them in one order once, and in a different order another time.

OK. So are you able to comment on this? I imagine that Cloud Vision has something to do with that, trying to match similarities with machine learning to the entities. Am I on the right track here?

I don't know how far we would use something like that. I do think, at least as far as I understand, we've talked about doing that in the past, specifically for image search. But it's something where, just purely based on the contents of the image alone, it's sometimes really hard to determine how relevant it is for a specific query. So for example, you might have, I don't know, a picture of a beach, and we could recognize, oh, it's a beach.
There's water here, things like that. But if someone is searching for a hotel, is a picture of the beach the relevant thing to show? Or is that, I don't know, a couple of miles away from the hotel? It's really hard to judge just based on the contents of the image alone. So I imagine if, or when, we do use machine learning to understand the contents of an image, it's something auxiliary to the other factors that we have. It's not that it would completely override everything else.

Gotcha, thank you.

Sure.

John, just one follow-up on this. Does Google have any plan for machine-learning auto-detection of what is happening, or what is there, in a picture? I'm seeing that various devices already have this kind of feature. Does Google also have any plan to implement this kind of feature, detecting what is happening within the image?

I don't know. Kind of like with the previous question, it's something where it's certainly possible, to some extent, to pull out additional information from an image, which could be the objects in the image or what is happening in the image. But I don't know if that would override any of the other factors that we have there. So my understanding is this is probably something that would be more on the side: if we have multiple images that we think are kind of equivalent, and we can clearly tell somehow that this one is more relevant because it has, I don't know, the objects or the actions that someone is searching for, then maybe we would use that. But I honestly don't know what we've announced in that area or what we're actually using for Search there. Because the thing to keep in mind is that there are a lot of different elements that are theoretically possible, that might be done in consumer devices. There are lots of things that are patented, that are out there, that are theoretically possible. But just because something is possible in some instances doesn't mean that it makes sense for Search. And we see that a lot with patents, when it comes to Search, where someone will patent a really cool algorithm or setup that could have an implication for Search. But just because it's patented by Google, and maybe even by someone who works on Search, doesn't mean that we actually use it in Search.

Yeah. OK. Thank you.

Sure. OK, let me run through some of the submitted questions, and if you have questions along the way, feel free to jump in. We'll almost certainly have time towards the end for more questions from all of you.

All right, the first question is about Google Discover. One of the sites I'm running, about anime, fan art, cosplay, and fan fiction, was performing fairly well in Discover. But from one day to the next, the traffic dropped to zero without any significant change on the site. In Google Search, it was growing before and after that. What kind of problem could cause that situation?

I don't know. It's really hard to say without looking at the site. But in general, when it comes to Google Discover, one of the things that I've noticed from feedback from folks like you all is that the traffic tends to be very on-or-off, in that our systems might think, well, it makes sense to show this more in Discover, and then suddenly you get a lot of traffic from Discover. And then our algorithms might at some point say, well, it doesn't make sense to show it that much in Discover anymore, and the traffic goes away. And especially with Discover, it's something which is not tied to a specific query.
So it's really hard to say what you should be expecting, because you don't know how many people are interested in this topic, or where we would potentially be able to show that. So if you do see a lot of visibility from Google Discover, I think that's fantastic. I would just be careful and realize that this is something that can change fairly quickly. Additionally, for Discover we have a Help Center article that goes into pretty good detail on what kinds of things we watch out for, and in particular what kinds of things we don't want to show in Discover. So that's something you might want to double-check. Depending on, I guess, the site that you have, that might be more relevant or less relevant there. But I would definitely check that out.

What are the levels of site-quality demotions? Is there a first level where everything site-wide looks fine, no demotion; a second level where you demote some pages that are not relevant; or a third level where the site as a whole is not good at all?

My understanding is we don't have these different levels of site-wide demotion, where we would say we need to demote everything on the website, or we don't need to demote anything on the website. I think, depending on the website, you might see aspects like this, or it might feel like that. But for the most part, we do try to look at things as granularly as possible. And in some cases we can't, so we'll look at different chunks of a website. So from our side, it's not so much that we have different categories, where we say it's in this category or in that category; there's almost a fluid transition across the web. And also, when it comes to things where our algorithms might say, oh, we don't really know how to trust this, for the most part it's not a matter of trust is there or trust is not there, like a yes or no. Rather, we have this really fluid transition, where we think, well, we're not completely sure about this, but it makes sense for these kinds of queries, for example. So there's a lot of room there.

Let's see, I have a question about omitted results. We publish two large .com sites, one about horoscopes and one about astrology, each with its own URL and content team. After ranking on the first page for astrology queries for multiple years, in February last year only one of the sites began to show up in normal search results at a time. Whichever site has the highest ranking for a given query will show up, with the other site being classified as an omitted result. There is no duplicate content, and there are no cross-links between the sites. I'm curious why this is happening.

It's really hard to say without looking at the specific sites and the specific situation. Usually with two websites, if they're not completely the same, we would rank them individually, even if there is an ownership relationship there. So from that point of view, it might also just be something that is not related to what you're suspecting, namely that our algorithms think it's the same site and we should only show one of them at a time. I have seen situations where, if there is a large number of sites involved, a large number of domains, our algorithms might say, well, all of these domains are essentially the same content, and we should just pick one of them to show rather than all of them.
But usually if there are two websites and they're unique in their own ways, then that's something where we would try to show them individually. So I think, from a practical point of view, what I would do here is go to the Webmaster Help forums and post the details of what you're seeing, maybe some screenshots, specific URLs and queries where you're seeing this happen. And the folks there can take a look and maybe guide you: if there's something specific that you could be doing differently, maybe they can point you at that, or maybe they can point you in the direction of saying, well, it is how it is; there's nothing unnatural happening there. But also, the folks active in the help forums have the ability to escalate things to Google teams. So if they think this is really weird, and maybe something weird is happening on Google's side, then they can escalate that to someone at Google.

Let's see. Does Google Search consider each URL of a website individually? For example, does a low score on a domain's home page have any effect on the other pages, which have a high score?

So yeah, like I mentioned before, we try to be as granular as possible, as fine-grained as possible, in the sense that we try to focus on individual pages. But especially within a website, you're always linking to the other pages of your website, so there is a connection between all of these pages. And if one page is really bad, and we think it's the most important page of your website, then obviously that will have an effect on the other pages of your website, because they're all in the context of that one main page. Whereas if one page on your website is something that we would consider not so good, and it's some random part of your website, then it's not going to be the central point that everything revolves around. Then from our point of view it's like, well, this one page is not so great, but that's fine; it doesn't really affect the rest.

The mobile section of Core Web Vitals in Search Console shows a bad URL for the original link, while the AMP version of the same URL is a good URL. Why are these two considered separately?

So essentially, what happens there is that we don't focus so much on the theoretical aspect of "this is an AMP page and there's a canonical here", but rather on the data that we see from actual users who navigate to these pages. So you might see an effect where lots of users are going to your website directly, to the non-AMP URLs, depending on how you have your website set up, while in Search you have your AMP URLs. Then we will probably get signals, or enough signals, that we track both of those versions individually: on the one hand, people going through Search to the AMP versions, and on the other, people going to your website directly, to the non-AMP versions. In a case like that, we might see information separately for those two versions, and then we have those two versions and the data for each, and we'll show that in Search Console like that. Whereas if you set up your website in a way that you're consistently always working with the AMP version, so that maybe all mobile users go to the AMP version of your website, then that's something where we can clearly say, well, this is the primary version; we'll focus all of our signals on that version.
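For reference, the pairing between an AMP page and its non-AMP original is declared with two link elements, which is what lets the two URLs be associated with each other. A minimal sketch, with hypothetical URLs:

```html
<!-- On the canonical (non-AMP) page, e.g. https://example.com/article -->
<link rel="amphtml" href="https://example.com/article/amp">

<!-- On the AMP page, e.g. https://example.com/article/amp -->
<link rel="canonical" href="https://example.com/article">
```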
The next question there is: since AMP is enabled, will Google mobile Search consider only the AMP version, which passes the Core Web Vitals test, when ranking the website, or will the original link also be considered?

So, on the one hand there is the aspect of whether it's valid AMP or not. If it's not valid AMP, then we wouldn't show it; that's one aspect that comes into play there. But in the theoretical situation where we have data for the non-AMP version and data for the AMP version, and we would show the AMP version in the search results, then we would use the data for the version that we show in Search as the basis for the ranking. So in a case where we clearly have data for both of these versions, and we would pick one of those versions to show, we would use the data for that version. That's similar, I think, to international websites where you have different URLs for individual countries: if we have one version that we would show in the search results, and we have data for that version, then we'll use the data for that version, even if we have other data for the other language or country versions. The only case I know of where we would fold things together is with regards to the AMP cache, because theoretically the AMP cache is located in yet another place, another set of URLs. But with the AMP cache, we know how to fold that back to the AMP version and track the data there. So that's a little bit of an exception. But if you have separate AMP versions and separate mobile versions on your site, then it's very possible that we would track those individually.

Does Google weigh an exact-match title tag more in comparison to a title tag focused more on users? Let's say the phrase I want to rank for is "Audi A3", and one version of the title tag is this exact match. The other version is "this car for sale, 152 great models". Would this title be scored less relevant for the query "Audi A3" just because it is longer and not an exact match?

I don't think we have any exact definition of how that would pan out in practice. So there's certainly an aspect of, does this title match the query? But we also try to understand the relevance of the query; we try to understand things like synonyms and more context around the query, and around the titles as well. So I don't think there's a simple "exact match to the query is better" or "not an exact match to the query is better" there. My recommendation would be to test this out and just try it, and not so much in terms of SEO, which one will rank better, but rather thinking about which one of these would work better in the search results. That's something you could try out on one page, or on multiple pages that are set up in a similar way. And then based on that, you can determine, well, this one attracts more clicks from users, it matches the intent that the user has better somehow, so I'll stick to that model and use it across the rest of my website. So that's my recommendation there. With regards to the general information-retrieval point of view of which one would be the better fit, I imagine you could get into really long arguments with the people who work on information retrieval about which one is better or not. So I don't think there is one clear answer.

Comments below blog posts:
are comments still a ranking factor? I'm migrating to another CMS and would like to get rid of all comments; there are about one to three not-really-relevant comments below many blog posts. Can I delete them safely without losing any rankings?

I think it's ultimately up to you. From our point of view, we do see comments as part of the content. We do also, in many cases, recognize that a comment section is actually a comment section, so we can treat it slightly differently. But ultimately, if people are finding your pages based on the comments there, then if you delete those comments, obviously we wouldn't be able to find your pages based on them. So depending on the type of comments that you have there, and the number of comments, it can be the case that they provide significant value to your pages, and they can be a source of additional information about your pages. But that's not always the case. So I think you need to look at the content of your pages overall, and the queries that are leading people to your pages, and think about which of these queries might go away if the comments were not on those pages anymore. Based on that, you can try to figure out what you need to do. It's certainly not the case that we completely ignore all of the comments on a site, so if you blindly go off and delete all of your comments in the hope that nothing will change, I don't think that's what will happen.

When using interstitials as product pages, does Google index the content on those interstitials, or does it only index the content on the static pages?

So, I wasn't quite sure how you use interstitials as product pages; it seems like a kind of unique setup. But anyway, I think it's less a matter of interstitials or not, and more a matter of what content is actually shown when we load those pages. If we load this HTML page and, by default, it never shows any product information, then we wouldn't have that product information to index. Whereas if you load that page and it takes a second and then it pops up the full content, the full product information, then essentially by loading the page we have that information, and we can use it to index and to rank those pages. A simple way to double-check what we would be able to pick up for indexing is to take that URL, paste it into something like the mobile-friendly test or the URL Inspection tool in Search Console, and see whether Google is able to bring up the full product information or not. If Google can bring up the product information, then that's probably OK. Whereas if Google only shows you the static page behind it, then probably we wouldn't be able to pick up the product information. So that's one thing to watch out for. I think what threw me off with this question initially is the word "interstitials", in the sense that usually interstitials are something that sit between the content that you're looking for and what is actually loaded in the browser. So if you go to a page, and instead of the product page it shows a big interstitial with something else, that's the usual setup for interstitials. And from our point of view, those kinds of interstitials, if they're intrusive interstitials in the sense that they get in the way of the user actually interacting with the page, would be something that we consider a negative ranking factor.
So if it's really the case that when you go to your pages it just takes a bit and then your product page pops up, then I wouldn't call those interstitials. Maybe use some other word for that, because if you ask around in the help forums or elsewhere and you say, oh, my interstitials, I want to rank for my interstitials, then probably a lot of people will be confused.

It seems like Google Images crawls some SVG files as SVG, and some it renders into PNG when serving them in the search results. What is the reason for that? Is there a way we can dictate this behavior for the Google Images crawler?

I don't know; I wasn't aware of how exactly this happens, so I'm not 100% sure what you're referring to. My understanding is that when it comes to images, especially vector formats like SVG, which don't always have a well-defined size, what we do internally is convert them into a normal pixel image, so that we can treat them the same way as other kinds of images for all of the normal processing internally. And also, specifically with regards to the thumbnail images that we show, so that we can scale the image down using the normal pixel-scaling functions and get it to the right size and into an equal resolution to the other thumbnails that we show. So probably that is what is happening there, and it's not something that you can easily change, because we have our systems set up to deal with pixel-based images, and that's what we would do there. With regards to the next step from there, the expanded image when you click on it in the image search results, I don't know how that would be handled with regards to SVGs, whether we do a pixel-based bigger preview or an SVG-based bigger preview; I don't quite know how we would handle that. If you have any examples where this is causing problems, I would love to see them. So feel free to send me anything you run across in that regard, especially if you see that it's causing weird problems that could be avoided by doing it slightly differently.

Is there anything that we can do in terms of SEO to improve the user journey?

I think those are kind of separate topics. It's not that you would do SEO to improve the user journey; rather, you have your user journeys that you use to analyze your products and find the best approaches there, and then based on that, you would also do some SEO to improve things in the search results. So one is improving things for the user, and the other is improving things for search engines. Sometimes, if things align well, there is enough overlap that they work together. But essentially, they're separate topics.

What is the best way to treat syndicated content on my site? If the content is already on other sites too, do I have to noindex the page, or do I canonicalize to the original source? Do I nofollow all the internal links to that page?

Yeah, good question. I don't think we have exact guidelines on syndicated content. Generally, we do recommend using something like a rel canonical to the original source. I know that's not always possible in all cases. So sometimes what can happen is we just recognize that there is syndicated content on a website, and then we essentially try to rank it appropriately.
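The rel canonical mentioned here is a single link element in the head of the syndicated copy, pointing back at the original article. A minimal sketch, with hypothetical URLs:

```html
<!-- On the page hosting the syndicated copy -->
<head>
  <link rel="canonical" href="https://original-publisher.example/articles/the-original-story">
</head>
```

As the answer notes, this is a recommendation rather than a guarantee; it is a hint that helps consolidate the duplicate versions onto the original.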
So if you're syndicating your content to other sites, then it's theoretically possible that those other sites also show up in the search results. It's possible that they even show up above you in the search results, depending on the situation. So that's something to keep in mind if you're syndicating content. If you're hosting syndicated content on your website, then something similar applies: most of the time we would try to show the original source, and just because you have a syndicated version of that content on your site as well doesn't mean we will also show your website in the search results. So usually what I recommend there is to make sure that you have significant unique and compelling content of your own on your website. If you're using syndicated content to fill out additional facets of information for your users, that's perfectly fine; I just wouldn't expect to rank for that additional facet or filler content. It can happen, but it's not something I would count on. Instead, for the SEO side of things, for the ranking side of things, I would really make sure that you have significant unique content of your own, so that when our systems look at your website, they don't just see all of this content that everyone else has, but rather a lot of additional value that you provide that is not on the other sites.

When we want to rank for a specific topic on Google, is it good practice to also cover related topics? For example, if we sell laptops, or want to rank for that, is it useful to create posts like laptop reviews, introductions to the best new laptops, those kinds of things? And if it's useful, does it have to be done in any special way?

I think this is always useful, because what you're essentially doing is, on the one hand, for search engines, building out your reputation of knowledge in that specific topic area, and for users as well, it provides a little bit more context on why they should trust you. If they see that you have all of this knowledge in this general topic area, and you show that and present it regularly, then it makes it a lot easier for them to trust you on something very specific that you're also providing on your website. So I think that always makes sense. And for search engines as well, if we can recognize that this website is really good for this broader topic area, then if someone is searching for that broader topic area, we can try to show that website as well. We don't have to focus purely on individual pages; we can say, oh, it looks like you're looking for a new laptop, and this website has a lot of information on various facets around laptops.

How long should we wait for a Search Console manual-action response? It's been months now. I avoided resubmitting because that's not nice, but do these ever get lost? If we don't get any replies, what should we be doing next?

Depending on the type of manual action, it can take quite a bit of time. In particular, I think the link-based manual actions can take quite a bit of time to be reviewed properly, and in some cases it can take a few months. Usually what happens if you resubmit the reconsideration request is that we will drop the second reconsideration request, because we think it's a duplicate. The team internally will still be able to look at it.
And if you have additional information there, that's perfectly fine. If it's essentially just a copy-and-paste of the same thing, then I don't think that changes anything. It's also not the case that you would see a negative effect from resubmitting a reconsideration request. So in particular, if you're not sure that the last one was actually sent, like, someone on your team sent it and now you're not sure whether they actually sent it or not, then resubmitting is perfectly fine. There's no additional penalty for resubmitting a reconsideration request; it's just that when the team sees one is still pending, they'll focus on that pending one rather than the additional ones. If you don't see any response with regards to manual actions, specifically the link-based manual actions, I would also recommend checking in with the help forums, or with other people who have worked on link-based manual actions. Because when it takes so long to be reprocessed like this, you really want to make sure that you have everything covered really well. So if you're seeing it take a long time, and you're like, oh, I don't know if I needed to do more or needed to do something different, then going to the help forums is a really good way to get additional feedback from people. And it's very likely that you'll go to the help forums and somebody will say, oh, you should have submitted these 500 other things. It's not the case that you have to do whatever feedback comes back from the help forum; rather, it's additional input to take in. You can review it and say, OK, I will take into account maybe this part of the feedback and maybe skip that part. Because the folks in the help forums are very experienced with tons of topics, but they don't have the absolute answers. I don't think anyone really has those. So I think it's great to get all of this feedback, but you still have to judge it and weigh it yourself, as with anything on the internet.

Does PageSpeed Insights use Googlebot? I wonder, because when I look at the rendered screenshots in PageSpeed Insights, based on our site's behavior, it looks like those weren't rendered by Googlebot.

You're probably right. PageSpeed Insights is something which is based on the Chrome setup. As far as I know, the server-based system that does the PageSpeed Insights screenshots and calculations and metrics, all of that, is purely based on Chrome. And Googlebot also uses Chrome to render pages, but there are some unique aspects of Googlebot that don't apply to PageSpeed Insights; for example, robots.txt. When Google renders a page, it has to comply with the robots.txt for all of the embedded content there. And if you have, say, a CSS file or a JavaScript file blocked by robots.txt, we wouldn't be able to process that from Googlebot's point of view, but PageSpeed Insights would still be able to retrieve it and show that. So that's probably where you're seeing those differences. I think the difference is more and more blurred, because Googlebot does use Chrome as well, so it is very similar. But you can certainly find situations where there are differences, and you can certainly construct situations where there are differences. Like I mentioned, robots.txt is a really simple way to see those differences.
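As a concrete example of the difference described here: if a site's robots.txt disallows its script or stylesheet paths, Googlebot has to render the page without those resources, while PageSpeed Insights will still fetch them. A minimal sketch, with hypothetical paths:

```
# robots.txt (sketch; paths are hypothetical)
User-agent: *
Disallow: /assets/js/    # Googlebot cannot fetch scripts under this path
Disallow: /assets/css/   # ...or stylesheets under this one

# To let Googlebot render the page the same way a browser would,
# these resources would need to be allowed instead:
# Allow: /assets/js/
# Allow: /assets/css/
```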
With regards to the way that we calculate speed for Search, though, we're, I guess, looking forward at the Core Web Vitals; at the moment, I don't know offhand how we do that. But with regards to Core Web Vitals, we use what users actually see. So it's not the case that Googlebot renders a page very quickly and then it gets a good score, or Chrome in PageSpeed Insights renders a page very quickly and therefore it gets a good score. Rather, we look at what users actually saw. And I think that's really important, because that's a measure of the real-world performance. All of these tools that render a page in more of a lab environment, like Googlebot when it renders a page, or PageSpeed Insights, produce something that is almost more of a prediction than an actual measurement, because there are lots of assumptions in play. Whenever you run something like a page render within a data center, you have a very different setup than the average user has, with regards to network connectivity, with regards to caches, all of that; it's just very different. So when you run these tools and look at the measurements they show, you need to keep in mind that this is more of a prediction than the actual values that users will see.

And, I'm sorry...

No, I was going to ask a follow-up, but I'm sorry, you continue.

And it's something that these tools also try to build in, in the sense that they will say, well, I run in a data center, but I will act like I'm a 3G phone with a slow connection. They'll try to emulate that, but it's still very different from an actual user. Go ahead, I'm sorry about that.

So in terms of assessing it: I was reading a site that seemed like it had a lot of ads, so I decided, OK, let me see how this is scored in PageSpeed Insights. And it rendered a score, that circle, of 21, which was in the red; not very good. But below that numerical and visual representation was a sentence, in green, that read that based on field data, the page passed the assessment. And below that there were these measurement bars for Cumulative Layout Shift, First Input Delay, et cetera, and those were all mostly in the green. So where's the disconnect? And what should one be paying attention to: that first visual circle, or the fact that it says it passed the Core Web Vitals assessment?

I'd need to remind myself how PageSpeed Insights looks. I think it has that one overview score on top, right?

Yeah. Actually, is there a way to present it? Because I did a screenshot and redacted the name of the website, if that makes it easier.

So I think what happens in PageSpeed Insights is we take the various metrics there and try to calculate one single number out of them. Sometimes that's useful to work with, or to give you a rough overview of what the overall score would be, but it all depends on how strongly you weigh the individual factors. So it can certainly be the case that, overall, when users see a page, it's pretty fast and sleek, but when our systems test it, they say, oh, there are some theoretical problems here that could be causing issues, and they'll calculate that into the score. So I think the overall score is a really good way to get a rough estimate, and the actual field data is a really good way to see what people actually see.
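As an aside, the field data referred to here comes from real-user measurements of these same metrics. One way to collect the equivalent numbers yourself is Google's open-source web-vitals JavaScript library; the sketch below uses the v2-era function names (newer versions renamed them to onCLS and so on), and the /analytics endpoint is made up:

```typescript
import { getCLS, getFID, getLCP } from 'web-vitals';

// Send each metric to your own collection endpoint once it is known,
// mirroring how real-user monitoring feeds field data.
function sendToAnalytics(metric: { name: string; value: number }) {
  // '/analytics' is a hypothetical endpoint on your own server
  navigator.sendBeacon('/analytics', JSON.stringify({ name: metric.name, value: metric.value }));
}

getCLS(sendToAnalytics); // Cumulative Layout Shift
getFID(sendToAnalytics); // First Input Delay
getLCP(sendToAnalytics); // Largest Contentful Paint
```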
And usually what I recommend is using those as a basis to determine: should I be focusing on improving the speed of a page or not? And then use the lab testing tools out there for determining the individual values, and for tweaking them with the work that you're doing. So use the overall score and the field data to determine whether you should be doing something, and then use the lab data with the individual tools to improve things and check that you're going in the right direction. Because the other issue is that the field data is delayed, I think, by about 30 days. So for any changes that you make, if you're waiting for the field data to update, it's always 30 days behind. And if you're unsure whether you're going in the right direction, or whether you've improved things enough, then waiting 30 days is kind of annoying.

Thank you.

Hey John, can I add a follow-up on that as well?

Sure.

With regards to Core Web Vitals, the field data is going to be the one to pay attention to, correct, in terms of ranking signals? Or is it going to be both?

Yes. Yes, it's the field data.

OK. While we're on the Core Web Vitals topic, I have a small question in this regard: when this becomes a ranking signal, CLS and all the other friends, is it going to be page level or domain level?

Good question. So essentially, what happens with the field data is we don't have data points for every page. So for the most part, we need to have groupings of individual pages. And depending on the amount of data that we have, that can be a grouping of the whole website, kind of the domain. Or, I think in the Chrome User Experience Report, they use the origin, which would be the subdomain and the protocol; that would be the overarching grouping. And if we have more data for individual parts of a website, then we'll try to use that. I believe that's something you also see in Search Console, where we'll show one URL and say there are so many other pages associated with it. That's the kind of grouping we would use there.

The reason I ask is that we have a set of pages that are slow. They exist for a different purpose than the other pages on the site. We have a noindex on them, but they are very slow, and that's why we don't want them to be counted.

Yeah, I don't know for sure how we would handle things with a noindex there, and it's not something you can easily determine ahead of time: will we see this as one website, or will we see it as different groupings? Sometimes with the Chrome User Experience Report data, you can see whether Google has data points for those noindex pages, and whether Google has data points for the other pages there. Then you can figure out, OK, it can recognize that these are separate kinds of pages and can treat them individually. And if that's the case, then I don't see a problem. If it's a smaller website where we just don't have a lot of signals, then those noindex pages could be playing a role there as well. My understanding, and I'm not 100% sure, is that in the Chrome User Experience Report data we include all kinds of pages that users access. So there's no specific "will this page be indexed or not" check that happens there, because indexability is sometimes quite complex, with regards to canonicals and all of that, so it's not trivial to determine on the Chrome side whether a page will be indexed or not.
It might be the case that if a page has a clear noindex, then even in Chrome we would be able to recognize that, but I'm not 100% sure if we actually do that.

All right, thank you. I'll follow up on that.

Yeah, I would also check the Chrome User Experience Report data. I think you can download the data into BigQuery, and you can play with that a little bit and figure out how this happens for other sites, similar sites that fall into the same category as the site you're working on. Cool. More questions from any of you?

Yes, John, hi. Starting in mid-January, I suddenly saw in Search Console that a lot of old URLs are popping up, especially in the 404 subcategory under Excluded, and in the URL Inspection tool. These old URLs are, for example, old HTTP versions of URLs, and even old domains, because the websites were moved to a new domain like three years ago. So my question is: why is that? Should I be worried? And if yes, how can I fix it?

So these are showing up as 404 errors?

They're showing up as 404 errors, and for some URLs, if I use the URL Inspection tool, they also show up as referrers in the URL Inspection tool.

OK. I think if they're just showing up as 404s, I would completely ignore that. What happens in our systems is that pages which are 404 are essentially still tracked on our side, and from time to time we will double-check to see whether they still return 404. It can happen that a site has changed significantly, hasn't had these pages for years now, and still, from time to time, our systems say, well, we'll double-check those old URLs and see if they still return 404. That's not a sign that anything is wrong with those pages; it's just our systems trying to make sure that we're not missing anything from your website.

And if they show up as referring URLs in the URL Inspection tool?

How do you mean, as referring URLs? Like, they link to another page, or...?

Yes. For example, I used the URL Inspection tool on a URL that's still present, and in the URL Inspection tool you can see where Google knows this page from. It says, for example, that it knows it from the sitemap, and then there are like four URLs listed below that. That list contains, for example, an old HTTP version; it contains the same file name, but from the old URL of the website. These are all URLs that don't exist anymore. So is this also something that should make me worry, or not?

That's completely normal. I'm not 100% sure which data we show there in Search Console, but we have a concept of the first-seen location of a link to a specific page. And we might have seen that URL from that page at some point, way in the past. Even if that page doesn't exist anymore, it's still, like, this is where we first saw it.

OK, so basically: make sure that if the original page doesn't exist anymore, it returns a proper 404; if it's redirected, make sure it's a proper redirect; and in all other cases, just ignore it.

Exactly, yeah. Usually, if you have an older website, then over the years you will collect more and more of these 404 pages. And even though our systems only rarely check each 404 page, the number of URLs that could be returning 404 keeps growing. So if you look at your server statistics, and at what Googlebot is requesting, it can look like, oh, Google is spending so much time on 404s.
But for us, it's just checking each one maybe once a year or so. It's just that, because we have so many that we check once a year, overall it looks like a lot. But that's just the web.

Yeah. Yeah, sorry, this website is like 10 years old. And a last question on that, because the websites I'm talking about were also moved to new domains, and we used the Change of Address tool. So basically: just make sure that the old domain still redirects to the new website, and that would then be the perfect setup? And we shouldn't worry about anything further?

Yeah, that sounds great. One place where people also get confused with that, which is kind of similar, I guess, with old URLs, is that when we recognize that pages have moved, we still keep some association with the old location. So we will know that this page on the new website used to be located at a page on the old website, in some sense. So if, in the search results, you do a site: query for the old domain, then even after a couple of years you'll still see a lot of URLs shown there. It's not the case that we have them indexed there; rather, we know they used to be there, and it looks like a user is explicitly looking for the old location, so we'll show them. And if you look at the cached version of a page in a case like that, you'll see that it actually shows the new domain. So it's a little bit confusing if you look at it like that, but essentially it should be working properly.

OK, thank you.

Sure.

So I have a related question. For example, if we set up proper 301 redirects, I'm trying to understand the relation with backlinks: if Google has the history of the old links, is it possible that it passes some sort of PageRank to the new URLs that we set 301 redirects for? For example, we have a site with backlinks, and we decided to change the URLs and set up proper 301s. The backlinks are still there; we can normally get a few of them changed, but most are still there. So if Google has some sort of history there, would it be possible that it passes some sort of link juice or PageRank to the new URLs?

Yes, yes. So essentially what happens there is: we will have the old URL on your website, which has some signals from the links that go to it, and we have the new URL on your website. With a redirect, you're basically telling us these are equivalent, and you'd prefer the new URL to be shown. So what we will do is put both of those URLs from your website into a group, and say this is a group of URLs that has collected signals. And then, with the redirect, we will usually pick the destination URL and say this is the canonical for that group. The canonical page will then inherit all of the signals that go to that group. So if there are links to the old version of a page, or links to a copy of that page, all of that will be combined together on the canonical version. So that's something that gets passed on there. Specifically when you're talking about site moves, we still recommend making sure that you update the old links anyway, as much as possible. Because we will put those URLs into the same group, like I mentioned, but we use various factors to determine which of these URLs is the right one to show, which one is the canonical. The redirect is one factor, but links are another factor.
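As an aside on the mechanics of the site moves being discussed: the redirect part is usually a small piece of server configuration on the old domain. A minimal sketch, assuming nginx and hypothetical domain names:

```nginx
# Send every request on the old domain to the same path on the
# new domain with a permanent (301) redirect
server {
    listen 80;
    server_name old-domain.example www.old-domain.example;
    return 301 https://new-domain.example$request_uri;
}
```

The rest of the advice here (sitemaps, internal links, and ideally external links) is about making the other canonicalization signals point the same way.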
So if all of the links, internal and external, go to the old version of your URL, and you redirect to a new version, we might pick the old version of the URL to show in Search. So that's something to keep in mind: if you want to move everything to a new URL, make sure that everything is aligned with the new URL. The redirect, the sitemap files, the internal linking, and, as much as possible, also the external linking, so that everything fits together with that new URL that you want.

Thanks for that. I have another question. Most of the SEO checkup tools pop up with a warning about a low text-to-HTML ratio, which means there is more code than actual text. Is that something we need to worry about, or is Google OK at picking out the right text?

We don't have a notion of a text-to-HTML ratio for Search. So that's something where I think a lot of these tools are able to calculate this, and they think, oh, it's worth showing, but it's not an SEO ranking factor kind of thing. There are two places where it could play a role. One is with regards to speed: if you have a lot of HTML and very little text, then obviously we have to load a lot of content to display the page, so that's one small factor. The other is with regards to extreme situations, where you have a lot and a lot of HTML and very little text. We have limits on the maximum page size that we would download for an HTML page, and I think that's in the order of, I don't know, hundreds of megabytes, something like that. So if you have an HTML page that has hundreds of megabytes of HTML and very little text in it, then yes, that could play a role. But I suspect that's extremely rare, and if you have that problem, then that's a bigger problem than an imperfect HTML-to-text ratio.

Perfect, yeah, I got that. Thank you.

Sure. Let me just pause the recording here. You're welcome to stick around a little bit longer if you like, but it's always good to keep the recording limited, to avoid it becoming super long. Thank you all for joining, and thanks for all of the questions that were submitted and that were asked along the way. I'll set up the next office hours probably later today; the next one will also be next Friday, but in the evening, European time, more for the American folks, so Michael doesn't have to get up in the middle of the night. I don't know how he does it, but thank you. Cool. All right, let me just pause here, and we can continue after that.