All right, welcome, everyone, to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a webmaster trends analyst here at Google in Switzerland, and part of what we do is these Office Hours Hangouts, where webmasters and publishers can join in and get their search-related questions answered. I see a bunch of new faces here, which is great. If any of you want to jump in with the first questions, feel free to go ahead now.

Hey, John. Hi. I have a quick question. I saw a few websites implementing the wrong schema, in that they're implementing schema for the listing pages instead of an individual product, which I think is against the Google guidance, right?

Yeah. Sorry, can you repeat the question? I think I might have misunderstood.

Sure. So schema is implemented for an individual product, but we do not implement schema for a listing page, which has a mix of products. We do not show the rating for a list of products, we do not show an aggregated product rating — that's per the Google guidelines. But a few websites are implementing this aggregated schema for a listing page, because of which they are getting star ratings in the search results, and because of which I saw a higher CTR and a higher position.

So basically, I think your question is, you see other sites implementing structured data in a way that isn't compliant with our guidelines, and you'd like us to take action on that — throw them out of the index. So I guess there are a few aspects there. One is that this structured data doesn't change ranking. So it doesn't change the position where a site shows up; it just changes the way that we show the search result. So that's still a visible effect, I guess, but it doesn't change ranking. So you don't have to worry about that part.

With regards to the general problem of, other sites are doing something bad, and why can't I do it bad as well — that's something where, from my point of view, if you're seeing other sites do something bad, then that's something you can avoid. You don't have to go down that path. If you think that they're getting an unfair advantage from that, you can use the web spam report tool. There's specifically a form for rich snippet spam, which I know the web spam team takes a look at as well. And that's something you could use for cases like this, where you see other sites doing something that's not compliant with our guidelines.

I complained there multiple times, but no action was taken by the team. And the second thing was, I agree that there's no direct correlation between the schema and the position, but the schema is helping in improving the CTRs. You know what happened — these websites are having star ratings, which helped their CTRs. I watched very closely, since they are my competitors. So indirectly, they are able to get a better position because of the schema implementation and the higher CTR also.

Well, they wouldn't be getting a different position, so it wouldn't be changing the rank. That's at least one of the things there. I don't know about the CTR. I think sometimes it does help to highlight what your pages are really about for users, to make it easier for them to get there. But in general, my advice for this type of situation is, if you're seeing other people do something wrong, I would send that to the web spam team, let them take a look at it, but focus your energy primarily on your site.

So instead of worrying all the time, oh, these other guys are getting away with something, focus your energy on your site, because that's where you can make a positive change. That's something that you have under control; that's something that you can improve. So I realize it's frustrating when other sites are ranking with techniques that aren't really that great, but after you've reported that, that's kind of what you can do there. Then focus your energy on your site — that's really where you have a way of making a positive impact in the end.

Sure, sure. Thanks for that.

Sure.
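To illustrate the guideline being discussed — a minimal sketch of per-product review markup, with hypothetical names and values. Per the guidance above, an AggregateRating like this belongs on a page about an individual product, not on a listing page that mixes many products:

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "Product",
      "name": "Example Widget",
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.4",
        "reviewCount": "89"
      }
    }
    </script>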
All right. Any other questions before we get started with the ones that were submitted?

Can I ask a question?

Sure, go for it.

Yeah, so if Googlebot indexes two pages that are exactly the same at the exact same time — I guess this question has been asked before, but it's not a problem if I ask it once more. If Googlebot detects two pages that are exactly the same — same content, same site structure, everything — and it indexes the content at the same time, then which one will rank higher?

Which one will rank higher? I don't think there's an absolute answer for that. So usually what happens when we recognize that two pages are the same is we try to pick one and we index that version. Or, if we index both versions because we think they're slightly different, then what happens is, when we put together the search results, we will say, oh, these are the same pages, and we will just show one of them in the search results. And it's not so much a matter of which one will rank higher or which one will rank lower; it's more a matter of which one Google shows, because it's the same thing — it wouldn't make sense to show both of these to users in the search results. And for that, we have a number of different factors that we try to take into account, which include things like rel canonicals, redirects, internal linking, external linking, to try to figure out which one of these pages is the right one. Because the most common situation that I see here is that, within your own website, you have multiple pages that have the same content. And that's more of a technical issue from our side. It's not a ranking issue or a spam issue; it's more a matter of which URL we show. Both of these go to the same site, they show the same information — which one of these should we be showing? And for that, we try to guess as much as we can and take into account all of the information that the webmaster is giving us, by looking at those various signals and trying to see which one is the right one.

Okay, cool.

Hello, John. Sorry to jump in on this question. I reached out to you through Twitter a couple of days ago. I have a very similar issue with one of my clients. They have a .com.au and a .co.uk website. They pretty much have the same content, and we're working towards enhancing it and making sure that the content is different. But the .co.uk website is not showing in the .co.uk index for Google, and the .com.au website is the one that is showing in there. I was wondering if you might be able to give me a bit of an update on that. Sorry to hijack that question.

No, no, that's perfect. So this is something that sometimes comes up. We sometimes see that, especially when the content is essentially the same. I don't remember exactly which case that was, because this is something that comes up every now and then. It's sometimes a bit tricky, because sometimes the content is subtly different — like you have a different currency or a different phone number on the pages.
But in general, when we recognize that the content is essentially the same, what will happen is we'll try to fold it together, and use things like the hreflang markup between those pages to figure out which one to actually show to users. So that's one thing where it's useful for us to have this hreflang markup between those pages. I don't know if that's something you have on your pages already?

Yeah, we already have the hreflang on those pages. We have the code for the US, the code for the UK — for England, obviously — and the code for Australia. Still, looking through Google Search Console, we were sort of getting indexed, and then our rankings dropped almost to the second or third page of Google for the .co.uk, and then it would come back again. I'm still a bit puzzled about why this is happening.

Yeah, I think for the most part this is probably something on our side, where we could be doing a better job of picking the right versions and making sure that we index all of those variations. So this is, I think — what was it, the swimwear site, right?

Yeah, it was bondaisans.co.uk. It was a fairly new launch in the UK. It's a new client, and they do have sort of the same content on there, which we're working on. But again, the hreflang is there, the top-level domain is completely different, and even some of the stock, some of the content, is a little bit different. So I'm just concerned about why sometimes we get indexed by Google, and sometimes the other version shows up on the first page.

Yeah, so I passed this on to the team when I saw that, and they're currently discussing what's happening there. So...

Thanks, John, appreciate it.

So from my side, this kind of report has been really useful. And obviously there are some things you could do on your side, but I think this is something we should be doing better on our side as well. So I hope in the next couple of weeks or so we'll be able to figure that out a little bit better and get that resolved there.

Thank you, thank you, appreciate it.

Sure. All right, let me run through some of the submitted questions, and as always, feel free to ask more questions related to the answers along the way, and we can see where we go. All right.

John, can I ask a question on that last question?

Sure.

Just very quickly. I just had a quick look at the bonfiresans.eu website, and I can't go any further — I literally can't click anywhere, because it's forcing me to click a box saying you need to go to your local site in the UK.

Yeah, that is correct. It's not something I wanted to do. Apparently they want to force people to go to the local version, according to their location.

And previously we've spoken about pop-ups, and whether you're indexing the site or whether you're indexing the pop-up, if you're forced into a pop-up immediately. So I can't go anywhere, but I presume when you index, you can. Is that right?

Yeah, we made sure that the— No, no, I'm talking to John. So I can't go anywhere, but presumably when Google tries to index from here, your spiders can go somewhere?

I didn't actually check. That's a good question. Because we've had issues. From a usability point of view, that's definitely an issue. And especially if it looks like an interstitial, then from our side we'll probably treat it as an interstitial.
And the more problematic situation, which I don't think is the case here, is if there's an automatic redirect to always the local version. Because if there's an automatic redirect, then we would always index the version that gets shown in the US, and we'd kind of skip all of the other localized versions. But I don't think that's really what's causing this particular problem here. I think it's something that's definitely not that great, but we should be able to— you should be able to get through that, I assume. Yeah. All right, okay.

All right. Yeah, that's a good point. I'd definitely recommend cleaning that up and making sure that it actually works everywhere. I suspect maybe Googlebot doesn't see that interstitial — maybe it's robotted out or something like that. But it's still worth making sure that all users can access all versions, so that you don't run into any problems like that.

I know this might not be related, but when we've had issues with our sites — are you seeing everything from California? Because we deliver via IP and force one version on certain people, it means you'd only see the California version and stuff. So I thought maybe you're only seeing this stuff, and that's causing an issue.

Just to clarify, guys, Googlebot and even Screaming Frog can crawl the .co.uk content. We have spoken with the developers, and they made sure that the bot can actually crawl the content on the site. But again, we get indexed and then we get deindexed, and it's just a bit strange.

Yeah. So from a technical point of view, when I'm talking with the engineers and I mention something like that, they're like, oh, the site is cloaking. So that's one of the things to watch out for. I don't think we would see this as the same thing as web-spam-type cloaking, but the engineers are kind of critical about situations where users see one version and Googlebot sees a completely different version. So that's one thing I'd watch out for.

Yeah, definitely — the version for users and for bots is exactly the same.

Well, thank you, John.

Sure. All right. So the next one is easier for me to answer: my site got added to Google News, and now it's not showing up in Google News. So, I can't help with anything around Google News, because I don't have any information around that. So that's one thing there. I think it goes on and says it never appears in the top stories block. That can also be the case; that's generally unrelated to Google News. Any website can appear in the top stories block — it's not limited to Google News sites.

We have an issue with a client on Google UK: when searching for their brand name, we get a directory tracking link instead of their brand name. We'd like to know a bit more about this. I don't know — I think you were here, right?

Yeah. Hi, John. I'm here.

Hi.

Yeah, so I'll give you a bit of background. Basically, about six weeks ago, when you search for cosy homes in Google UK, it corrects it to the other spelling, and then it says, did you mean cosy homes? When we click on that, we then get Property Puppy appearing as the brand URL, but the four sitelinks underneath are correct, for cosy homes. When we look at the cache for Property Puppy, it shows the cosy homes domain and content. The steps we've taken so far are to ask Property Puppy to remove the URL. They didn't put in a 404 — it still returns a 200.
And then we requested it as outdated content to be removed, which was accepted the first time. And then we requested a recrawl of the homepage URL through Search Console, which put Property Puppy back in the index. So we requested Property Puppy be removed through the outdated content tool again, and it got denied this week. So we're kind of stuck now.

OK. So when I search here on google.co.uk, I don't see that URL anymore. Is that still a problem, or...?

I did check just before. Let me double-check again. I mean, I didn't change anything manually, so... Yeah, I still see Property Puppy from here.

OK. And they removed that URL?

Yeah, so what happened is that it replaced the home page, which was ranking fine about eight weeks ago. And then this Property Puppy URL kind of replaced it. So yeah, we're kind of stuck as to what it could be — whether it's a bug on Google's side, or something that Property Puppy have done to force Google to see them as the brand website.

OK. So looking in our systems, it seems that your home page is the one that we've been indexing all the while. So I suspect this is more of a temporary issue that's just coming and going there.

Because the client sees the correct domain name, but when they're not searching from their location — so anywhere else in the UK — it shows Property Puppy.

OK. I'm in the UK, and I see cosy homes.

OK. I'm in Brighton.

Right, so I see cosy homes. The cache is cosy homes. Is it .uk?

Yeah, .co.uk, searching in Incognito.

Right, and the domain itself is cosyhomes.uk, not .co.uk or .com?

No, cosyhomes.co.uk, so it's not—

So we're asking, did you mean COSY?

Yeah, yeah.

So I guess the did-you-mean search result is something also worth mentioning there, in that generally this is something that just catches up over time. As we see users searching for this term, we will understand that, OK, they didn't mean this other variation of the name, they meant this specific version. So that's not something that you could force from the website's point of view, or in general force Google to change. That's something that catches up algorithmically as we see what users are actually searching for.

Yeah, because that appeared around the same time that the Property Puppy URL switched with the home page. So should we just wait for a few more weeks and see if it switches itself out, and then maybe in, say, four weeks' time come back and have another chat with you if it's still not fixed?

Yeah, definitely. I mean, definitely keep coming back with this if it's something that kind of hangs around. I think you also had a thread in the help forum about this?

Yeah, yeah, I posted on there to get a few more people to have a look at it.

So I'll double-check what's happening with those specific URLs. My guess is that this is essentially solved, and it's just a matter of time before it bubbles up and is resolved everywhere.

All right. OK. All the tricky ones today. That's good.

I have a website on a .com, and my target market is the Middle East, the United Arab Emirates. I have a lot of Arabic keywords ranking near the top, but I'm still getting near to zero visits from Qatar. Our content is in Arabic, our keywords are in Arabic. When I check in Keyword Planner, I find low search volume. How do I get traffic? So in general, there's nothing specific to making a website for an Arabic market. You can use a generic top-level domain like a .com. That's perfectly fine.
It's really just a matter of normal search crawling, indexing, and ranking. There is nothing unique to targeting Arabic users or Arabic keywords. So that's something where I'd look at the general guidelines that we have around search, make sure that you're aligned with those, and keep working on it. It's not that there's a magic trick that you need to do to turn on Arabic users in the search results.

Let's see. I know a few sites that are implementing the wrong schema to increase click-through rates. I think we talked about this before.

URL-wise — folders or domains and keywords — which is better for Google to crawl? All of these variations essentially work. So you can use domains if you need to split things up completely. You can use subdomains. You can use directories. You can even use URL parameters. That's essentially up to you. It's more a matter of the technical setup that you have on your side than something from a secret SEO point of view that you need to focus on. The one thing I would generally recommend is to avoid going down the route of having a ton of different domain names for different variations of topics. Because on the one hand, that makes it a lot harder for us to understand which site is really relevant for those keywords — we have a little bit here, a little bit here, a little bit here. And on the other hand, it makes it a lot harder to maintain. So ideally, very few domains, and just figure out how you want to structure things on your side — which could be the way that your CMS does it by default, or a way that makes it easy for you to track where people are going, what people are buying, what people are doing on your website. Any variation, essentially.

If you have errors on a site, how long until Googlebot understands this and adjusts crawling? So I'm not really sure what kind of errors are meant there. When it comes to things like crawl errors, we do try to retry those from time to time. And even if we've seen something as a crawl error for many years, we might go back and say, oh, we have an extra chance to crawl a couple thousand URLs from this website, so we'll just double-check all of these old URLs as well, to make sure that we're not missing anything new. So that's something where you don't need to hide crawl errors from Google or suppress them. Essentially, if we have time, we'll take a look at those old URLs and see what's there. But just because we look at those old URLs doesn't mean you have any kind of disadvantage from that. So from that point of view, if you see old URLs being crawled in Search Console, that's perfectly fine — not something that you need to resolve.

Self-referencing hreflang tags — is it essential that these pages have self-referencing tags? No, it's not essential. From my side, this is something that we've recommended people do primarily to make it easy for you. The idea being that you can take this block of hreflang links and copy and paste it onto all variations of those interlinked pages, and it'll just work. So you don't have to think about, oh, which URL is Googlebot currently pulling up, and how can I hide that one in the hreflang block? You can just have the self-referencing URLs in there. You don't need to have them there if you don't want them; if your CMS handles them automatically, you don't need them. But it sometimes makes it easier just to maintain things, or to double-check that things are set up correctly.
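To make that copy-paste point concrete — a minimal sketch of such an hreflang block, with hypothetical URLs. The same three tags, including the self-referencing one, could be pasted unchanged into the head of each page variation:

    <link rel="alternate" hreflang="en-gb" href="https://example.co.uk/page" />
    <link rel="alternate" hreflang="en-au" href="https://example.com.au/page" />
    <link rel="alternate" hreflang="x-default" href="https://example.com/page" />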
Why am I seeing pages getting indexed, and then Google removing those indexed pages a few minutes later? That's a good question. That's really hard to say without looking at specific URLs. So I guess there are a few common scenarios where I've seen something like this happening. The most common one that I see is that the page has a rel canonical to somewhere else, which would generally result in us indexing that page first, then processing the rel canonical and saying, oh, the rel canonical points here, and then we focus on the other page.

Another common scenario is that there's a soft 404 involved. So we can crawl that page, we get a 200 result code, we get content, and we'll index that content. And then in a second step, we'll look at this content and say, hey, wait a minute, it says page not found. And then we'll actually drop that URL from our index again, as soon as we're able to reprocess the content that's on these pages. And soft 404s are sometimes a bit tricky to recognize. We do show those in Search Console, though. So I'd really double-check in Search Console that you have everything set up properly and that things are aligned with what you're trying to do.

Another scenario that sometimes happens is that you're just looking at different variations of our index, essentially, in that we will maybe index one page very quickly, but it doesn't get passed to all data centers automatically or right away. So you look at it once and you see the data from one data center, and you look at the search results again and you see data from a different data center, which might not be a 100% copy of the previous one. So sometimes you see this difference between Google data centers, and you can't really see which data center you're currently accessing, so you can't tell that this difference is just a timing difference between these two places.

Do sitemap file errors have any effect on rankings? No. So, if the information in the sitemap file leads to URLs that we already know about, we will continue ranking them. The sitemap file helps us to optimize or improve the crawling and indexing of content on your website. So if things have not changed on your website, then we don't need a sitemap file to understand that things have changed. On the other hand, if things have changed on your website and we can't process the sitemap file properly, then we have to rely on normal crawling and indexing to recognize those changes, which can mean it takes a little bit longer for us to understand them. In general, I would recommend making sure that your sitemap file is actually correct and that it actually works — because otherwise, why would you have a sitemap file if it can't be processed and doesn't have any effect?

Is Google capable of detecting rich snippet spam? For example, site one is using crawled third-party content to build a website with rich snippets. Site two loses its rich snippets, and the scraping site one profits from the third-party content. So I guess there are two aspects there. On the one hand, rich snippet spam — structured data spam — is something that we do try to recognize automatically and manually. So if you're seeing that sites are using structured data in the wrong way, you can use the rich snippet spam report form to let us know about that, and someone from the web spam team will take a look at it and see, is this something that we need to take manual action on or not?
The other aspect here is that one site is copying content from another site, and that's something that also falls under web spam from our point of view. And sometimes, if another site is copying your content, you may be able to take legal action on that, or just contact the site itself and say, hey, I don't appreciate you copying my content, can you take that down? So web spam does sometimes take action on that, but sometimes there are also other ways to get that resolved a bit more directly. So that's worth looking into as well.

How much can the wrong implementation of a canonical tag harm a website? It can cause quite a bit of harm, in the sense that if you have the rel canonical set up incorrectly, then maybe we will pick the wrong version of the page to actually index. A really common scenario that we saw, especially in the early days of the rel canonical, is that all pages on a website have the rel canonical pointing to the home page. Which essentially means for us that when we look at those detail pages, we see the rel canonical to the home page, and we say, oh, they don't want this page indexed, they want the home page indexed. And then what would happen is we would drop those detail pages, because we think, oh, everything should be on the home page, and we'll just focus on the home page — which means all of that extra information gets lost. So that's essentially the worst-case scenario: you have all of this content on your website, but with the rel canonical you're telling Google, oh, everything is actually here. So that's one thing I would watch out for. We do have algorithms that try to recognize that scenario and other common mistakes that webmasters make, but we can't catch everything. And if you see your website implementing a rel canonical incorrectly, I would even go so far as to say I'd prefer you remove the rel canonical from the website completely, until you're able to make sure that it's actually implemented correctly. Because it's really hard to diagnose everything that's happening when the rel canonical is not set up in the proper way.
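To illustrate the difference — a minimal sketch with hypothetical URLs. The first canonical is a reasonable one on a parameterized detail page; the second shows the harmful pattern described above, where every detail page points at the home page:

    <!-- On https://example.com/widgets/blue-widget?sessionid=123 — reasonable: -->
    <link rel="canonical" href="https://example.com/widgets/blue-widget" />

    <!-- The common mistake — this same tag repeated on every detail page: -->
    <link rel="canonical" href="https://example.com/" />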
I have a website with a million URLs. Should I implement AMP? Yes, of course. So if you just ask me generally like that, that's the direction I would head. Obviously, there are a lot of detailed aspects involved, where it's sometimes not so easy to implement AMP, or it may not be suited for all kinds of content. But especially if you have more static content, then AMP is definitely a great way to go. Because AMP is essentially a cached version — it's a free CDN that you can use for your content. It's a great way to make your content available really, really quickly to users who are searching for it. So that's one thing to keep in mind. On the other hand, if your site is very dynamic, where you can't really cache a page for all users the way you would on a CDN with static content, then it's a lot harder to implement AMP in a way that makes sense for your website. So in a case like that, it depends on how many resources and how much developer time you have available to figure out how to solve this problem with AMP and your really fancy, dynamic website. But for example, if you have a news website or a blog, then that content is mostly static. And oftentimes, it's just a matter of installing a plug-in and saying, OK, turn on AMP, and then it's automatically done. So that's where I'd say, figure out where you have your gains and your losses, and how much work that actually involves.

Hi, John. Can I ask a question? Actually, my question is related to Google News. So can I ask?

You can ask, but I probably don't have that much information about Google News.

OK, the thing is, previously, our news stories were appearing in the Google search carousel. But we've noticed over the last four or five months, Google changed the UI, and Google is showing three results with the image and the publisher name. And after changing that UI, we're not appearing in that search carousel anymore. But we are appearing on news.google.com and the other platforms. So what is the reason we are not appearing there?

So that's the top stories section? Is that the section that you mean?

Yes.

Yeah. So that's an organic search element, essentially. That's not something that's tied to Google News. That's something where search is essentially taking its signals and its algorithms to figure out which sites are best worth showing there for specific queries. So that's not something I'd say is related to Google News, or related to specific markup on those pages. It's really a matter of search figuring out: this is a great site, this has great, timely information for this specific topic, so we will show it there. And if your site is not showing up there, then that's sometimes a matter of rethinking what you could be doing overall, to make sure that your site has really high-quality content that's really worthwhile for Google to show in places like that.

OK, one more question I have. For example, I'm from India, running a regional-language website in Hindi. Sometimes Google is showing a wrong publisher name instead of our publisher name — it's sometimes showing "Hindi News". Why is this happening?

And that's an alternate version of your website name, or is it completely different?

Yeah, we have the right name in the schema, in Google Publisher Center, and in the news sitemap. But sometimes, from query to query, Google shows a different publisher name at the top of the search results.

And that's for your content — is it a different version of your website's name, or is it a completely different publisher?

Yeah, it's a completely different publisher. I mean, it's showing "Hindi News" as the name of the publisher, instead of our organization name.

OK, I'd love to see some screenshots. If you have any examples, or any queries where I can reproduce that, that would be fantastic. I'm happy to take a look.

All right. I will come back with the screenshots.

OK. All right, thank you. Thanks.

All right. Habir goes on with: is AMP easy to implement for e-commerce sites? What should I do? So this is, I guess, related to the previous question around AMP and a large site. And again, I think for a general e-commerce setup, it's probably not that trivial to implement AMP. If you do have developers who are able to take some time to figure this out, I'm pretty sure there are things you can do in that regard. But if you're just using a common e-commerce CMS setup, then maybe it's worthwhile waiting to see what the CMS comes out with regarding AMP. However, it's also important to keep in mind that AMP is a per-page thing, in the sense that you don't have to implement AMP across your whole site all at once. If you have an e-commerce shop, and you also have a blog, and maybe a place where you publish news, then some of those other sections might be really well suited for AMP. You could say, well, my blog is running on WordPress, so I'll just activate the AMP plug-in there and make sure that the theme matches my site. Then I'll provide AMP for my blog content, while my e-commerce pages still don't have AMP, because my shop provider doesn't offer that yet. So you can wait with your shop and just switch your blog over to AMP. That's something where you can always think about which combination might make sense for you. And it's also worth keeping in mind that AMP doesn't change rankings. It just shows a different version of the URL in the search results — one that loads really quickly. So it's good for getting content to users, but it's not something where you need to focus on AMP to get higher rankings.
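For the common setup where AMP pages exist in addition to normal pages, the two versions are connected with link tags — a minimal sketch, with hypothetical URLs:

    <!-- In the head of the normal page: -->
    <link rel="amphtml" href="https://example.com/blog/post.amp.html" />

    <!-- In the head of the AMP page, pointing back: -->
    <link rel="canonical" href="https://example.com/blog/post.html" />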
My site has 500,000 pages, out of which 150,000 get updated every day. What should I do to get the maximum out of the crawl budget? Should I implement priority in the sitemap? Should I implement an RSS feed containing the last updated pages? So I guess, first of all, taking a step back: if you have 150,000 pages that get updated every day, that sounds kind of questionable to me, in the sense that if this is a content site and you're updating 150,000 pages every day, then that smells a lot like auto-generated content, and not really something our algorithms would want to spend a lot of time on. Sometimes there are good reasons to update a lot of pages every day — for example, if you have a weather website and the weather changes every day, that's something you can't stop, and we would try to pick up all of that content. But if this is a content website and you're recreating that much content every day, then I'd question what is actually happening there. That's an aside, I guess.

With regards to getting the most out of the crawl budget: in the sitemap file, we primarily focus on the last modification date. So that's what we're looking for there. That's where we see that we crawled this page two days ago and today it has changed; therefore, we should recrawl it today. We don't use priority, and we don't use change frequency in the sitemap file, at least at the moment, with regards to crawling. So I wouldn't focus too much on priority and change frequency, but really on the factual last modification date information. An RSS feed is also a good idea. With RSS, you can use PubSubHubbub, which is a way of getting your updates to Google even faster. So an RSS feed with PubSubHubbub is probably the fastest way to get content into Google when you're regularly changing things on your site and you want those changes picked up as quickly as possible — it's a really fantastic way to get that done. Another thing worth mentioning: we recently did a blog post on crawl budget, so I would recommend taking a look at that.
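To illustrate the last modification date point — a minimal sitemap entry, with a hypothetical URL. The lastmod value is what matters for crawling here; priority and changefreq entries could be present, but as noted above, they aren't used:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/weather/zurich</loc>
        <lastmod>2017-05-05</lastmod>
      </url>
    </urlset>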
Most sites don't have the problem that we're not picking up all of the new content, but rather that we're getting stuck with all of the old content — getting lost in a jungle of crawlable duplicates, with URL parameters and other parts of the URL involved, where we crawl a lot of duplicates and get very few unique pieces of content out of that. I've looked at sites for site clinics where we crawl 1,000 times the number of URLs that the site actually has, which is a really weird relationship. And that matters especially if the server is kind of limited and we crawl 1,000 URLs instead of the one URL that has actually changed. So instead of just focusing on how to get content into Google faster, you might also want to think about how to improve the crawling that's currently happening.

John, I know who asked the question, but if I want to get into more details with you on this, how can I contact you?

Ideally through the help forums. That's where we send people with regards to individual sites' crawling and indexing — it's not something that we tend to do a lot of one-to-one support on.

All right. We have an affiliate e-commerce site. We added related products to help with this, but we thought having product comparison information would be a benefit to the user and add value to the page. Do you agree? And if so, do we just display the info in a table, or how can we best do it so that Google understands? In general, you can do that in whichever way you want. It's not something where I'd say you need to use a specific HTML format for Google to understand it. If you're providing additional information on your pages, then that's something we can usually pick up right away. The one thing I would caution against, which I sometimes see, is that people take an affiliate feed from one company and say, oh, I will add value by crossing it with the affiliate feed from another company. And then basically, instead of having a website that's just one affiliate feed, it's two affiliate feeds — and you're not really adding value, you're just showing two affiliate feeds. So really make sure that you're doing more than just republishing other sites' content, in whatever way makes sense for your users. That's what I would aim for there. Essentially, our ideal situation is that we see users searching for a product, and we've seen your pages, and we've seen other pages showing similar products, and we think your page is really the one that we want all users to see — and if we don't show your page on top in the search results, then that's an error in our algorithm that we need to fix. So that's the ideal situation. It shouldn't be, oh, this site is just as good as all of these others, and it has a small tweak here, so that's kind of okay. It should really be: this site is clearly a step above everything else, and it provides a lot of additional information and additional value to users, so it should clearly be the one that we're showing.

We have lots of the same product, but in different sizes. Is this impacting us negatively in any way? How can we change this to one product showing all sizes? So this is a question I get a lot, especially around e-commerce sites, and the hard part here is that there's no one answer that works everywhere.
So essentially what you need to do is look at the pages that you're providing and think: is this really a unique page, or is this just a variation of an existing page that I already have? And if it's just a variation, then I'd recommend folding that together. If it's really a unique page that satisfies a unique need for the user, then that's something that would be worth indexing separately. So when it comes to different sizes, probably those are just variations of the same product, and your time is better spent concentrating those into one strong product page that lists the variations, rather than having them all separately. On the other hand, if you have vastly different sizes, or if the sizes that you have are really so unique that people explicitly look for them, then those might be worth keeping separate. A common example that I often use is shoes, where maybe you have one shoe model and you have it in different color schemes. If these are just normal color schemes, then probably put them together on one page. On the other hand, if you have a really unique color scheme that you also offer, then that might be worth putting onto a separate page. So those are the two approaches worth looking at there.

Hi, John.

Sure.

Yeah, I've shared a few screenshots in the chat. Can you see those?

Okay. I don't have permission to access those, so I'd need to copy them out anyway and look at them later on. But if you can make them so that anyone can access them, then I can take a look later and share them with the team. Okay.

Okay. Should I be using both rel canonical and rel next and previous on all of the component pages of paginated content, in case I don't have a view-all page? Yeah, I think that's a good general approach there. In practice, I don't know how much additional value you get from doing that kind of theoretically best approach. But if your CMS does that by default, I would definitely do it. So having the rel canonical set to the individual pages, and then using rel next and rel previous to link the different pages of the set together, is essentially our recommendation.
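A minimal sketch of that recommendation, for page 2 of a hypothetical paginated set — a self-referencing canonical plus rel prev and next tags linking the set together:

    <!-- In the head of https://example.com/articles?page=2 -->
    <link rel="canonical" href="https://example.com/articles?page=2" />
    <link rel="prev" href="https://example.com/articles?page=1" />
    <link rel="next" href="https://example.com/articles?page=3" />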
Let's see. When an article is updated with significant information, should the published date and time be changed? Also, some people recommend removing the timestamp from the article — is that recommended? So from an SEO point of view, this is totally up to you. You can update the time and date if you want, or you can leave it. From a usability point of view, especially if something is timely content — if it's tied to a specific place in time — then I would always recommend keeping a date on it. That just makes it a lot easier for users to understand: is this still relevant? Is this less relevant? Do I need to look for a newer version of this piece of content on the same website? All of that is really useful. In particular, when I see blogs that don't have any dates on their posts, it's really hard for me to judge whether something is relevant for my interests or not. And sometimes you just think, well, I don't know how relevant this is; therefore, I'll look for something where I can tell how relevant it is.

What are the best practices for—

Yes, go ahead.

Yeah, I've actually shared them publicly now, so you can check.

All right, great. I'll copy and paste that out for later. Thanks.

All right, thank you.

What are the best practices for keeping a page SEO-friendly on a JavaScript-heavy page — lazy-loaded photos and images from a gallery, for example? Sometimes this is tricky. For the most part, we can process the JavaScript on a page in a way that picks up all of the content on the page. So Googlebot is essentially a browser in the meantime, and can browse pages the way most browsers do. So for the most part, we can pick that up, and you can double-check with the Search Console Fetch and Render tool to see what Googlebot would be able to see there. One tricky aspect, especially with lazy-loaded images, is that if you load the image after the user scrolls to a specific part of the page, then that's something Googlebot might not be able to recognize, because Googlebot doesn't trigger all of these events on a page just to see what changes. Googlebot will load the page once, see which images are available, and try to focus on those. On the other hand, if this is a gallery, for example, and you have links to the individual images as well, then sometimes it doesn't matter if we can't see the thumbnail image, as long as we see the link to the actual detail page with the image itself. On that detail page, we can load the image normally, and that's fine for us too. So those are the things to keep in mind there: on the one hand, test with Fetch and Render; on the other hand, double-check that we can crawl to the image landing pages. And then think about whether Google really needs to see all of these thumbnail images, or whether it's fine if Google just sees a link to the detail page and the lazy-loaded thumbnail is essentially not shown to Google. Oftentimes, that's OK, too.
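A minimal sketch of the gallery pattern described above, with hypothetical URLs — the data-src attribute stands in for whatever a given lazy-loading script uses. Even if the thumbnail never renders for Googlebot, the plain link to the detail page is crawlable, and the full image can be picked up there:

    <!-- On the gallery page: the thumbnail is lazy-loaded and may be missed -->
    <a href="https://example.com/photos/sunset.html">
      <img data-src="https://example.com/thumbs/sunset.jpg" alt="Sunset over the lake">
    </a>

    <!-- On the detail page: the full image loads normally -->
    <img src="https://example.com/images/sunset-large.jpg" alt="Sunset over the lake">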
How does Google see a 302 redirect if the page you're redirecting to is not on the same topic, as it's only a temporary redirect until the original page becomes live again? For the most part, we will treat redirects as redirects. We're not going to spend too much time trying to work out what the old content of this page was, what the new content is, how related they are, and how we should treat them. For the most part, we try to see redirects just as redirects. So whether it's a 302 redirect or a 301 redirect, we essentially try to follow it and see what's happening there. The main difference between a 302 and a 301 is which URL you want to have indexed: a 302 tends to tell us that you want the original URL indexed, and a 301 tends to tell us that you want the destination URL indexed. Those are primarily the differences there. And if you're using a 302 redirect as a temporary redirect to bridge things over, that's perfectly fine.

All right. Wow, it looks like we're kind of out of time. I have this room for a little bit longer, so maybe I'll just switch over to more live questions from you all. If there's anything on your mind that we've missed that you'd still like to talk about, feel free to jump in now. No? Everyone's shaking their head. All questions answered. I see there are still some more questions that were submitted, so I can go through some of those. And if anything comes to your mind in the meantime, feel free to scream, and we can look at your questions.

All right. Can I use next and previous tags on AMP pages, and how do we manage pagination in AMP? I don't actually know for sure. In particular, if you have a pure AMP website, where your content is only available in AMP, I'm not 100% sure — but I believe you can still use link tags in the head section to link those different pages together. On the other hand, if your AMP pages exist in addition to your normal web pages, then you would put the rel next and rel previous links on your normal web pages, and each AMP page would just be connected to its individual web page. So you wouldn't need to put pagination on the AMP pages themselves.

Does Google see subdomains as the same IP as the main domain? Would it be better to host subdomains on separate IP addresses to get more crawl budget? I guess these are two separate but related questions. On the one hand, subdomains and main domains: for the most part, we don't have a strict difference where we'd say subdomains are always treated like this and subdirectories are treated like that. In many cases, when we see subdomains as part of the normal main domain website, we fold that together and say this is one website. Sometimes we see subdomains as being clearly separate — for example, if you look at something like Blogger, all those blogs are essentially subdomains, and they're completely separate websites, so there we treat them as separate websites. With regards to crawl budget, we do try to look at the websites that are on the same server, to make sure that we're not overloading the server. So from our point of view, it's less a matter of giving these sites a lot of crawl budget and a lot of crawl time, and more a matter of: what can we crawl without causing problems on your server? And that's somewhat tied to the IP address, but it doesn't have to be purely the IP address. For example, if you're using a content delivery network, then you might share an IP address with lots of other websites. On the other hand, if you have a fancy hosting setup, you might have a number of IP addresses that are mapped to the same physical server, where we also need to be careful and say, well, all of these different IP addresses are the same server, so we need to treat them as one thing to avoid causing problems on that server. So my recommendation there would be to focus on making sure that your site works well for crawling in general, and less on tweaking IP addresses and subdomains to improve crawl budget — because then you're probably spending your time on the wrong things.

Was there a question from someone? OK, all right, let me grab a handful more and see where we end up. I added search-within-a-site code to my site, but in the search results, the search bar is not showing. How long does it take? So the sitelinks search box markup is essentially only used when we already show a search box in the snippet. So if we haven't been showing that for your site, we won't enable it just because you have the markup. It's more a matter of: when we do show this search box, either we use a site: query, or we use what you provide in your code. So it's not that we would activate this just because you've added this markup.
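For reference, this is roughly what the sitelinks search box markup looks like, with a hypothetical site. As noted above, adding it doesn't make the box appear; it only controls where the search goes when Google already shows one:

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "WebSite",
      "url": "https://example.com/",
      "potentialAction": {
        "@type": "SearchAction",
        "target": "https://example.com/search?q={search_term_string}",
        "query-input": "required name=search_term_string"
      }
    }
    </script>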
My company acquired a really old website with lots of organic traffic and original, quality content. The website is really outdated, so it's hard to maintain. And I think it goes on: we'd like to combine this with our main website, which is responsive, and keep our Google rankings. What should we watch out for? So in general, I think this is a great approach, if you can combine your content and provide it in a way that is more maintainable and easier for users to use. That's a great thing to do. The tricky part, I guess, is that any time you're doing something that isn't a pure migration from one domain to another, it's unclear exactly what the final state will be, or what it should be. So in particular, if you're merging sites — taking two sites and creating one site out of them — or if you're splitting sites — taking one site and splitting it into two separate websites — then those are scenarios where it's really hard to guess what the final state will be. You can't just take the traffic and divide it by two because you created two separate sites. And similarly, you can't just take the traffic to both of these sites, add it together, and say that's what the new site will get if you merge them. So one thing to keep in mind there is that you probably will see changes in search with regards to traffic, and it's probably going to take a bit of time for things to settle down, because we really have to understand the final state again afterwards — what has changed, and how we should be treating this new set of different sites that are now hosted on one site.

And John, just on the migration question, if I can jump in very quickly, out of curiosity: how long does it take for Google to start showing the new version of the URL if you have done the 301 redirects? How long does it normally take? Because in the past, from experience, it normally took around one or two weeks, but I feel like it's a little bit faster now.

Yeah, so if you're doing a pure site migration from one domain to another, then I would guess that you could see those results within a couple of days. Clearly, not all URLs of the website will be indexed with the new version, but the main URLs should be picked up within a couple of days. That's something we've spent a significant amount of time on, to improve that process and make it faster.

All right, and let me just take this one here, from someone who's been trying to get a question in for a while. Since redirects are no longer being punished, would there be any negative effect from adding an extra folder to the URL? So instead of just /product-name, it would be /product/product-name, to provide better functionality for caching and more helpful 404 pages. That's perfectly fine. That's something where, even setting aside these vague SEO aspects of how much of a redirect gets passed on, I clearly see the additional value from the functionality as much more important than any kind of tweak with regards to SEO value. With regards to redirects, it's perfectly fine to set up redirects like that. The more general thing involved here is that you're essentially making a site structure change, if you do this across your whole website, in the sense that we will have to reprocess your website to understand it again. So you will probably see some fluctuations in traffic — maybe, I don't know, for a couple of weeks, I would guess — until we're able to understand: oh, these internal links map to these URLs, and those old URLs are redirected to these new ones.
So we can forward those signals to the new URLs, understand the new internal linking, and everything is fine. But it's definitely something that takes a bit of time to be reprocessed. It's not so much a matter of losing value; it's more that it just takes time to understand any internal navigation or site structure change. So my general recommendation there is to figure out when would be a good time to actually do this — a time when you're not that reliant on search traffic. That could be in the summer, when maybe people aren't searching for your site that much, or maybe when you're planning a big advertising campaign anyway and you'd get enough traffic to your website through the advertising during that time. So all of these things come together. But from my point of view, any time you see clear additional value in making a change on your website, I wouldn't worry too much about these vague SEO aspects. I'd really think of it as: the bigger gain is this change that I want to make. And as long as that change doesn't break everything, it's something I would clearly consider doing.

All right. Last question.

Sure. Yeah, so my site was cited on Wikipedia. And a lot of sites were scraping data from Wikipedia and putting it onto their own sites. And all of those links got indexed — the scraper sites had their content, scraped from Wikipedia, indexed by Google. And what happened is that the rankings for a few keywords went down drastically, to page 20 or something like that. So is it a good idea to disavow those kinds of scraper sites?

You can disavow them if you want, but scraping Wikipedia is such an old-school trick that our algorithms are really good at just ignoring it. That's something that used to be popular, I don't know, 10, 15, 20 years ago. So it's interesting that sites are still doing it, but I wouldn't worry about that.

Okay. Thanks.

All right. So with that, let's take a break here. It's been great having you all here, and it's been great with all the questions that were submitted. It looks like there are still a bunch left, so I'm sure I'll set up some new Hangouts as well. I believe in two weeks, when I'd normally do these, we're at Google I/O, so I'll be planning some during US time. And what I'm thinking about doing, maybe, is holding one of these in one of the Google offices, where maybe one, two, or three people from the Hangouts who are at Google I/O as well could jump in and join us live and in person. So if you're at Google I/O and you'd like to join one of these Hangouts in person, feel free to drop me a note on Twitter or on Google+, and we'll see what we can do with regards to timing and finding a room that would work for that.

All right. So with that, I wish you all a great weekend. Thank you all for joining, and maybe I'll see you again in one of the future Hangouts. Bye, everyone.