All right, welcome everyone to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a Webmaster Trends Analyst here at Google in Switzerland, and these are Office Hours Hangouts where webmasters and publishers can ask questions about their website, about web search, about how things work together, and we can try to find answers for you. As always, if there are new folks here that have a question and haven't been able to get in before, feel free to jump on in now.

Not really new, but I have a question, since I see nobody else asking. I have one idea, but I don't know if it's good or bad. What we are trying to do is promote our Facebook and Twitter pages by, for example, having subdomains on our site that redirect to them. Let's say we are pepsi.com, and we have facebook.pepsi.com, which redirects to facebook.com/pepsi. Now the question is, is this a good approach? Or are we losing some powerful links that would go to our subdomain when it redirects to Facebook, and would it be better to just promote our Facebook page separately?

With links, in general, we keep track of the starting point and the final destination. So if you're just redirecting through your site and you don't really need to go through your website, then probably you could just link directly to make it a little bit easier. On the other hand, if you're not sure whether that link will always go to that specific page, maybe sometimes you point to a different Facebook page, or you think maybe you'll move everything to Google Plus instead of Facebook at some point, I don't know, then you might keep that redirect there so that you can change it at some point and say, OK, this is the new destination of this URL that I have on my website.

It just makes more sense from a marketing point of view to have everything promoted as our site instead of five different sites. Yeah, if you want to do that, that's perfectly fine. No problem with that. So it's not any big loss of link power or things like that? No, it doesn't change anything. It just makes it a little bit easier for you if you do want to change that redirect at some point. Because what I've read in various places is that you lose something like 50% of the link power if you redirect like that. Whereas there's nothing like that? No, I wouldn't worry about that. OK, thank you.

John, before you jump into those questions, can I also ask two quick ones? Sure, go for it. One is related to the link that I posted in the chat, and it's about the crawl graph of a website, where the time spent downloading a page went down by almost half. Usually when that happens, I see the crawling go really crazy, upwards. But this time it actually went down. Can you explain that? Maybe I'm missing something, but I'm not able to explain this graph. How come speed improved, but crawling actually went down?

I don't know. I'd have to look at the details of the website, what's actually happening there. That could happen, for example, if we're crawling smaller pages, then the time spent can go down. And just because we're crawling faster, because your server is faster, doesn't necessarily mean that we'll crawl more pages from your website. So all of these things can kind of come together, where it's not necessarily that every time this graph goes down, the other one goes up. For me, this is the first time I've seen it, and yeah, I was a little confused. And the second one is somehow also related.
In order to save crawl budget, you usually block those URLs by robots.txt. But sometimes you also get them listed in the search results, just the title, with no description. Usually those won't count towards a site: query count, right? I don't know. I suspect they do count. But I wouldn't use the site: count for anything with regards to diagnostics.

And also on this one, I know you commented some time ago that you should stay away from using noindex in robots.txt. But wouldn't that be a good way to handle it: block it by robots.txt, and since you cannot put the noindex tag on the page because it's blocked by robots.txt, use the robots.txt to also noindex those pages? Do you see any downsides to this? So since we don't officially support it, it might be that we drop support for it. I don't know if we still support it, actually. So it's something where, since it's not an official part of the way that we process robots.txt files, I wouldn't rely on it actually doing something specific. It might be that it still does what it used to in the past. It might be that that changes silently. OK, so it's not an official approach, more or less. What I would do is, if crawling is really a problem, then use the robots.txt. If you see these roboted URLs in the normal search results, so not just in site: or specific URL queries, then that's kind of a sign that we don't know how to match these within your normal website. And usually that means internally maybe you're linking to those pages instead of other ones that are crawlable. And maybe you can change those internal links to really focus on the URLs that you do want to have crawled and indexed. Cool.

On the first question, is there any chance I can ping you the URL of that domain with the weird crawling graph? Sure. You can put it in the chat here. I can take a look afterwards. I don't know if I can get back to you on that specifically. Yeah, no rush. But I'm just going to ping it then, in case you have time. Thanks a lot. Yeah, sure.

I'd like to step in here, if I may, please. All right, go for it. Hi, John. So one short question, one slightly more involved question. You've said quite often that if I contact you about Panda, a little joke earlier, you might have a look at my site. My only contact for you is an email address at gmail.com, and I haven't had a reply. I don't know if you've received my email, or if I should be using some other way of contacting you. No, that's perfectly fine. I received that, and I took a look at the site. But there wasn't anything specific that I could point out there. OK.

Let me move on to the second, more involved question, which I've written up in the comments for the Hangout. So the site's 12 years old, and I've got 620,000 pages in my sitemaps. And over the last weekend, since about Thursday or Friday, it's dropped something like 200,000 pages. So 320,000 indexed out of 620,000, and 200,000 have suddenly dropped. Like, say I have one sitemap of 50,000 songs in Dutch; overnight it goes to zero. Traffic doesn't seem to be affected yet. What's going on?

Good question. So I took a look at that as well; I saw it in the comments for the Hangout. What usually happens there, or what can happen in a case like this, is basically that you submit different URLs than the ones that we choose for indexing. So essentially, we look at the indexed count based on the exact URLs that you specify in the sitemap file.
And if we index the same content under a slightly different URL, we won't count that as indexed for that sitemap file. So in a case where you're seeing this kind of change in the indexed count per sitemap file and no real change in the traffic, that probably means we're picking slightly different versions of the URL as the canonical. That could be as simple as something like a trailing slash or no trailing slash, .html or without .html, www or non-www, HTTP or HTTPS. All of those things where essentially the content is still indexed, which is why you're still seeing traffic, but it's not the exact URL that we choose for indexing.

So what I usually do in a case like that, to figure out whether it's like this or the content is actually missing from search, is to look at the sitemap files in Search Console where you're seeing this big difference between the submitted and indexed counts, especially if you can tell that there's a sitemap file where this has changed significantly, and pick a couple of URLs from there and just search for the URLs themselves, or do an info: search for that specific URL, and see which URL is actually shown in the search results. So what sometimes happens is you search for that specific URL and the info: result shows a slightly different URL. And that slightly different one is the one that we used for indexing. So that's kind of a sign that we're actually picking up the content, but we're not indexing it under that exact URL.

And from a practical point of view, that's fine. It's not something that you need to fix. It does make things like reporting a little bit trickier, because you can't really see that exact count in Search Console. So what I would do in a case like that is look at why Google might be choosing that different URL, and either think about changing the internal site structure to match that URL, or try to find mentions of the version that Google is actually picking for indexing and fix that. So maybe internally you're linking to both versions, and if you simplify your internal linking and just focus on the one that you do want to have, then we'll switch that over again. So it's not a critical issue. It shouldn't cause significant traffic changes. It's just a slightly different URL that we're picking for indexing.

That was a comprehensive answer, thanks. Yeah, I've had UTF-8 issues, especially when I upgraded to UTF-8, and I wonder if it's connected to that. So thank you very much for your answer. Sure.

I've got a lot of questions. Quick follow-up on that. Let's say I have somebody with a very large website, 600,000-plus URLs. Only about 10% of that shows as indexed in the sitemap file. And one of the problems we've noticed is that it's an e-commerce site, and users can get to certain products pretty quickly using the filters. But the filters themselves have canonicals to the non-filtered URL. So I guess Google could kind of get to those products by just going next page, next page. If it ignores the filters, just go next page, next page 300 times until it gets to all of the URLs. But we also noticed that the rel next/prev hasn't been implemented correctly, and from page 2, 3, 4 onwards they actually have a canonical to the first page, which kind of causes Google not to crawl those pages anymore, or crawl them a lot less. So is this something that could affect how Google indexes the products? And could this be why we're only seeing 10% of the sitemap indexed? Maybe. Maybe.
I mean, this is kind of tricky to guess at from afar. So what I would do in a case like that is try to look at the log files and see how far Googlebot is actually crawling. Is Googlebot actually going to those pages that have the links to the specific product pages? Or is Googlebot maybe not even able to find those links to the product pages and kind of missing them because of that? You'll also see there, for the individual product pages, which URL Googlebot is actually crawling, so which one it's trying to index. And based on that, sometimes you can recognize, oh, this is a different pattern than I have in my sitemap file, and kind of try to figure it out that way.

Right. Well, we used the info: query, or just pasted in the exact URL, and it doesn't seem to be indexed in any version for a lot of products. So I assume it's mostly because of those incorrectly set canonicals, and not giving Googlebot a very easy, clear path to get to those products. Because with another website, by looking at the log files, I noticed that rel canonicals do indeed lower the crawl rate if they're taken into account for those pages, which is pretty cool. But one thing you can also do is try to find a way to link related pages amongst each other. So instead of just having this top-down approach where you have the categories and subcategories, also kind of have these cross links across the bottom where related categories are linked, related products are linked, so that if we start crawling in one part, we can kind of branch out to the whole rest of the website. Right, right. So definitely look at the internal linking and try to build a good architecture for both users and Google, I guess. OK, thanks.

May I have one more? Let me run through some of the questions that were submitted first, and then we'll have more time for general questions as well.

All right, when Googlebot crawls a page, how does it render the advertising on the page? Does it consider the number of ad calls or the space? How is native advertising treated? We do try to render the pages as completely as possible, as far as we can go, of course, with things like robots.txt. So we do try to recognize where the advertisements are on a page, so that we can look at things like, is this page primarily ad-focused, or is there actually visible content on the page above the fold? So that's something we do look at there. And we look at all kinds of ads, so not specifically just this type of ad, but also ads that you implement directly on the server side. All of that we try to take into account.

We used to generate a lot of public profile pages for users, hundreds of thousands that are registered on our website, but most of them have never been populated with content. They could be seen as thin content. Can we take them down with a 404 or a 410? Does the presence of hundreds of thousands of 404 pages harm our rankings? Yes, you can take them down. If you think they're not so useful, you can take them down. You might even consider just using a noindex on these profile pages if you think that within your website maybe they're useful, but for indexing maybe they're not. So that's something you can choose. And you may also choose to implement something a bit smarter than just saying, oh, all profile pages are bad, by trying to figure out, is this a good profile page or is this an empty profile page? And then based on that, decide noindex or normal indexing, or 200, or 404, or 410, any of that.
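As an illustration of that per-page approach (a minimal sketch; the example URLs and the notion of what counts as an "empty" profile are hypothetical, not from the Hangout), a populated profile would simply stay indexable, while an empty one carries a robots noindex meta tag:

```html
<!-- /user/alice: the profile has real content, so leave it indexable (no robots meta tag needed). -->
<head>
  <title>Alice – photographer, 120 reviews</title>
</head>

<!-- /user/bob: the profile was never filled in; keep it usable within the site,
     but ask search engines not to index it. -->
<head>
  <title>Bob</title>
  <meta name="robots" content="noindex">
</head>
```

Returning a 404 or 410 for the empty profiles would work too, as John says; the noindex variant just keeps the page available within the site while leaving it out of the index.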
Is there a way to control which product thumbnail appears in Google Mobile? What are the general specs to show image thumbnails? So I'm not particularly sure which place on Google Mobile you're seeing thumbnails. That's something where it really depends on how this information is bubbled up in search. For example, if it's a news result, then that might be something where you would need to use the news markup on a page to make it possible for us to recognize that thumbnail. If it's from product search, then maybe you'd have to check with the product search folks to see what markup they're looking for there. So I don't really have an exact answer for you there. I'd double check where you're seeing these thumbnails, where you're seeing these images, and then maybe post in one of the webmaster forums to get input on what feature this is. And then, based on the feature, I'd think about which markup you might be able to use to help control that.

Google guidelines on misleading ads, maybe defined as below? So I wasn't completely sure what this was referring to, and it seems this is something that the ads side actually put out as a blog post sometime last year. And since that's from the ads side, I don't really have any input there that I can provide. So that's something where I'd double check with that blog post and the team that posted it to figure out what specifically you're trying to look at.

Why does Search Console's "links to your site" show nofollow links as well as followed links? Wouldn't it be better to only show followed links? So that was initially a product decision that we made on our side, where we said, well, all of these links could potentially be driving traffic to your website; therefore, we want to show them to you regardless of whether or not they're followed. So that's kind of a product decision on that end. With regards to whether or not we could filter them out, that's something we've discussed with the team before, but so far it's not something they're really keen on changing. So if you need to kind of pull that out, I'd recommend double checking to see if there are any third-party tools where you just dump a bunch of links in and check whether they're followed or not. What's the rationale behind not being keen on sharing that information? It's mostly with regards to encouraging people not to focus too much on followed links versus nofollowed links, and instead to just kind of look at the links that are coming to your website and use that information as a kind of guide to see which parts of your website people are actually linking to.

I have a question about nofollow links, though. OK, go for it. Is it OK if I have set all external links on a certain site to be nofollow? I think, for example, Facebook pretty much does the same. Since I don't have control over those links, I would prefer to just have everything nofollow so I don't risk linking to a bad one or something like that. Is that a bad decision in any way, or is it fine? You can do that. I mean, it's ultimately up to you. Especially if you have a lot of user-generated content, or only user-generated content, I could see that kind of being a way of saying, I can't vouch for these links. People have posted them. I think people generally post good content, but I don't want to vouch for these links. So that's an option that you could take in that regard.
I think, in general, for the web ecosystem as a whole, it probably makes sense to try to figure out, is this actually good content or not good content? And based on some measure of quality, say, I'm OK with vouching for these links, or I'm not OK with vouching for these links. So that could be by saying, well, I know this is a great user, they always post really important information, they post good links, so that's something maybe I'll link to normally. But it's not something like, OK, you have a lot of nofollowed links from this content, so this content must not be very trustworthy. So nothing like that on your side? No, nothing like that. OK, thank you.

How does Search Console, or Google Search, handle the situation of a business that is created with a similar name to another business in the same sector, in the same community, but did not know the first one existed? Are they treated the same, or does the second one get ignored, demoted, blacklisted? So generally, when it comes to web search, we don't look at things like saying, well, this is the same business name as the other one, therefore they must be the same. We look at these on a per-web-page basis. And if these web pages are unique, then we will show them separately in the search results. It might be different in Google Places, in the local listings; I don't know specifically how they handle that there. If that's something that you're worried about, then I'd recommend posting in the Google Places help forum to explain and elaborate on the situation, maybe provide examples, since it sounds like you have a specific case in mind, and get their feedback on that. But when it comes to the web, when it comes to normal web pages, we see these as separate web pages, and we will try to rank them separately.

Are links within H2 tags penalized? No. So if you have an H2 tag and you have links in there, that's perfectly fine. That's not something I'd see as problematic from a web search point of view or from a web spam point of view. That's essentially just the way that you're marking up your pages. In general, headings help us a little bit to understand the structure of a page. So I wouldn't mark up the whole page as an H2 heading or H1 heading, but rather use it to kind of separate the individual parts of your page, with semantic markup using headings and the body, and combining things together like that. But sometimes you do have links in headings, and that's perfectly fine.

I asked this question last time, but we didn't get it answered. Can Googlebot see SoundCloud multi-track players embedded on independent music websites as actual SoundCloud audio players? Or do they just look like big blocks, and might that make Google think that they're just advertisements? So we probably don't specifically recognize the SoundCloud player; I would be surprised if we could recognize that specifically. Maybe there are ways we can pull that out. But we'd probably see this more as kind of a way of encapsulating multimedia content, and we'll take that into account like that. So we definitely, or I can't say definitely, but almost certainly wouldn't see them as advertisements, because it's something that is kind of widespread on the web, and in general we do try to recognize that properly. It's a bit different when it comes to things like video embeds, because we do use videos directly in video search, and we try to recognize that a bit better.
When it comes to just pure audio embeds, it's sometimes a bit trickier, because we don't have this kind of audio search. But I don't see us taking that into account as any kind of advertisement. What you can do, if you're really curious or really worried about this with regards to your pages, is maybe send me some sample URLs that I can take a look at with the team, and I can double check that we're picking these up properly. Again, we'd probably not pick up the audio content directly, but we would see that this is kind of like a Flash embed or an HTML5 player, or however you have that set up. And we shouldn't be mistaking that for an advertisement.

We're planning to give a widget to different websites, with "powered by our website name" as an option. Is that OK, or is it not OK? As long as you provide that as a clear option, then I don't see a big problem with that. So that's essentially fine if your widget provides value that people want to put on their website. And if they're OK with saying, I got this widget from this other website, then that's essentially up to you.

Why does Google show the wrong page when the search keyword matched a link's anchor text? When matching a query with page content, does Google consider internal link anchor text too? So I don't know why Google sometimes shows the wrong page in the search results. That seems like something that you could submit feedback on at the bottom of the search results page; that's always useful for the search team. Sometimes there are different things that we pick up, and maybe we get something confused. So that's always useful. With regards to internal links, yes, we do take into account anchor text on internal links as well. That does help us to better understand the context of that page within your website. So using good anchor text is always a good practice. Don't just link with "click here"; that makes it really hard for us to figure out what it is that you're actually linking to.

John, what about links that are used in, what's the tag called, select options, I think, or something like that, from select dropdowns? Let's say you go to an auto parts store, and you need to select the model and the year or something like that, and once you select it, it leads you to a certain URL. How does Google interpret that? Does it see it as a link, even though there's no anchor tag? Probably not as an anchor tag, because that's something that would be done with JavaScript if it's just one dropdown; you'd have to do something fancy with JavaScript. If it's multiple fields and you click the Submit button, then it's done with a form. So we probably wouldn't be able to pick up any anchor text for that. But if we can crawl to that URL, then that is at least something that we can pick up, where we can take the content from that page. But if it's purely done with JavaScript, then we don't have any anchor text. Right, but could you crawl the link and pass any ranking value? Sometimes we can crawl the link. So it kind of depends on how you implement that on your side. If it's simple JavaScript, then we can kind of figure out, OK, this item refers to this specific URL, and we can recognize that URL and try to crawl it. If you're doing something really fancy where, essentially, you have to submit something, the server does something, and then redirects you to another page, then I suspect sometimes we might get that wrong, or we might not pick that up properly.
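To make that distinction a bit more concrete, here is a minimal, hypothetical sketch (the paths and labels are made up, not from the Hangout) of the two patterns being discussed: a dropdown that navigates via JavaScript, where the URL is only a string for the script to handle, versus a plain anchor with a crawlable URL and anchor text:

```html
<!-- Dropdown navigation: the URL only appears as an option value handled by JavaScript,
     so there is no anchor tag and no anchor text for Google to associate with it. -->
<select onchange="location.href = this.value">
  <option value="">Choose a model year</option>
  <option value="/parts/civic/2014">Civic 2014</option>
  <option value="/parts/civic/2015">Civic 2015</option>
</select>

<!-- Plain link: a crawlable URL plus anchor text that describes the target page. -->
<a href="/parts/civic/2014">Civic 2014 brake parts</a>
```

The second form gives Google both a URL to crawl and anchor text to work with; the first may, at best, yield a URL that can be discovered and crawled.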
OK, and since there is no attribute you can add anywhere there, do you usually treat it as a followed link if you just find it in plain text or in the source code? If it's just text in a string in JavaScript, we would treat that as a nofollowed link. OK. And if you're implementing it as a link and using JavaScript to build that link on the page, then you can specify a nofollow as well, with the DOM manipulation in JavaScript. Yeah, OK, thanks.

All right, what if I change the content, drop in ranking, and then publish the old content on which I was ranking again? Would I get the same ranking again? Probably not. So if you change the content, we reprocess your website, and then you change it back to the old content, then essentially you're changing the content again. That's something where, over time, those signals probably change a little bit, and you probably wouldn't be ranking the same way as before. Just like if you leave the content the same for a longer period of time, your ranking will change over time. So just changing things back doesn't necessarily mean you'll get that old state back. You might get a new state that reflects the old content that you had before. So that's kind of a tricky situation, where if you change something back, you don't necessarily have the old state again. Otherwise, you could take something that ranked really well maybe two or three years ago and assume that it would always rank well, and that's definitely not the case.

I guess the question was a bit different: whether you get any bonus or penalty for being old content that has a tradition of being there, versus fresh content, all else being equal, not in relation to other sites. Because obviously, even if you don't change anything, other sites will do better or worse and your ranking will change. But I think the question was more about whether there are effects because this content has been here for two years, it has a tradition, so a fresh one would do better or worse. It depends on the niche or industry. Yeah, I don't think you can say that in general and say, this is the way it'll always be. I think that's something that really depends on the rest of the ecosystem in that area where you're trying to rank. Sometimes it makes sense to have content that's kind of older and has stabilized a little bit, and sometimes it makes sense to update things over time. And sometimes that can shift from one to the other. For example, you have a physics website that's talking about a specific topic and nothing has changed there in the last 5 or 10 years, and then suddenly a new discovery comes out where everyone wants to find that new information on the site. So that's something that can change over time, where you have to kind of be in touch with your users and make sure that what you're providing matches what they're expecting.

All right, another repost from an earlier Hangout. Search terms for my industry are dominated by thin content and doorway domains. What's up with that? I took a look at that thread and sent it on to the spam team to double check. So I don't know specifically what is happening there, but I did pass it on to the team to double check.

Any solution to integrate AMP if we change URLs in photo galleries for each photo? I don't know specifically how you mean that.
So what I would do in a case like this, where you're talking about the technical implementation of how you could set up AMP pages optimally or use the AMP markup best, is maybe post on GitHub or on Stack Overflow with the AMP tag, so that someone from the AMP team can take a look at that specifically. I believe there's a component for photo galleries in AMP, but I don't know the specifics there. So I'd really double check with the AMP experts to make sure you have things covered.

For a site in Ember or Angular, where it's only Fetch and Render that shows how Google finds these pages, but we don't have any option to see the meta tags or canonicals on these pages, what can we do there? Yes, that's correct. At the moment, there is no Search Console tool that shows you the rendered source of these pages. However, you can, of course, take a look in your browser. So you can use Inspect Element in Chrome to see the version of the page that's built up by the browser. I believe, in the meantime, there are also third-party tools out there that will render a page kind of like Googlebot would and give you the HTML source that's generated from there. So those are kind of the directions that I would head in there. In general, as long as you're not cloaking to Googlebot, these other tools will work just as well.

Can having too many noindex pages hurt a site overall? No. Of course, as long as these are not the pages that you're trying to rank in search, because if they're noindexed, we probably won't show them in search. But otherwise, perfectly fine.

We recently had a URL that was showing first on the search results page, but it has completely lost its ranking. Is it possible to find out why the URL lost its ranking? No, not necessarily. Unless it's something really obvious from a technical side, it's not possible to kind of look up what specifically happened with this URL such that it's no longer ranking for this query. But a lot of times there are technical issues that you can double check. So things like a noindex, or a bad canonical setup, or maybe error pages that are shown to Google, these are things that you can find more information about in Search Console.

If an e-commerce site outputs 100 items on a single page versus, say, 24 items, would all of that content be taken into consideration when calculating relevance, or would content lower down be weighted with less importance? Hard to say. I think, in general, what would happen in a case like this is we would try to focus on the individual items themselves, versus kind of seeing where on this page those items are linked. So if you have a list of 100 items, then we'll try to use that list of 100 items to find the item pages and rank those item pages separately. We'll also try to take that list and say, this is kind of like a category page, and treat it as a general category page based on the content that's on there. But whether you have 100 items or 24 items on a category page is essentially kind of the difference between a page that has a lot of text and a page that has not so much text. And both can be usable pages. It's not that more text is necessarily better, or that you rank higher with less text. It really depends on what you're providing there. But especially with e-commerce sites, we probably try to focus on the individual items themselves anyway, so it wouldn't make that much of a difference.

What if we have several ad calls on a page? Each one has different floor prices, so that actually just a few of them are rendered.
Is it a bad signal per se to have several ad calls on a page, even without necessarily showing an ad to every user? I'm not sure what you're trying to do in a case like that. We do try to render the page the way that it's actually shown to the user. So if you're doing fancy JavaScript and that JavaScript doesn't result in anything actually being shown on the page, then we wouldn't use that with regards to indexing. So if your ad calls, your JavaScript, make server-side calls but don't actually show that content on the page, then that's not a part of the page.

If my desktop ranking is higher and my mobile ranking is lower, should I focus only on page speed to get equal ranking with desktop? I don't know. I wouldn't necessarily say that page speed is the only thing that you need to focus on. But if you're seeing significant issues with regards to the usability of your mobile pages, then that seems like something worth focusing on regardless.

How should I list my software in these enhanced search results? So there is a page on the developer site, specifically with regards to software listings, that has the markup on there. But I double-checked before the Hangout, and it says this is currently still in testing, so it's not available for general use. But I believe there is a form there that you can submit your site through and say, hey, I implemented this markup, can you take a look, and perhaps let me take part in this beta as well.

Many pictures on websites can be compressed with services like TinyPNG. How important is that for search? At the moment, that's not critical. Large images do mean that the page loads a bit slower, so that's something you might want to take into account with regards to usability in general. It can also mean that there's a higher load on your server, more bandwidth that needs to be used for crawling, but also more bandwidth used by users. So that's something that may play a role as well. If your site is able to handle the load with ease, then probably it's not so critical. Obviously, making things faster for users is always a good thing. With regards to search and rankings specifically, I don't see this as a critical ranking factor. Obviously, indirectly it might play a larger role, in the sense that if users don't come back to your website and they don't recommend it to other people, then that might be something that we pick up on and say, oh, nobody's actually recommending this website, maybe it's not as good as we thought.

In the UK, we have to provide a banner showing that our site uses cookies, which includes an external link to a site about cookies. It's a followed link. Is that a problem? In general, that's not a problem. As long as this is like a banner on top, then we can recognize that as well, so that's essentially less of an issue there. Whether that link is followed or not, I suspect, doesn't play a role at all.

I have a shop where I have a randomized display of products. I want to provide users with new content, but I'm not sure this is the best approach from an SEO point of view. So I assume this is something like on the home page of the website. And with regards to that, that's essentially up to you. So if you want to provide a random selection of products, that's fine. One thing to keep in mind, though, is that the links that you have on your home page tend to be things that we think are more important.
So if you link to specific products from your home page, we might think, oh, this is probably something really critical and important for this website, therefore we'll give it a little bit more weight. So if you're randomizing the products that you're showing on the home page, you're spreading this across all of your products, which might be fine. But it might also be that you could actually achieve more by saying, well, these are really my primary products, or these are the ones that I know most of the users are happy with, these are the things that I really want to promote. Maybe I'm running a special for these products now, maybe these are new products that I just got in, maybe these are things that I earn the most money off of. And with that thinking, you could say, well, this is what I really want on my home page, and provide that as the baseline set of product links on the home page. And if there's room for other things, then maybe it makes sense to kind of spread that out across the rest of the site. But in general, you probably do have some preferences with regards to which products you want to promote most, and those are probably the ones that you want to put on your home page.

John, in regard to that, we saw an issue with, well, it's not random, it's like the opposite of randomizing. We have IP delivery for the content because of the type of site we are: people in New York should see stuff in New York, because it's most relevant to them. But we went through a period where that caused an issue, and it still is causing an issue, where, because you're indexing from California, almost every result we get is from California. So it looks like we only have stuff in California as far as Google's concerned. It's best practice for the user to do this, because there can be 3,000 miles between a wine tasting or a balloon ride if you're sitting in New York versus California. But is there any proposed way of handling that? Because we don't want to show different things to Google than we do to users, although there would be a very good reason for doing so. So do you have any current thinking on that?

I think, especially with regards to the home page, that's always kind of tricky. But what I would recommend there is to make sure that you have evergreen content, or something like that, linked on there as well, so things that you could show to all users. And then you could still have this block with the personalized version as well. But at least you'd have this kind of block of things that are relevant for all users, which might be category links, for example, or the main locations where you're seeing users coming from.

Well, that's creating a circle where the visitors are coming from California now, because that's where we're ranking. I mean, if you Google a generic term, and you know what we sell, so if you Google a generic term like experience gifts, which is the industry term for what we do, the first page that ranks is experience days for California, not our home page. So it's gone as far as Google thinking we're basically a Californian offering. Not that our site is the best guide for anything, but... Yeah, I suspect you probably just need to split things up a bit more on the home page. I haven't taken a look at your site specifically recently, which is probably a good thing. Sometimes a good thing. I'll be calling in again on Friday, so you can have a look then.
But if there is a way to make sure that you have a generic block of content on there, in addition to the personalized block of content, then that's something that does help us to figure that out a little bit. Is it best, then, to present visitors with a simple click to show things nearest to me, like most mobile stuff does these days? You could try that, yeah. And then the rest of it is just general? Yeah, I mean, it kind of depends on how you can set that up. But that's something I would test out, yeah. OK, all right, thanks. You'd have to check that yourself.

All right, Data Highlighter. Is it actually better doing it from Search Console, or is it better to add markup to the web page? What's the best practice? The Data Highlighter is a great way to try out structured data, from my point of view. A lot of times you'll have a website that has content that could use structured data, but getting structured data implemented is this big task where you need to prioritize time with the engineers and all of that. And using the Data Highlighter is a neat way of testing things out and seeing whether structured data markup actually makes sense for my website. You could try it out like that. In general, I recommend, for the long term, switching to actually putting the markup on the page. Because that way, on the one hand, the markup will be available for anyone that can consume this markup. And on the other hand, you can be sure that the markup is picked up correctly. In particular, if you change your layout, if you change things subtly on the page, then the Data Highlighter has to kind of relearn that from your templates. Whereas if you're marking the content up directly, you don't have to worry about that: you specify this is the price, this is the date, and it'll always be marked up like that. So from that point of view, if you want to plan for the long run, I'd recommend doing it directly on the page. If you want to try it out, then give the Data Highlighter a spin.

To remove pages from Google's search index, is it better to return noindex, nofollow or just noindex in the X-Robots-Tag? Essentially, both work. Noindex, nofollow means none of the links are followed as well. If you just want to remove that page, either of these options is fine.

At a recent SMX, Maria said there's a limit on the number of URLs that Googlebot can crawl. Can you tell us what this limit is? There is no absolute limit. So it's not the case that Googlebot only crawls 100,000 pages from every website. Rather, we try to figure out how much we can actually crawl from a website, with regards to the website itself and how it's hosted. There's a recent blog post about that, from maybe a month or two ago, by Gary, where he talks about crawl budget, essentially. So I'd take a look at that blog post and see what applies there. But it's definitely not the case that there's any hard-coded limit where we say we only crawl this number of pages.

All right, let me just open it up for more questions from all of you. What else is on your mind? Quite a short one. All right. I'm going to start a new site myself, and I see a lot of other sites going with the naked domain, so without the www. Is there any public guideline or recommendation from Google on whether we should use www or the naked domain? No. That's totally up to you. When I see that online, I assume maybe it's because they have a lot of subdomains, but for a site not having subdomains? That's totally up to you.
Sometimes there are technical reasons to go one way or another. For example, if you want to use a content delivery network, sometimes you have to have a host name and you can't use the naked domain. But essentially, that's totally up to you. From a search point of view, it doesn't matter. OK, thanks.

I have a question, a basic one. All right. With a normal 301 for a product page, going from a product that was sold and is now out of date, so we're taking it down and we're redirecting it to a new product: is there a best practice for keeping the old one live? Do you recommend keeping it in the sitemap? Because you can't put it in categories anymore, since you don't offer it, and you don't want to mislead people by them clicking on it and ending up somewhere else. But if you don't have it there or in the sitemap, unless there are any external links, you're essentially killing it. A 301 is only as good as something pointing at it. Do you, or anyone else on the call, usually keep that in the sitemap for a few months, or somehow leave some kind of remnant of it live to allow the 301 to bite? Does it work? Because there's no real point otherwise if you don't. I know that's a very basic question.

I guess what generally happens there is we recrawl those URLs anyway. We've known about them, they were linked before, so we'll assume they're still linked somewhere, and we'll just recrawl them regularly anyway. So how do you do that if there's no way to get there? We have a big memory. Google remembers lots of crazy things. You will sometimes see this in the 404 error report as well, where people come to us and say, well, I removed this page like five years ago, how come you're still trying to crawl it? And we're like, well, we still know about it; we think maybe it has come back from the dead after all these years, so we just want to try it. So you go there directly, you mean, without using other links? If we know about it from before, then we'll try to take a look again.

But what I would do there, if you're trying to figure out how long you should keep the redirect in place, for example, is take a look at your log files and see whether people are actually going there or not. And if no people are going there anymore, then maybe at some point you can say, oh, I'll just drop this redirect. On the other hand, if your CMS can just handle all of these redirects without a problem, then I would just leave them there. It's almost more hassle to figure out whether people are actually going there than it is to just leave that redirect in place.

It's more for the first three to six months, where we're just making sure that the new one fully replaces the old one in the index. But I didn't want to basically create a dead end for Google where, because it's not in our sitemap and we don't have it anymore, and because with 5,000 products you're not going to have external links to all of them, there's just no way for you to get there. Yeah, apparently there is. I wouldn't worry about that. We recrawl all of our old things every now and then. Some people want that, some people don't. Essentially, we know about these URLs, so we want to double check what happens when you go there. OK.

John, two quick questions. One's about that university website, if you have any other news. I noticed we're still not showing up at all for like 95% of the relevant keywords out there. Again, it wouldn't be a problem if we were showing up at like position 50, and I know we'd have to move up, but we're not showing up at all.
It's really, really odd. I talked with the team about that, and they're saying it's ranking as it normally would. But it's not ranking at all; I want it to rank more. I know, I know. But from their point of view, it's essentially working as expected. So it's not odd that it's not showing up at all, even if I go to the last page of results? I don't know, maybe. I mean, just because it has that content on the page and it's a university website doesn't necessarily mean that it'll show in the search results for that. So from their point of view, it might just be something in the way that we're ranking this website. As long as those URLs are indexed and there's no manual action visible there, then from our point of view that's essentially working as it should be. So it's not a case like Rob's, where you want it to rank, but it's... Right, right. It's not a case like Rob's website, where it has been included in a certain issue. OK. OK, I'll keep trying then.

Second question, regarding IP-based redirects for multi-language websites: is there, I'm not sure, a best practice blog post or something like that? If users land on the home page and then a 302 redirects them to the appropriate language version, is that usually fine? Hreflang is set up correctly; it's just for people who go directly to the URL of the main home page.

So essentially, what you want to do is, let me see if I can find the blog post, what was it called? So what you want to do is have one URL that does this IP redirect, and not use the IP redirect on all of your URLs. So basically, you have one generic version of the home page that does this fancy redirect, and that's the one you mark as the x-default. And the other pages, the individual country pages, are the ones that you let get crawled and indexed normally. The blog post is called Creating the Right Home Page for Your International Users, from 2014. I believe it covers the part with redirects as well, but essentially the idea is that the x-default is the version where you do the fancy redirects, and the other versions are ones that we can crawl and index normally. And the important part is that we can actually crawl and index those other versions separately. Otherwise, if you always do the redirect, then we'll always see the version that US users see and never see your other content.

Right. So the redirect happens just the first time a user goes to the home page. But if they leave the site and try again, unless it's in an incognito window, they won't be redirected again if they go to the home page again. I suspect that won't work, because Googlebot doesn't supply session cookies, so it would look like a new user every time Googlebot goes to your home page. And is that the problem? Then we would always see that redirect to whatever the US version of your website is, for example. Well, the US version is the home page; the English version is the main home page. And then we have a 302 for Romanian, Italian, and Spanish. Then we probably wouldn't see those Romanian, Italian, and Spanish pages, because we would always be redirected to the English version. But I know you started crawling from other countries as well, if I remember, last year, correct? Or something like that. Only from a really, really small number of countries. That's something where we noticed we weren't getting that much value out of actually doing that, so we haven't been ramping it up. It would create like a giant load on a lot of websites, too.
We can't crawl from all countries. Right. But if there is a menu that you can use to access the individual versions, and it works without redirects, is that fine? That would work, too. But what would probably just happen is we would think the English version is really your home page, because every time we go to your home page, we get the English version. And we'd see the Romanian and the Spanish versions kind of as links from the main home page. Even if hreflang is set up between them? Probably, yeah. Because if you go to the Romanian version, which is /ro, you won't get redirected, because you accessed it intentionally. Yeah. But I would take a look at the blog post and kind of double check how we're actually crawling there. OK, OK. Cheers.

All right, I need to run, but it's been great chatting with you all, and I hope to see you all again in one of the future Hangouts. And if I didn't get to your questions, feel free to add them for the Friday Hangout, or I'll set up the new ones in two weeks again. All right, bye, everyone. Thank you, John. Have a good day. Thanks, you too. Bye, John.
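For reference, a minimal sketch of the international home page setup John describes here and in the blog post he mentions: one geo-redirecting entry URL marked as x-default, with the individual language versions crawlable on their own URLs. The domain and paths below are hypothetical, not from the Hangout:

```html
<!-- hreflang annotations, placed in the <head> of each language version
     (the same set can also be provided via sitemaps). -->
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
<link rel="alternate" hreflang="en" href="https://www.example.com/en/" />
<link rel="alternate" hreflang="ro" href="https://www.example.com/ro/" />
<link rel="alternate" hreflang="it" href="https://www.example.com/it/" />
<link rel="alternate" hreflang="es" href="https://www.example.com/es/" />
```

Only the x-default URL would do the IP-based 302; the /en/, /ro/, /it/, and /es/ pages stay directly crawlable, which is the point John makes about Googlebot otherwise only ever seeing the redirect target.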