All right, welcome, everyone, to today's Google Webmaster Central office hours hangout. My name is John Mueller. I'm a webmaster trends analyst here at Google in Switzerland, and part of what we do are these office hours hangouts, where webmasters, publishers, and SEOs can jump in and ask their SEO or search related questions around their websites. There are a bunch of questions already submitted, so we have some stuff to go through. But as always, if any of you want to get started with the first question, feel free to jump on in.

Can I ask a quick question, John? Just a quick one. Do you remember the other day when I was saying that I did some fairly big changes to a site, where I started to, if you like, allow Googlebot into areas that had not been allowed for a while? And I made some different choices on something that had canonicals pointing to some sections and stuff. Because it's quite a template-ish site, that made quite a big difference to a lot of pages, as it's dynamic. And then all of a sudden, the day after, when I was working on it, I literally went and looked at real-time analytics, and there looked to be almost this swarm of direct traffic going to the cache. It was literally like "search?" or "cache?search" or something like that, and it was literally flying through the URLs.

When I went to the information retrieval summer school last year, one of the talks was about efficiency in storage in the index, and in indexing generally. One part of it was around how they have these separate storage areas. So say, for instance, something had not been accessed for a while because it was canonicalized. You didn't actually have to keep crawling it, because it had been canonicalized and you recognized that in a pattern, so you wouldn't have to keep going to it at all. So even if you weren't going to re-index it from that activity, you could potentially want to go through that old memory and have a look and say: well, actually, is this much different, and is it actually worth re-indexing? And the quickest way to do it might be to just fly through, go back into the old storage system, the long-term memory if you like, and just zoom through from the cache and say: rather than crawl all that, it might be worth just having a quick look. I just wondered whether that might have been what I saw.

Because the day after, when I checked in Search Console and looked at crawl stats (obviously I was a bit alarmed, because I thought: is it some scraper or some DDoS? I wanted to check), there was a huge spike in Search Console crawl stats. And then the day after, there it was again. And the day after that, it was triple. So it's been three or four days now of a huge spike there. And I saw that real-time cache thing twice, and it was from various locations simultaneously, all over the place. So do you think that might have been potentially that kind of behavior?

I don't know where you're seeing the cache behavior.

I saw it in analytics.

Yeah, I don't know. So that's the URL that it was calling? Was it like some cache URLs?

Exactly. It was literally the cache, from the old cache, from the Chrome cache.

I don't think that would be us. I'm not exactly sure where you mean or what that looks like, but it doesn't sound like that would be an effect that you would see from Google's crawling.
Because usually what would happen in a case like this is, if we recognize that there are significant changes on the website, we'll just try to crawl the known URLs a little bit faster. So we have a bunch of URLs that we already know from your website, and we basically decide: oh, we want to make sure that our index is as fresh as possible, therefore we'll take this list of URLs and crawl them as quickly as we can. It kind of depends on your server, what we think your server can take, but we'll try to get through that a little bit faster than normal. That's particularly the case when we find significant changes across the site with regards to maybe structured data, or URL choices, canonicals, redirects, those kinds of things. So maybe that's something that was triggered on your site, depending on what you changed. But in general, that would be visible as normal Googlebot requests. We would just go through our list of URLs and check them all. It wouldn't be the case that you would see something that looks like a Chrome cache or something. So I suspect what you were seeing there is either an effect that analytics picked up kind of incorrectly and didn't really know how to interpret, a bug, maybe an analytics bug, or maybe it's totally unrelated. Maybe Googlebot was crawling and this other random thing was happening at the same time, and it looks like they're related, but actually they're two completely separate things.

Correlation versus causation. That's that old thing, yeah. OK, thank you, thanks.

All right, any other questions before we head off into the submitted ones?

I actually submitted this question, but can you answer questions about the ad review?

Not really. Which ad review do you mean, the Ad Experience Report? Then just a little bit. I was mostly just watching what the team was doing there and making sure that the integration with Search Console was working well. What kind of issues are you seeing there?

Well, the ads are following the Coalition for Better Ads protocols and stuff, but we're still seeing that we're failing the review. So we're going to be moving over to more of a welcome screen type of thing rather than a prestitial ad, to try to pass the ad review. But the deadline to do that, for us, is March 22, and that's just right around the corner, and we need some time. It's more the lack of information about why we failed the review process than that we did fail. So we just don't really know where to go from here. We can make a change and do the review again, but we literally don't have any time now. So I was wondering, what can I do?

I suspect that's tricky. What I would definitely do is post in the help forum that they have. I don't know if you've posted there. The team has been watching those posts and trying to do the right thing there with regards to responding and giving more information. So that might be a good place to start. That's also the best place to reach the team if there's anything specific that you need to pass on. I don't know if they would do things like extending that time, or how that's set up.

OK, all right. Thank you. I appreciate your help. Sure.

Hi, Joe. Hi. Sorry, I'm late. I have a question. In fact, I have two questions. First: where can I send the screenshots or other details that we talked about?

You can send them to me on Google+, so that's John Mueller on Google+. Should be pretty easy to find.
That's also where I think we share the hangouts. If you add me to a thread on Google+ and remove the public setting, then it's a private message. Sometimes it's a bit confusing. OK, got it.

And the next question, which is more important: we have a section on our website with around, let's say, 20 million pages. But there was a problem there. We think we fixed it one month ago. We removed all the auto-generated pages, search pages generated from titles, so artificial content. We think we fixed this, because I saw details about it in the new Google Search Console, where the excluded count went from around 10 million to 3 million now. So I think Google managed to remove those auto-generated pages from the index. But now the big question is when we'll get to see results from this, because we're waiting, but we don't know how long it takes.

So was there a manual action on your website?

No, it was a gradual thing, because we've seen a decline since 12 months ago.

So usually these kinds of changes take time. It's something where the algorithms have to reevaluate your whole website overall, and a lot of low-quality pages can play a role there, but sometimes there are just so many other different factors that come into play. And that's something that doesn't just jump back up if you remove one part. You really have to think about: what can I do across the whole website to make sure that it's significantly better? Even if you make big changes with the design and the functionality, and you add new features and things, I would definitely expect that to take multiple months, maybe half a year, maybe longer, to be reflected in search, because it really needs to be reevaluated by the systems overall.

OK, so we wait.

Well, I wouldn't wait. I would continue working on your website.

Yes, of course. It's not finished yet; we used to just wait.

Don't give up, at least. Yes. OK, thanks. Sure.

John? Sorry, I was just going to make a contribution there, if you're all right with that. I've had similar problems with stuff being indexed that shouldn't have been, and then it takes forever on bigger sites. And one thing I found that seems to be working quite well is really about reevaluating categories and thinking: actually, would it be better to just combine these categories? Even if they're not the same, sometimes I think it comes down to the query clusters meaning the same actual query, even if the content is not duplicate. I've literally found that that's had quite a big impact, where we've literally said: well, actually, this and this category kind of mean the same. They don't have the same words, but they actually mean the same, and they may be ambiguous. And sometimes Google was ranking one, and sometimes it was ranking the other. Just merge the two, redirect one to the one that's actually ranking more. That seems to have had quite a big impact in fixing that. So, I don't know, I hope that's useful.

I think that's definitely an approach to take as well. It really kind of depends on the website. But if shifting things around in categories, combining things to make really strong category pages, is an option on your website, that might be a way to also lift things in the meantime.

Can you hear me? Yes, you can. OK. Going back to the gentleman's comment:
Assuming a site's re-indexed... actually, this is one of the questions that I'd submitted. Assuming that his site's completely re-indexed, I didn't know which factor you were saying was going to take time: is it the re-indexing process, or the re-evaluation after the indexing process?

Usually it's the re-evaluation process. The re-indexing is usually more of a technical thing that can take a bit of time, but the re-evaluation of the whole website overall, recalculating all of the signals that we have for the website, that's something that usually takes quite a bit of time.

Thank you. And it's really helpful to know that you can't just take out the bad content; you have to really improve what you have. And I know there have been comments from your team about that before.

Yeah, I think that's something that's really important. It's kind of the same with links as well: if you remove all of the problematic links, that doesn't mean you automatically have more good links. Well, you removed all of the bad stuff, but there's still this lack of good stuff on the website. So you really have to do both: clean up, take all of the bad stuff out, and make sure that the good stuff is actually a lot stronger.

Right. Because when you had those bad links, you were basically overranked. So when those links get devalued, or you take them off, maybe you really only had 10% good links, for example, so there's not much to rank on. Yeah, I understand. Cool.

All right, let me run through some of the questions that were submitted as well. We use a third-party review company and embed their code in our website with an API for collecting and showing service and product reviews. I understand we don't get any direct SEO benefit from the review text by doing it this way. So we're now thinking of collecting our own reviews, so customers can submit a product review directly, which we will show on our site and also in the markup. Is there any issue with us doing this, or must we use a third-party company for collecting reviews?

No, you can definitely do it yourself. A lot of websites do it themselves. Sometimes it's a matter of finding the right plug-in for your server to make it easy to manage all of this. But collecting reviews is something that's not really that tricky on a web server. So if you can do this yourself, if you have the capability to install something or get some code to do the reviews yourself, I think that's a great thing to do. (A small sketch of what that markup can look like follows below.)

We're creating new pages on a regular basis. In Search Console, we use fetch and render and then submit for indexing. But after 10 URLs, it always says: an error has occurred. What's happening there? How can I get more?

So in general, I would only use the submit-to-indexing feature for things that are really critical, where you're saying: this is a completely new website and Google has never seen it before. That's the case where submitting to index makes sense, because you're telling us about it for the first time. If you're modifying things on your website, if you're changing things, if you're adding new pages, I would always let that run through the normal organic process. That means we either crawl your website, discover those links, and index those pages from there, which is one great way to do it. Another way is to use a sitemap file to tell us about all of your new and changed URLs directly, with a change date, and then we can go off and crawl those URLs directly.
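As a rough sketch of the review markup mentioned a moment ago: the transcript doesn't show any code, but a self-collected product review might be marked up with schema.org JSON-LD along these lines. The product name, rating values, and reviewer are made-up placeholders, not anything from the discussion.

    <!-- A minimal, hypothetical sketch of self-hosted review markup
         using schema.org JSON-LD; all names and values are placeholders. -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Example Widget",
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.4",
        "reviewCount": "89"
      },
      "review": [{
        "@type": "Review",
        "author": {"@type": "Person", "name": "A. Customer"},
        "reviewRating": {"@type": "Rating", "ratingValue": "5"},
        "reviewBody": "Works exactly as described."
      }]
    }
    </script>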
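And as a minimal sketch of the sitemap approach just described: a sitemap file that reports new and changed URLs with a change date might look like this, with the URL and date as hypothetical examples.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- One <url> entry per new or changed page; <lastmod> carries
           the change date that crawlers can use to prioritize. -->
      <url>
        <loc>https://www.example.com/new-page</loc>
        <lastmod>2018-03-09</lastmod>
      </url>
    </urlset>

At the time, most sitemap plugins would also ping Google automatically whenever this file changed, which is the kind of setup described next.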
Most CMSs let you create sitemap files automatically, or have plugins that do it for you. So if you're using a CMS, and it sounds like you are if you're creating pages on a regular basis, you probably wouldn't do that by hand. So in a case like that, make sure you submit your sitemap files, make sure your CMS is set up to ping Google when the sitemap file is updated, and then you really don't need to use these manual tools for submitting URLs.

Our homepage implements geolocation 302 redirects for both users and crawlers. The homepage has a very strong external link profile. As most Google crawler visits are from the US, does that mean that most of our link juice gets passed to the US domain? Like Google.com, we have a strong business reason that requires geolocation 302 redirects on the homepage.

So in general, if you're always redirecting Googlebot to one version of your website, then we will probably assume that version is your main version, and we'll try to rank that as your primary website. That's a bit of a problem with these geo-redirects, in the sense that Googlebot might not even have a chance to view the other versions of the website or to run into them. So my general recommendation, instead of using redirects or swapping out the content directly, is to use something like a banner on your pages, so that when people go to your homepage, for example, they see a banner on top saying: hey, it looks like you're from the US, here's our US homepage, and it kind of guides them to the appropriate language or country version. By doing it like that, Googlebot will always be able to crawl all of these different versions. It will know that maybe one of these versions is a generic homepage that you show to all users. You can notify us about this with hreflang markup: you can say the homepage is your x-default, for example, meaning it's the version that's shown by default, and then you have maybe a French version and an English version, or versions for individual countries, or however you have that set up. You can tell us about all of these different language and country versions with hreflang markup as well. And then what will happen in the search results is that we'll know these pages all belong together, and we'll swap out the URLs in search to match the right one.

That works really well with banners. You can also do it with a redirecting homepage, but it's sometimes a bit tricky to make sure that you set that up properly. In particular, the homepage itself would be the one that redirects, but the individual country versions would be non-redirecting. So anyone going to the US version of the homepage would be able to see the US version, and anyone going to the French version of the homepage would be able to see that. There wouldn't be a redirect from the French version to the US version if the user is from the US, for example. However, you might have a generic homepage, like just your root, and that could be the one that does the redirects. So in a case like that, you tell us the redirecting homepage is the x-default, you tell us about the two or more individual country and language versions, and then we can deal with that appropriately and say: this is the US version, this is the France version, and we'll show the appropriate one in the search results.
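A small sketch of the hreflang annotations being described, assuming a hypothetical example.com with a redirecting root as the generic version: these link elements would go on each version of the homepage, and every version lists all of the others, including itself.

    <!-- Hypothetical hreflang cluster: the redirecting root is the
         x-default, and each country/language version is listed. -->
    <link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
    <link rel="alternate" hreflang="en-us" href="https://www.example.com/us/" />
    <link rel="alternate" hreflang="fr-fr" href="https://www.example.com/fr/" />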
But if you don't use the hreflang markup, and in particular if you don't let users from other countries access the "wrong" country version of those pages, then what will likely happen is that we focus everything on the version that we can look at, which might be just the US version. And if your business is focused primarily outside of the US, that might not be so great. So that's something that's sometimes a bit tricky to set up. We have a blog post about this, maybe three years back, called something like "the best version of your homepage", I think. I can add a link in the comments afterwards.

If we wanted to use JavaScript to localize some text on a landing page, depending on where the user's IP address is from, can Google read the text? Can you give me any advice?

So I guess that goes in a similar direction, in the sense that Google usually crawls from the US. What would probably happen is that we would always see the US or English version of that text, and we would try to index that. If it's generated with JavaScript, for the most part we can process that and we can see the text that's generated. And what might happen then is that we see your normal homepage, which might be, let's say, in French, and your JavaScript text is in English, because you think: oh, this user is from the US, I will show them English content. And then we have French content for the homepage and a section on top, or somewhere on the side, that's in English. And that might be a bit confusing. So that's something to keep in mind. There are definitely ways that you can do this in a neat way. What you might also do, if it's really just a small part of your page, is to robot out the JavaScript, so that you're making sure that no crawler is actually looking at that JavaScript. It's important that this is just a small part of the page, so that it doesn't come across as cloaking. Because if you're changing a large part of your page with JavaScript, and no search engine is able to actually see what that looks like, then it's really hard for us to understand where we should be showing your pages.

Are you talking about blocking access to that little section for Googlebot, John? When you say "robot out", you're talking about in-page, aren't you? Something like: don't allow that area to be crawled?

So in particular, I was thinking about the case where this is JavaScript that you have in a separate JavaScript file. Then blocking crawling of that JavaScript file would prevent crawlers from even trying to access it.

So blocking the actual file itself?

Exactly. Separate that functionality out into a small, separate JavaScript file, and then block that one. I don't think that's optimal, though, because it makes everything so much more complicated: you have to remember why you put this in your robots.txt file and why this piece of JavaScript is there. So ideally I'd try to find a way to just do that in a natural way, maybe directly on the page, and find a different way to deal with it. But depending on what kind of text it is and what the reason behind it is, it might make sense to go to extremes and say: OK, this piece of JavaScript, which is only the small bubble on top, I'm going to block by robots.txt, just to make sure that no search engine gets confused by it.
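As a small sketch of that last suggestion, with a made-up file name: if the geolocated text lives in its own small JavaScript file, a robots.txt rule can block just that file while leaving the rest of the site's JavaScript crawlable.

    # Hypothetical robots.txt sketch: disallow only the small
    # geolocation script, not the site's main JavaScript or CSS.
    User-agent: *
    Disallow: /js/geo-banner.js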
But OK, I'm just thinking that a lot of the time, as we know, JavaScript comes from, say, libraries, and they're massive. It's not easy to actually block just one function, because you'd have to separate it out somehow, wouldn't you? Because otherwise, if you blocked the whole file, that could have lots of other things in it that are also adding to the rest of the JavaScript on the site.

Yeah, it really depends on how you implement it. I mean, JavaScript doesn't have to be big. There's lots of stuff you can do with JavaScript with a handful of lines. But right, some of it is really big, yeah. OK, thanks.

With the new mobile index, would you advise having our product description shown by default on our mobile version, as opposed to currently sitting behind a dropdown?

I don't know. I'd probably have to take a look at an example page to see a bit more of what you're doing there, in particular how you set up the dropdown. But in general, if the content is on the page, so if you open the page in a browser and you can search for that content and it's there in the HTML somewhere, then we'd be able to pick that up for indexing as well. There are ways to implement this that involve requesting extra content from the server: you quickly load an initial view, and then when you click on it, it loads the content from the server. In a case like that, we wouldn't be able to pick up that content, because we don't know where to click. But if it's already loaded into the DOM when the page is loaded, then for the most part we should be able to deal with that properly. I think the bigger aspect you'd want to look at there is the usability aspect: if users search for a piece of that text, we show them that URL in the search results, and they come to your page, are they able to understand that this page matches what they were looking for, even if they don't see the text immediately? So that's something to watch out for. But you're definitely not alone with this problem. For example, Wikipedia pages have this quite a bit as well, where on mobile you have all of these little expando sections that you click on to expand and show the full content. And if you were searching for something that's somewhere in one of these sections and you land on a Wikipedia page, you might be confused and say: well, where is this piece of text that I was searching for? So you're kind of like Wikipedia in that regard. But I'd still make sure that the content is actually loaded on the page, so that we can even index it.

Spam reports: what happens to them? Do they all get thrown away? Nobody looks at them. Why bother filling them out?

More or less... no. We do look at the spam reports. They don't get thrown away. It's something that's really important for the team here, and the web spam team goes through these regularly. It's not the case that we go through all of them and say we need to take manual action on every single one of these reports. But what we try to do is figure out where the bigger patterns are that we see, and to take action appropriately there, and in particular to help improve our algorithms. So instead of going through a thousand spam reports that all report the same problem on different sites, maybe there is a way to improve our algorithms so that this particular problem doesn't work on any site. And then suddenly we've solved it for all of those thousands of reports.
So that's something that sometimes happens there, and that sometimes makes it look like nothing is happening: why do I even bother submitting them? But actually the team is taking this into account and thinking of ways to automate the whole thing, so that you don't have to bother filling out even more of them for the same kind of issue. If you find bigger patterns where you say the spam report form is really hard to use for that, or if you have a big setup where you're saying: oh, someone just set up a new private blog network with these 2,000 domains and somehow they seem to be getting away with it, then that's also something you can send to me directly. You can send that to me on Google+. I pass these reports on to the web spam team as well, and usually they're able to go through them and figure out where it makes sense to take manual action and where not.

The other thing to keep in mind here is that we take manual action in a variety of ways, and sometimes that involves just neutralizing the spammy technique that's being used. It doesn't mean that we remove the site in question completely from the index, and it doesn't mean that that site won't rank at all in the index. One example that always comes to mind is keyword stuffing, where a site might be using the same keywords over and over again, or hiding them in small or hidden text at the bottom of a page. One of the things that we try to do there is just ignore the keyword stuffing, so that when a site does this, we try to ignore that technique. We try to neutralize, essentially, the spammy effect of it. And the site might still be ranking. It might still be ranking number one for some queries, but it's not because of that keyword stuffing; it's because of all of the normal good things that they're also doing. So that's one thing to keep in mind.

Can I come in on that? Sorry, I don't mean to interrupt, but there was literally a question about keyword stuffing that I wanted to ask at some point, and you just brought it up. I was reading in a book on information retrieval that keyword stuffing was actually query agnostic: it was literally treated as a quality of the page. But that made no sense to me, because in a lot of instances I've seen pages where it pretty much screams out at you what they're trying to rank for, and it's just too obvious. What actually happens is that even when you type that query into search, they don't rank with that page, but they might rank with another page. So it's like that page is ignored, but if there's another page on the site that's also relevant, it's almost like there's an intentional ignoring of that individual page for that query cluster, and another page just gets ranked instead. So is keyword stuffing query agnostic? I mean, is it like: we see this site and it's got keyword stuffing and we just dampen the whole site down? Or is it mapped to query clusters as well, so actually it'll rank for some things but not others?

I think it kind of depends on what we find there. In general, we try to be as specific as possible with regards to the action that's taken.
So if there's a way for us to recognize that, for example, one page is keyword stuffing for this particular term or type of term, and we can take action on that in the sense that we neutralize the keyword stuffing effect there, then that's something that we'd like to do. In a lot of cases, that's not that easy. It might not be possible to recognize exactly which specific term they're trying to keyword stuff for, or which particular page within a website, and it might be a little bit broader. But in general, we try to be as specific as possible, so that we neutralize that particular effect but keep the rest of the site normal. Because one of the things that we see all the time is that people aren't trying to be abusive when they do these kinds of things. They hear about this technique from, I don't know, a friend of a friend of a friend, and it's an SEO technique from the 90s that somehow is still being used. It's not that they're trying to be spammy and rank for terms that are totally irrelevant for their business or their website; it's just that they happen to have done this, or maybe they copied and pasted it from an old version of a website. And we'd ideally like to just treat the rest of the website appropriately and say: well, this particular thing here is a problem, but the rest of the website is actually normal. We shouldn't be dropping the whole website if we can neutralize that one particular thing that they're doing wrong.

So in effect, though, is that something that might be connected, whereby you see the obvious page that would rank, the natural target for that term or that query, and that page just gets ignored and another one gets ranked instead? That might be an effect of keyword stuffing on that page?

I don't know if it would be that direct. It's really hard to say, and these are also things that change over time. So it's something where I can see how maybe you might see an effect like that, but I don't know if that would be by design, that we would just pick a different page from the same website, or if it's a side effect that's accidentally happening.

Even though it's massively, massively obvious that the target is a particular page, because that's what the content is, but maybe it's just overdone?

Yeah. I think it's also tricky to generalize from something that's very obvious for a person versus something that's very obvious for a computer or an algorithm, because sometimes what people see as totally obvious is totally unknown to the algorithm. A really common case that we see all the time is when people move from .html pages to .htm pages. Anyone looking at those pages personally would say: oh, this is obviously the same page. But if we don't have redirects from one version to the other, then our algorithms will be like: oh, this is a completely new URL, I don't know what to do with it, I'll just put it at the bottom of my list. So those are the kinds of things where it's sometimes tricky to say it's totally obvious and any algorithm, even a bad algorithm, will be able to figure it out, because sometimes our algorithms don't figure out what is totally obvious to a human.

Take synonyms, say.
Sorry, I'll just say this bit, because it has occurred to me as well: often I see a lot of people, or a lot of sites, where they overdo it with synonyms, because they don't realize that actually they're the same thing. I suppose you can often be keyword stuffing when you're literally just using synonyms. There's this whole strangeness where people are going crazy for some technology that maybe isn't even being used, latent semantic indexing, and they're using plurals, they're using stemming, and it just doesn't read well. They don't realize that all those words can actually translate really quickly into keyword stuffing.

Yes. Jeffrey, what's on your mind? OK, no, sorry, nothing, John, nothing. All right, video off, I think.

OK, so let me run through some of the other questions as well. Lots of stuff left to go, wow. Spam reports, we did that one. Ah, an hreflang mystery: if hreflang is used to point to several URLs containing the content in alternate languages, but those URLs don't contain the proper canonical tags, they're being canonicalized to one version of the content in a /us directory, then why would those URLs show up in search results? For example, when searching from the UK, /uk shows up in search, but those pages aren't indexed: an info: command shows the /us URLs, and the new Index Coverage report shows them as excluded. This goes against the proper hreflang setup, where each URL should contain self-referencing canonical tags.

I don't know this particular case, so it's really hard to say. But in general, we can follow the hreflang links even in cases where we pick one version as the canonical. This is really common in cases where the same-language content is used across multiple countries. What we might do is understand the hreflang links between these different versions, and understand that the content is actually the same, so we pick one version of the content to index, but we swap out the URLs anyway because of the hreflang connection that we have.
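For reference, the "proper" setup the question alludes to, a self-referencing canonical on each country version plus the reciprocal hreflang set, might look roughly like this on the UK page; the URLs are placeholders, not from the question.

    <!-- Hypothetical sketch for https://www.example.com/uk/ :
         the canonical points at the page itself, and the hreflang
         set lists all country versions, including this one. -->
    <link rel="canonical" href="https://www.example.com/uk/" />
    <link rel="alternate" hreflang="en-gb" href="https://www.example.com/uk/" />
    <link rel="alternate" hreflang="en-us" href="https://www.example.com/us/" />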
So that's something that sometimes happens. What also sometimes happens is that we understand the language or country connection based on other factors. That could be things like the URL, in this case, or things like the internal linking within the site: if we see the different language links between the pages, or country links, then sometimes we can pick up that connection even without clear hreflang guidance. But obviously, having good hreflang guidance is the best way to increase your chances that we actually get it right.

If a single domain has the top three organic positions for a highly competitive and commercial keyword, is this a mistake? What could be happening here, so that we can set a strategy for the site in position four or five?

In general, that's not necessarily a mistake. We do sometimes show the same site multiple times in the same search results page; that can be completely normal. It depends quite a bit on the queries and the sites involved, but it's not that we would look at that and say: this is wrong, the site should only appear once in every search results page. It can definitely happen that we show one site multiple times.

Sorry, I hope you don't mind me asking this, but I'm saving questions up. I see that a lot on weaker SERPs; it's less obvious when it's competitive terms. A lot of the location-based stuff, I'm seeing that quite a bit, where actually there are pages that are not quite the same, and a lot of the locations, say for instance in the UK, may have the same town name but be in different counties or different states. So I suppose it's query diversity, or result diversity, on the same site, often, isn't it? But it seems to be the weaker SERPs where that's more common, I mean, from what I'm seeing, yeah.

I don't know where it would be more common. It can happen in all kinds of places. I've seen it for some very competitive terms, but also for a lot of weaker terms, where maybe there's not that much competition, but one site has a lot of content around that term. So at least from my point of view, we wouldn't see that as a bug. If the search results are bad because of it, then I would still submit feedback, but it's not that the number of times a page is shown in the search results is, by itself, a criterion that something is broken.

John, can I jump in with a question about the Knowledge Graph? Sure. One of the things, when we're working in, let's say, English-language markets like Australia, for example, that are smaller, not necessarily the United States: there seems to be a lot of US-related content ranking in the local markets, which is quite in contrast to what's in the search results page, which are usually Australian sites in that case. Is this something that is a product improvement, or is something wrong here that needs to be improved? Is this something that you guys are working on? How do we understand this discrepancy?

So is that in the Knowledge Graph on the side that you're seeing, like the FAQs, the ones that start with, say, five entries and kind of fold out, do you know what I mean?

No, they're in the main search results. OK, frequently asked questions kind of related to the search.

I don't know. I think that would be really useful to have as feedback, because a lot of times these product teams are based in Mountain View, or somewhere in the US, so having feedback
from places that are not in the US is really useful sometimes, for them to figure out how to actually improve this. So what you could do, if you see this happening, is either submit feedback at the bottom of the search results page, or send me some screenshots, ideally so that I can see the query and everything that you are searching for, so that we can reproduce it and I can pass it on to the product team to help figure it out. That's something where I find that especially non-US content is sometimes tricky for the product teams to figure out, so having examples is really, really useful.

OK, I'll see if I can put some together when I see it, and send it out. Thank you. That would be fantastic.

Why doesn't Googlebot want to come to big websites? I don't know... OK, let's see. We have a site with 500,000 pages and one small site with 500 pages. According to the crawl statistics in Webmaster Tools, we can see that the bot scans the small site very often, but doesn't go to the large site. How can I tell Googlebot to come and visit us?

Essentially, we don't have any particular algorithm that favors small sites with regards to crawling. Usually we try to make sure that we can crawl all sites properly, and that we can recognize when content is new and important on all of these sites. Often that means big sites need to be crawled a lot more, so that we can keep up. So it's definitely not the case that by default we would crawl a big site less; if anything, we would probably crawl it a bit more. However, we don't crawl all sites at the same speed. On the one hand, there are technical limitations with regards to your server's capacity as we perceive it, for example. On the other hand, it's also a matter of the website's overall quality, and the feeling from our algorithms of how worthwhile it actually is to spend a lot of time crawling this website and indexing a lot of content from it. Sometimes what happens is that a small website is just a really fantastic site, where we need to make sure we keep up with everything, and a large website is just a big fluffy site that's filled with a lot of content but not that much really critical, important stuff. I don't want to overgeneralize or assume that your website isn't that great, but that's definitely something that I would look into: on the one hand, the technical foundation, so that we can crawl fast if we want to; on the other hand, making it really clear to search engine crawlers that there's a lot of value to be gained from having all of this content indexed as quickly as possible.

The other thing to keep in mind is that crawling doesn't mean ranking. Just because we crawl something more often doesn't mean it ranks better, and if we crawl something less often, that doesn't mean it will rank worse. If we have all of the content from your large site indexed and not much is changing there, then there's no reason for us to keep hammering your site with Googlebot requests, because we're already showing your site in search, and there's nothing new that we'd be missing out on by not crawling as quickly as possible.

If we remove a video from a page, remove the video markup, and remove the page from the video sitemap, how long should it take for the video box in the search results to disappear?

I don't actually know. I assume, based on things that I've seen with the video team, that this is something that
often takes a little bit longer, in that we need to be really sure that this video is completely removed and no longer included on the page. If the video was hosted on your website, removing the video file itself can also help, because then we know that the video is definitely not on the internet anymore and we can take it out. If you're embedding a YouTube video on your pages, then obviously the YouTube video itself will still be around, so that's probably not so easy. But in general, this is something where we have to reprocess your page to understand this particular change; it's more than just a one-time re-crawling and re-indexing.

Our product detail pages have four content blocks that are collapsible on page load. They're all collapsed except for the first eight lines of the details tab. We did this to give the user a better experience, so that they can pick which section they want to see instead of a big scroll of content. We're concerned that we may be losing out on some SEO value in the collapsed sections, because they're hidden on page load. Is this a problem? Should we do something differently?

So, at least for the moment, while we're indexing the desktop version of the page, if something is hidden by default on page load, then we assume it might not be the primary content of the page. The most common place you see this bubbling up is that we tend not to show this kind of content in the snippet in the search results. So if someone is looking for a piece of text that's hidden in one of these collapsible sections, then we probably wouldn't use that as part of the snippet on the search results page, because we don't want to overpromise for the user. We don't want to say: hey, this text is immediately visible when you click on this page. As we shift towards mobile-first indexing, we realize that's a lot harder, because you can't just make everything visible by default, especially on mobile; you really have to focus on the most important parts. And there, we're probably going to say that we'll treat this content like we would anything else. But it's still something to watch out for from a usability point of view. So what I would do here is not worry so much about the SEO aspect, that you wouldn't be ranking for this content, but think more about the usability aspect. In particular, in Search Console you can look at the queries that are showing your page in the search results, and think about whether those queries are actually based on content that's visible to the user. If a user were to go to those pages after doing that query, would they feel like they found the answer to that query on your pages? Or would they be guided to that expanding section automatically and know: oh, it's probably in here, I just have to click on this and then I'll see the full content? So that's where I would look at it: not so much with regards to SEO, but more with regards to usability.

John, on that design point: we're also facing a similar kind of situation. We're showing a lot of content on certain topics, and it's possible that some section of our content is useful for some users, depending on the queries. When a user is searching for that content, or maybe a set of questions, they might only see our first fold or something like that, so they don't immediately see the content they're looking for. So, like you said, we could
improve the usability with maybe a jump tag or something like that on the page, so users can navigate directly to that particular section. But how can that be done on the mobile site? We have limited space, and if you put jump tags up there, they take up most of the first fold. And then there might be objections from the product team, or maybe some other team, who could say: why are we putting all these things at the top? Users landing on these specific pages don't see the content, they see all these links, which might not be useful. So I'm just wondering how we should go about this.

I would test it. I don't think there is one answer that fits all types of sites, and I would test this mostly from a usability point of view. This is something where you can do A/B testing, where you can put together a small panel of users and show them mocks and see how they react to different variations. But I would test this primarily from a usability point of view, and not with regards to SEO. For SEO, we can probably deal with all of that, so I definitely wouldn't break a page's usability for the sake of SEO. Because in a case like that, if a user does go to your page because it's ranking really well, and they don't convert, then you've lost already, so it doesn't really matter. It's more important that you actually have good pages that work for usability purposes, and then we should be able to figure that out on the SEO side as well.

Thank you.

I realize it's sometimes really tricky to find that balance, though. And sometimes it's nice not to have an absolute answer, too, because then you have more of a chance to be creative and try to do something different from what all of your competitors are doing. I can't make it that easy for you; you have to figure some things out yourself, sorry. Let's see.

Are links mentioned inside the source code, which can't be seen on the page, also counted as links? Some WordPress plugins, for example, have this "website is powered by" or "optimized by" whatever, and it's only visible in the source code, not on the web page.

So that sounds like it would be a hidden link, which wouldn't really be OK with regards to our webmaster guidelines. If it's in the HTML and it's an actual anchor-type link with a URL, then that's something that would be a problem for us. If it's an HTML comment, for example, and it's just a piece of text saying "this plugin is powered by so-and-so", then that's not a link, and that's perfectly fine. The problem is really if it's a link that's passing PageRank to another site and that's not actually visible on the page by default. That would be a hidden link, and that would be something that our web spam algorithms would try to figure out and take action on.

Could you give us some insight into the relevance of semantic HTML5? For example, if you have a site that uses h1 tags for all elements in the navigation, will this send confusing signals to Google about the main topic of the page? Or does the crawler understand that this is part of the nav, as defined by HTML5, and look for the h1 in the main element instead?

So as far as I know, we don't process HTML5 in this kind of semantic way. We look at the individual headings and try to deal with them. It's fine for us to have multiple headings on a page; that's not a problem. So from that
point of view, I wouldn't worry about this. If this is the right way to set it up for your site, then I would just do it like that. But it's also not the case that you have to set up a clean, semantic HTML structure on all of your pages in order to get some kind of SEO boost. Obviously, having clean markup makes it a lot easier to have a page that works well across a number of devices, and makes it a lot easier to implement structured data properly. All of that definitely makes sense, but there is no inherent SEO advantage to having clean, semantic HTML.

"I'm running into problems with the ad review process": I think we looked at that one. Yeah, maybe we can just shift over to more open questions. What else is on your mind? Oh, I see Nicholas dropped something in the chat, which I can't actually access because it's private.

Sorry, John, I will explain now. This is the result from when I search "apple", for example, and when I go to the website and come back to the Google search results, I have a new box with "people also search for". Is this the same as the standard "people also search for" suggestions from Google, or is this something different? Because I see different suggestions than the standard ones.

I don't know. I assume it's slightly different, because we only show it when you go back to search, right?

Yeah, when I go back. So this is new, actually, because I'd never seen it.

I think we've done similar things in the past, but I believe we've started showing this more, or at least I've started seeing more people post about it publicly, so I'm assuming we're showing it more. I think these are all just normal experiments that we always do, to try to figure out how we can recognize what the user is actually looking for and guide them to the right content on some other website. So it's something where, like any other website, we're always experimenting with ways to improve the usability, and really to guide users to the right content, to the right information, so that they can solve whatever problem they have in an efficient way.

Can I just interject there? Because last summer, when I went on that course, the information retrieval summer school, a lot of the teaching by some really big companies there, like Facebook, Google, Bloomberg, et cetera, was similar to this. It was about recommender systems; that's like the big thing now. It's the Amazon thing, same thing really. And it was based around persona modeling rather than individuals. And I've asked this question a few times of you, John, in webmaster hangouts, and you've dodged it, yeah: are they really building models of types of users rather than individual users? Because obviously there's only so much personal data search engines, or anybody, any company, can use without the permission of an individual user. So actually, forget the user, and really focus more on users who are similar to X. So is a lot of this whole "people also search for" persona-based modeling, based on groups of people who are similar?

I don't know. I would assume that we have people doing all kinds of stuff around that area; at least, that's my assumption. Because these are normal things around marketing, around usability, where it sometimes makes sense to have groups of users, personas, and sometimes it makes sense to try to figure out, on an individual basis, what the
preferences there are. What I think is also important is to figure out when it makes sense not to personalize as much, to avoid generating this filter bubble, where when you search, you only find the stuff that you already know, and you don't really expand your horizon, you don't see what else is out there. So that balancing, trying to figure out how much to personalize and when to provide more general information based on the type of stuff that you search for, I think that's a fascinating topic, but I have no idea what the details are of what's happening there.

What I'm saying is, I'm looking at five books on my bookshelf now, from the information retrieval side, called Recommender Systems, and that seems to be what this is: big patterns of, this person bought this, so she might also...

Yeah, I mean, this is a gigantic topic. It's something that shows up on a lot of big websites. I believe, what was it, maybe 10 years ago, Netflix did that big contest, that AI contest where you were supposed to generate recommendations based on what people were watching, or something like that. These are really big problems, because there are a lot of options out there, and personalizing to some extent makes sense: finding out what would ideally be the optimal personalized result, but also figuring out when it makes sense to add more diversity, more other viewpoints, for example.

Yeah, it's just the probability that somebody may also be interested in X, because groups of people who also bought Y are interested in it. It's just like any other website, really, isn't it?

I think that's something where, if you're working on a big website and you have a lot of experience or knowledge around that, then that's really valuable experience to have. It's really useful for a website to try to figure out, because on smaller websites, I think the interlinking is more visible and more something that you can do on a manual basis. But if you have more than a couple thousand products, how do you tell which product is similar to which? How do you find a recommendation for the user based on what they happen to land on the first time they come to your website? That's really hard, and getting that right, I think, can make a gigantic difference in conversions. So yeah, fascinating topic. All right, go for it.

Yes, I had asked this in the written questions, and it sounds like you're going more verbal at this point. You've discussed in years past how the focus should be on getting the user what they want. Would you say that really the way to do that is, if you have a product site, conversions; if it's a news or blog site, whether they're reading your whole article? You can imagine every type of site is going to have a different sort of metric for conversion, but is that really what we should be focusing on? Because, you know, you've said bounce rates are kind of useless.

Yeah, I would focus on that, but purely from a business point of view. I mean, essentially you're putting this website out there for a reason. Sometimes there are websites you just put out there so that the information is available; you publish it because you want everyone to be able to find it. But a lot of times, you have a real reason why you want to do something on the web, and that's
around, I don't know, some monetary value that you get out of people doing something on your website. That could be clicking on ads, that could be buying something, that could be signing up, or getting your address, getting your phone number. All of those are potential things where you could say: this is a goal of mine, something users actually do when they come to my website. What I've also seen is some sites set up kind of micro-conversions. For example, if people go to your web page and they click on your about page, then that might be important if you're a small business, because your about page might have your address, your phone number, your opening hours on it. That kind of shows that they're actually pretty interested: they're not going to buy directly from my website, but maybe they'll come by, maybe they'll remember the address or the phone number and come and visit me. So all of these things can add up, and you can say: well, my primary metric isn't how many sales I make on my website, but how many people sign up for a newsletter, and that in turn sometimes means there's a potential that I can sell them something, or win them over as a client for a service or something else. So all of these small things are useful to track as well.

True, true. And I guess the way we could simplify it is to call it engagement. So if I had an information website and there were ads, OK, people click through on the ads and that's good, but ultimately, if they're not reading through the content, I should be focusing on that, right? Because if they're just looking at the first paragraph and saying, hey, this is not what I needed, that's kind of a bad signal for my site.

Yeah, I mean, I wouldn't make this an SEO goal and say that people need to read my content so that I can rank better, but rather think about it the other way around: if I rank well, people are going to go to my website, and if they go to my website and don't read the content, then why am I even ranking well? What value do I, as a website owner, get out of ranking well in search? I don't get paid for the ranking; I get paid for people actually doing something on my website.

Right. OK, cool.

I think we're a bit over time. It's been fun talking with you all. Lots of good questions, lots of good comments. I'll set up the next batch of hangouts as well. Next week, I don't know, maybe I'll set some up for next week, because the week after next, when we usually would do them again, I'm at SMX Munich. So if you're at SMX Munich, come over and say hi. But I'll probably try to set something up for next week as well with the hangouts, so that we don't get too far off track. All right, thank you, everyone, and wish you all a great weekend.

Thanks, John. Great weekend, all the best. Bye, everyone. John, hey, are you...