 All right, welcome everyone to today's Google Webmaster Central Office Hours Hangouts. My name is John Mueller. I am a Webmaster Trends Analyst here at Google in Switzerland. And part of what we do is talk with webmasters and publishers like the ones here in the Hangout and the ones that submitted a bunch of questions already. As always, I'd like to give those of you who are kind of new to these Hangouts a chance to ask the first questions. Is there anything on your mind? I've got a question. Can you hear me? Sure. I've done a fair bit of SEO before, but I've actually got a job interview coming up with a very, very large news website. And I'm trying to get my head around what the differences might be. Just to give you an idea, they rank very well currently. They produce literally hundreds of pages every single day. One of my thoughts was, how on earth would you, because things are quite topical and you'll probably only update the content on one day, how do you get a measure for how well the day's content is performing relative to the search traffic that's out there, given that you're probably not going to update it beyond that day's news? That's an interesting question. So I think that the tricky part there is that you don't really know what the rest of the search traffic is. Yeah. So essentially what you're probably going to be looking at is general freshness, like how quickly do the news stories show up? Do they get indexed fairly well? Are they ranking appropriately in the search results based on what you think this site should be doing and compared to maybe other news sites that are focusing on the same topics? So that's something where it's sometimes really tricky, because you have to be fast in analyzing the situation. Yeah, I mean, typically I get the impression that you're ranking it almost instantaneously. So I don't know what's the fastest you do it, but given its size, it does seem to rank very well. What I can see is part of their problem is given they're trying to gain more from SEO, I think a lot of their stuff is ranking, but I guess they don't know the stuff that isn't ranking on a particular day. And presumably, you're not going to rank everything, even though they're a very well-known domain, you're not going to rank everything they produce on that one day, are you? It depends. I mean, this is something where I think a lot of technical SEO comes in, a lot of understanding the technical tools also plays a role in that you want to make sure that the website is crawlable really well, that the new content can be picked up as quickly as possible. And maybe do something like log file analysis to determine when was this content put out there, when did Googlebot actually look at it, like what's the latency for that, and when did users start going to that page from SEO? And that's something where you can probably pull in a bunch of metrics from those various sources. Can you do that in analytics, or are you looking at other tools? You can't look at the crawling in analytics, as far as I know, because Googlebot tends not to execute the analytics JavaScript. You can look at when users are coming. So it's kind of like jumping over that initial crawling stage. But I think they'll be getting users, though, from social media. They'll be getting users, they have their own apps as well. They have other major home pages, which will be redirecting traffic to it. So about 20% of that traffic will come from SEO. So the users will start coming instantaneously almost. 
The second they send out a tweet, they'll start getting hundreds of users. It's how much of it is coming from SEO. And is it ranking well enough compared to the, or do we need to go back and look at it immediately within the hour? Yeah, I think if something isn't ranking well when it comes to news, you can't fix it for that article. You have to kind of jump forward and say, OK, I'll fix it for tomorrow or for next week. Because if something is really a hot news topic, you can't tweak the SEO of that article while it's a hot topic. So that's something where I would focus more on the log files themselves, not specifically analytics, and really look at what referrers are actually specified there. So you can see if it comes from Google or if it comes from Twitter and other social media channels, and try to create some metrics to figure out: what is the latency for an average article to be crawled? What is the latency from the crawl to actually getting traffic to it? And then that's a metric that you can work on kind of optimizing. Just picking up on a point you said earlier then. So because I was going through the idea that perhaps you could change the content on the same day to get an improved ranking, is that probably not possible? You can change the content, but usually that's something that I'd say is harder to do. And usually when it comes to news articles, it's usually less an issue that you're not targeting the right keywords or something. It's more of a technical issue, that maybe your template is set up in kind of a bad way, or we can't crawl clean URLs, we always find these session IDs and crazy stuff in the URL patterns. And those are things that you can't really fix in one day. Obviously, if it's a news article and it's still a developing story, then updating that is probably a good idea. Some competitors' sites appear to be testing five or so rotating headlines and seeing which headline gets most traffic. Is that OK from Google's perspective? Are there any sort of ground rules for what you should or shouldn't do when you're rotating the headline literally through five different variations? From an organic search point of view, that's totally fine. I think that's really hard to track, because you never really know which version of the headline is currently actually the one that's shown to users in the search results. That's what I was wondering. Yeah, so it's not even possible. It's really hard to track that. I think theoretically there might be ways that you could figure that out, but I don't have anything offhand. The other aspect is I don't know how Google News would deal with that. If there's policies around Google News that say, oh, you shouldn't be doing this or you shouldn't be putting out multiple URLs for the same story, I don't know what the policies there would be. Would it be a case of you having to put out multiple URLs? They'd actually have to be different URLs for the same story, would they? I don't know what would be involved there from the Google News point of view. I've definitely seen some sites like The Verge. I noticed they are rotating headlines and they seem to rotate them over the course of the day. And even as the same user, you'll see different versions of the same headline coming up during the course of the day. And they seem to be narrowing down on which is the one that's getting the most traffic. But like you say, how do they know which one is the one that the users are actually seeing, that you've indexed and is the one that you're showing?
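To make the log-file idea above concrete, here is a minimal sketch of that latency analysis. It assumes combined-format access logs, that article URLs live under /news/, and that a Google referrer marks the first organic visit; all of those details are illustrative assumptions, not anything stated in the Hangout.

```python
# Hypothetical sketch: for each article URL, find the first Googlebot crawl
# and the first visit referred by Google, then report the gap between them.
import re
from datetime import datetime

# Matches common "combined" log format lines; adjust to your server's format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "GET (?P<path>\S+)[^"]*" '
    r'\d+ \S+ "(?P<referrer>[^"]*)" "(?P<ua>[^"]*)"'
)

def first_events(logfile):
    first_crawl, first_visit = {}, {}
    with open(logfile) as f:
        for line in f:
            m = LOG_LINE.match(line)
            if not m or not m.group("path").startswith("/news/"):
                continue
            # Timestamps look like 10/Mar/2017:09:12:03 +0000; drop the offset.
            ts = datetime.strptime(m.group("ts").split()[0],
                                   "%d/%b/%Y:%H:%M:%S")
            if "Googlebot" in m.group("ua"):
                first_crawl.setdefault(m.group("path"), ts)
            elif "google." in m.group("referrer"):
                first_visit.setdefault(m.group("path"), ts)
    return first_crawl, first_visit

crawled, visited = first_events("access.log")
for path, crawl_ts in sorted(crawled.items()):
    if path in visited:
        print(f"{path}: crawl to first Google referral took "
              f"{visited[path] - crawl_ts}")
```

Averaged over a day's articles, those two gaps are exactly the kind of metric John suggests working to optimize.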
On the rotating headlines, there's also the aspect that Google's algorithms tweak the headlines in the search results as well. So that makes it really hard. I think this is really kind of more into the advanced area. And it's something where you really have to make sure that the technical foundation of the site, so that you can actually measure these differences, is really down pat. And that's, I think, the hard part for a lot of websites. OK, thank you. Sure. All right. Enrico, you have an interesting question about hreflang. Can you elaborate a bit more what you're looking for there? Yes. Hi, John. I was talking with a colleague. And we were talking about hreflangs and canonicals. And there is a rule that Google applies. And it's a bit tricky for me to understand. And the rule is that in a non-canonical URL, the hreflang should point to a canonical URL. And this rule is a bit tricky. And it causes a bit of trouble sometimes. And I'll tell you why. If I'm visiting a website, and I'm visiting a web page about a list of products, let's say a category page already filtered in some way, already sorted in some way. So the URL contains parameters that sort and filter the list of products with some criterion. And if I have a link to a corresponding other web page in another language, but for the same page, and I click that link, that link will take me to the corresponding list of products, sorted in the same way, filtered in the same way, but in another language. So I am in an English page. So I click that link that points to the Spanish version. And I get a Spanish version of the same list of products. But those parameters in the URL make that page, those URLs, non-canonical. So what you are telling me with your rule is that in the English page, I should use as hreflang a canonical URL. But in the same page, the HTML link on the page should point to the non-canonical Spanish version of the page. So there is a bit of discrepancy here. Because in the HTML code, I'm saying to the user, OK, go to the corresponding non-canonical URL, a Spanish URL with all the parameters. But in the same HTML code, you are telling me that the hreflang shouldn't point to that corresponding URL, that it should point to a canonical Spanish URL. And so this is a bit tricky, because, for example, Yandex doesn't have this rule. So I can't please, for example, both Yandex and you. Yandex tells us just to put in the hreflang URLs the corresponding URLs, regardless if they are canonical or not. Why wouldn't the Spanish version be canonical? Well, the Spanish version keeps the parameters, because the CMS needs to produce the same list of products. So the CMS needs to get exactly the same parameters. So I think that you're not doing a canonical across the languages, right? You're not saying the Spanish version has a canonical as the English version. No, no, I'm not doing that. I could tell this non-canonical English page has a corresponding canonical English page and the non-canonical Spanish page has a canonical Spanish page. I'm not doing anything fancy with a cross canonical and so on. But I think it's an inconsistency that you ask me to use the hreflang to point to a canonical URL, because language and canonicalization are two very different concepts. So the hreflang should point just to the corresponding URL, regardless if the corresponding URL is canonical or not. Kind of the technical reason we have that like that is that if a URL is not canonical, we tend not to crawl it that much.
So if we don't crawl it that much, we can't confirm the hreflang link between those two pages. So that's kind of the background thought behind that, in that the hreflang link should be between the different canonicals, so the canonical English page, the canonical Spanish page, because those are the versions that we will crawl and index, and those are the versions that we can connect with the hreflang. If it's linking to a Spanish page that's not canonical, then it's likely that we probably won't even crawl that page that often to even recognize that there's this connection between those pages. Yeah, and I see some of why you put this rule in. But there is a practical issue here, because other search engines like Yandex don't apply it. And so I can't please both Google and Yandex, for example. And if you can link, if you link to the canonical versions, then that would work for both, too, right? Well, Yandex just asks me to use in the hreflang URLs the corresponding URL, not the canonical corresponding URL. So Yandex asks me for a different thing. But you can specify the canonical still there, right? I can still specify the canonical, but they do not mix the canonical and hreflang. While you are mixing the two concepts, when you ask me to have the hreflang point not to the real corresponding URL, the URL that keeps the parameters, but to use in the hreflang a canonical URL. That's, from a semantic point of view, not 100% correct. And mixing these two concepts can cause trouble. For example, I kept talking about Yandex because I'm working on a website that has a Russian version. And here, we have two different search engines that have two different rules about how to use hreflangs. And I don't know what to do. Yeah, I guess that's a tricky situation. But that's something where we worked to put the RFC out for the rel canonical. So that's something where, if you're seeing others interpret it in different ways, then I would point them at the RFC. And that's essentially the canonical documentation, if you will. Yeah, but we are not talking about the canonical documentation. We are talking about the hreflang documentation. Yes. And the hreflang RFC covers that as well? Yes. Really? I didn't know that. Well, well. OK, thanks. And the documentation explains how to mix rel canonicals and hreflangs. I don't know. I haven't looked at that in a long time. But I would look at that. OK, thank you. Thank you very much. Sure. All right. Let me run through some of the other questions here. What can be the reasons why a page is out of the index, even though the page is crawlable and indexable? And there are no obvious reasons for a penalty on page or off site. That's sometimes tricky, because there are lots of different reasons why a page might not be indexed. It starts with essentially the general understanding that we tend not to index everything that we find on the internet. So just because it's possible to index something doesn't mean that it will be indexed. And that's partially for technical reasons, partially also for quality reasons, that we might say we don't have enough reason to actually index, I don't know, let's say a million pages from this website that we just discovered yesterday. So these are essentially a lot of different things that could be in play here. One thing you can do is use the Search Console fetch and render feature to submit that page to our index.
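As an illustration of the rule John described for Enrico's case: on the filtered, non-canonical English listing page, the head could carry something like the following (URLs are made up; the point is that rel=canonical strips the parameters, while the hreflang annotations connect the clean, canonical versions of the two languages):

```html
<!-- On https://example.com/en/products/?sort=price&colour=red (non-canonical) -->
<link rel="canonical" href="https://example.com/en/products/">

<!-- hreflang connects the canonical version of each language, not the
     parameterized URLs that the visible language-switch link uses -->
<link rel="alternate" hreflang="en" href="https://example.com/en/products/">
<link rel="alternate" hreflang="es" href="https://example.com/es/productos/">
```

The visible link to the Spanish version can still point at the parameterized Spanish URL for users; only the annotations need to stay on the canonicals.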
And if, from a technical point of view, everything is kind of OK with that fetch and render, then you should see that page be indexed within maybe a day or so. So that's kind of a rough way to double check: is there really nothing technical in the way of this page actually being indexed? And if that's OK, then that's something where you essentially just work on the rest of your website to make sure that it's easily crawlable, that we can discover all of these pages, and that it's clear for us that this is really important content that we should be indexing. Ranking number one for a big keyword with 100% visibility but no traffic. Personalized search is turned off. The website is indeed number one, but only 500 visits per day. What could be wrong? So if you're ranking for a keyword and you're not getting a lot of traffic, then usually that's either that maybe not a lot of people are actually searching for this, or perhaps the page that you have, the way that it's presented in the search results, is something that's different from what users are expecting to see. That could be something like the title of the page or description of the page, the content that we pull out there. These are all things that are essentially possible. So it's important that you try to search like your users are searching and try to find a way to double check that people are actually searching for this content. Because if nobody is searching and you're ranking first, then that doesn't really help your website that much. Search Console offers some great information about the links and 404s. Is there any plan to add a list of pages that can be considered low quality as well? That's an interesting idea. I'm not aware of any such plans. So at least from my side, I don't think that's something that the Search Console team is working on at the moment. I don't know if that would be something that we would even add, like from a quality point of view. I do agree that sometimes it would be useful to have a bit more information about what Google's algorithms consider to be quality. But I don't know if a list of low quality pages would be that useful for the average webmaster. But I'll definitely pass that on to the team. And maybe we can find something there in the future. How can I identify which pages on my site Google has judged to be low quality so that I can improve them? OK, that's a second vote for that feature, I guess. That's good. Getting "no return tags" errors with hreflang in sitemaps, how to resolve this? I have a link back from page A to page B and from B to page A. This probably just means that we're still crawling the URLs on your website and haven't picked everything up. So one important kind of tidbit to keep in mind is, when it comes to sitemaps, we take the extra metadata that's supplied in the sitemaps into account when we've recrawled those pages. So if you specify a lot of hreflang links in your sitemap file for individual URLs, or if you have other kind of structured data in the sitemap file for individual URLs, we don't take that into account when we read the sitemap file, but rather when we process that individual URL that is specified in the sitemap file. So in this case, assuming the links are correct between those URLs, it's more a matter of us actually recrawling the rest of your website. And that's something you can help give us some information on as well through the sitemap file, by specifying a last modification date. So the last modification date is used by our systems to determine which URLs we need to crawl next.
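As a hypothetical illustration of both points, hreflang annotations carried in a sitemap plus a truthful per-URL last modification date, a single entry might look like this (page B would carry the mirror-image annotations in its own entry, which is what resolves the "no return tags" complaint once both pages are recrawled):

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/en/page-a/</loc>
    <!-- when this URL last changed, not when the sitemap was generated -->
    <lastmod>2017-03-01</lastmod>
    <xhtml:link rel="alternate" hreflang="en"
                href="https://example.com/en/page-a/"/>
    <xhtml:link rel="alternate" hreflang="de"
                href="https://example.com/de/seite-a/"/>
  </url>
</urlset>
```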
And if you tell us that you modified this page recently, perhaps adding hreflang or perhaps changing something else on that page, then we'll generally try to recrawl that page a little bit faster to catch up and make sure that everything is OK there. So having a correct last modification date helps us there. What doesn't help us so much is if all the pages on your whole website have the same last modification date, because then our systems assume that probably your sitemap file is generated in a way that's incorrect. Maybe it's using the date of when the sitemap file was generated rather than the date of the actual URL. Migrating from HTTP to HTTPS with URL and architecture change: on January 30, we migrated our website from HTTP www to HTTPS non-www, with a change of architecture to a mobile Bootstrap site, maintaining all the content and items on the website. We're experiencing a 30% drop in organic traffic. What could be the problem? So this is something where I'd say it's really hard to say in general without looking at the specific website. So if you want to send me the URL either through Google Plus or on Twitter, feel free to let me know and then I can take a look there. Sometimes it also helps to post in the webmaster help forum, because other people can find issues as well. But the general thing to keep in mind is when you're making URL and architecture changes on a website, you're making some really kind of in-depth changes on the website. And those require that we actually reprocess pretty much all of your website to understand it again. So just flipping from HTTP to HTTPS is something that we can pick up very quickly, where we can say, well, everything just moves one to one to HTTPS. But if you're making URL changes within the website, if you're changing the layout, the template, the internal linking, all of that makes it a lot harder for us to reprocess. So it sounds like you've already started this process and you're kind of stuck in this stage. So this is something where I would tend to kind of let it play out and see where it settles down. And then take a next step to actually make sure that the rest works. Because when you make bigger changes like this, we really need to essentially understand the whole website again. It takes quite a bit of time sometimes. John, would it be better to split that type of migration into two migrations, migrating to HTTPS first and then changing your architecture later? Maybe. Maybe. I mean, changing the architecture is something that's always going to be painful. So it's more, from my point of view, more a matter of when do you want to have that pain. Is that something maybe you can move to a time when you're not so dependent on search traffic, when you tend not to have that many visitors? Then that's probably a good thing to do. And if you can kind of pick the time frame when you're doing it, then you might as well just do it all at once. Because then you kind of have it behind you. So John, if you're moving sites altogether, if you're moving URLs altogether, do you say it's better, other than the domain itself, to keep the URL structure in place, or, since you're 301-ing, does it not matter if the new URLs are the same or not? If we can tell that it's essentially a one-to-one site move, so the URL structure stays exactly the same and you're just changing domains or switching to HTTPS, then that's something that's a lot easier for us to process.
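One rough way to sanity-check the one-to-one property John mentions is to sample old URLs and confirm that each one 301s directly to the same path on the new origin. A sketch, with hypothetical hostnames and paths:

```python
# Minimal migration spot-check: every sampled HTTP URL should answer with a
# single 301 hop to the identical path on the HTTPS origin.
import http.client

OLD_HOST = "www.example.com"          # hypothetical old hostname
NEW_ORIGIN = "https://example.com"    # hypothetical new origin
SAMPLE_PATHS = ["/", "/category/widgets/", "/article-123.html"]

for path in SAMPLE_PATHS:
    conn = http.client.HTTPConnection(OLD_HOST, timeout=10)
    conn.request("HEAD", path)        # http.client does not follow redirects
    resp = conn.getresponse()
    location = resp.getheader("Location", "")
    ok = resp.status == 301 and location == NEW_ORIGIN + path
    print(f"{path}: {resp.status} -> {location} "
          f"{'OK' if ok else 'CHECK MANUALLY'}")
    conn.close()
```

If any path redirects somewhere other than its exact HTTPS twin, the move is no longer one-to-one and will take longer to reprocess.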
Because with a one-to-one move, we can just say, all of this is over here now, and that's fine. Whereas if you change the internal linking, if you change the internal URLs, even if you go from .php to .html, then that essentially means we have to understand the whole internal linking again. And that can take quite a bit of time. OK. All right, thanks. Can buying pop-under traffic hurt your domain? Does the quality of the site, the traffic, where the traffic comes from matter? I don't think from an SEO point of view we would even recognize this, but users might not be that pleased if you're kind of guiding them with a pop-under to your website. So that's something that I would look at more from a marketing point of view rather than an SEO point of view. For a news site, is Google OK with the idea of testing different headlines, titles, for news posts to see which one performs best? I think we looked at this in the beginning. When it comes to organic search, essentially it doesn't matter for us if you're tweaking things like this, but it's really hard to actually track those changes. Because on the one hand, you don't know exactly which title version is currently indexed, so you don't really know which one the user saw. On the other hand, there's also Google that algorithmically rewrites titles as well, trying to match them to the query and to the content that you have on your pages. So that's something that I think from an organic search point of view could be pretty tricky. With regards to Google News, I don't know how perhaps their policies apply to that. So that's something you'd want to look at with the Google News team separately before you start doing these kinds of tweaks. Just a further thought on that. I was just thinking now, is it any different to what you can do within Google Analytics, which is you tell Google Analytics you're A/B testing two different pages? Is it actually any different to that? And then that is the way of telling Google which page is live at the moment. I don't know how the Google Analytics A/B testing works. So it's hard for me to say that. My gut feeling is that the Google Analytics A/B testing is more a matter of what the user sees when they come to your page and not a matter of what would actually be indexed for your page. But I don't know what the specifics are there with the analytics. That was the exact question that was going through my mind, actually. Do you know where I would go to get guidance on that? Is it the Analytics forum? Yeah, yeah. Right, OK, thanks. I'll check there. Yeah, John, I have a question. Sure, go for it. It's regarding hreflang, actually. The thing is, we have a global page which is not, like, .in, India-specific. But the thing is, we have given hreflang from all the languages to the global page. The page will be like the domain slash whatever you are. But we also have state-specific, language-specific pages where we have given hreflang to the main global page. There are two pages which are the same language, English only. Suppose the English has, say, three pages. Is it mandatory to give a canonical to the main page from these other two pages? For the canonical, we'd want to make sure that the pages are really equivalent, that they have the same content. So if these are different pages with different content, then the canonical tag wouldn't apply there. Actually, it depends. Sometimes they may change or they may not change. Because it is English. So English is the main language which we prefer. Like, a few pages may not be changed.
And a few countries, they won't prefer to change, actually. So they left it as it is after translation, even after translation. So does the hreflang help during that time? Or do we need to give a canonical? You mean the hreflang link? Yeah. OK. Yeah. So if these are targeting different countries, but they're all in English, then the hreflang tag is perfect for that. Yeah. During that time, the canonical is not needed, you mean to say? I think it would probably help to look at the specific setup that you're trying to do. I don't understand completely exactly what you're trying to achieve there. So what I would do is maybe post in the Webmaster Help forum with kind of an example setup of a bunch of URLs and how you would like to link them together. OK, that's fine. Thank you. And one more thing, on Schema. I want to introduce Schema tags in Drupal. Do you have any idea of that, how to build Schema into Drupal? I don't know. I assume there are plugins, but I don't know Drupal at all. So OK, that's fine. And one more thing is, we have set a change frequency in the sitemap saying that we update never. But we do update now and then, with a frequency of every month or twice in a month, even though we have set the status that the frequency is never. If you put the status as never, does Google still consider crawling the page, or how does it consume that? So we use sitemaps to kind of add on to our existing crawl. So if we're already crawling those pages and in a sitemap file you're saying this page hasn't changed, we won't stop crawling that page. So it's not that we would crawl less because of your sitemap file. We would just try to crawl better based on the date. We ignore the change frequency at the moment in the sitemap file. So that's not something that you need to specify there. It's also not something where you would have problems if you specified it wrong. We focus more on the actual last modification date in the sitemap file. And how about priority? We also ignore priority. Thank you. Makes it a little bit easier, so you don't have to focus on those extra fields. But it's something where we've noticed over the years that the additional value of those fields is minimal. So we decided at some point to say, well, we're not actually using these at the moment. It's something where I believe the custom search engine at some time used priority to understand things a little bit better. But for organic search, we don't use all that. All right. For a well-ranked sports news website that produces 100 news articles a day, what are the best tools for filtering down to identify the news items that are under- or overperforming relative to the new trending search traffic? I think we talked about this a bit in the beginning. So I'll just skip over this. How does Google consider the indexing rate of newly published posts for any blog? Does manual URL submission or Fetch as Google affect the ranking of a page by any means? Or should it be left to Google to index generally? So in general, we pick up these pages fairly quickly on our own. If this is a new website, then obviously it takes a bit of time to understand where we should be ranking these pages in the search results. But apart from that, essentially, once we have it indexed, it's indexed. And we will try to rank it in the normal way. So it doesn't really matter which way you submit those pages. Once they're indexed, they're indexed. Let me see. With the coming mobile-first index, do you foresee increased complexity for sites using dynamic serving?
Could serving a responsive site and dynamic serving for mobile user agents be problematic, either now or once the mobile-first index comes? So I don't see having a responsive and a dynamic serving website as problematic in the sense that you would break something by having this setup. But by having a setup like this, it's a lot harder for you to actually debug what is happening. So using any kind of SEO tools to crawl your website, any tools to determine what structured data you have on your website, it's a lot harder if you dynamically change the content depending on the user agent that's actually accessing it. So that's something that I wouldn't say is a problem, per se, in that you will get penalized or demoted in search. But making sure that everything is working smoothly is definitely harder. Tell me, John, if Googlebot comes along and they pick up first the responsive site, and then they see the Vary header, and then they re-crawl with Googlebot mobile and see a different site, which site do you think might be picked up as being the mobile site in search results? If you have the Vary header and you point to a mobile site, then we'll trust that. OK. So that's for us a pretty clear sign that you're actually saying, well, this is really the mobile version. No, sorry, hang on. Just one second, a follow-up question. So in this case, it's dynamically served. So the responsive site is served on the same URL as this alternative mobile site. There's no pointer other than the Vary header. So it's just a matter that you'll see it with a different user agent, is it? Yeah, yeah. So at the moment, we crawl most of the pages with the desktop Googlebot. And then with the mobile-first index, what will happen is we'll crawl most of the pages with the smartphone user agent. So in that case, we would probably only see the dynamic serving mobile version of the page, because that's essentially what you show us. OK, OK, cool. In Search Console, I verified my domain with dub dub dub and without. Now I'd like to choose a preferred domain version. Every time I select the preferred domain, I'm told to first verify both domain versions. I already did this. What's up with that? So that sounds like something weird is happening. In general, with the preferred domain setting, that's essentially what we're trying to do there. You have both the dub dub dub and non dub dub dub versions verified, and you want to pick one of those. So that's what we're trying to do there. What might be happening is that perhaps you have one of these with HTTPS and the other one with HTTP. That could be a reason why this might not work. The other thing to keep in mind is that this preferred domain setting is essentially just a confirmation of the other factors that we've seen on the site. So we use a bunch of different factors to determine which one is the canonical version, so which one is essentially the preferred version. That includes things like redirects, like rel canonicals on the pages, internal links, external links to some extent, the URLs that you have in your sitemap file. And a lot of these are factors that you can control on your side. So if you tell us with as many of these signals as possible that dub dub dub is the version that you actually do want to have indexed, you have a redirect set up, you have a rel canonical set up for the pages, then we'll just use that anyway. So that's something where the preferred domain setting isn't critical. It's not something that I'd say you absolutely need to do.
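Going back to the dynamic-serving exchange for a moment, here is a minimal sketch of that setup server-side. Flask is used purely for illustration; the essential details are that one URL returns different HTML depending on the user agent and declares Vary: User-Agent on the response:

```python
# Illustrative dynamic serving: same URL, two HTML variants by user agent.
from flask import Flask, request

app = Flask(__name__)

DESKTOP_HTML = "<html><body><h1>Full desktop article</h1></body></html>"
MOBILE_HTML = "<html><body><h1>Mobile article</h1></body></html>"

@app.route("/article")
def article():
    ua = request.headers.get("User-Agent", "")
    body = MOBILE_HTML if ("Mobile" in ua or "Android" in ua) else DESKTOP_HTML
    # Vary tells caches (and hints to crawlers) that the response
    # depends on the requesting user agent.
    return body, 200, {"Vary": "User-Agent"}

if __name__ == "__main__":
    app.run()
```

Under a mobile-first crawl, the smartphone Googlebot would simply receive MOBILE_HTML here, which is John's point above.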
And if you can't make the preferred domain setting work in Search Console, then I would just leave it at that and focus on the other factors there. With regards to Search Console, my guess is it's something like HTTP, HTTPS. And if you can send me the example URLs, then I can double check with the team to make it a little bit clearer in the UI there. But from a practical point of view, this is something that you can control outside of Search Console just as well. Hi, John. Could I ask you a question? Sure. Go for it. It's in the list as well. But so, working on a rather large site that has a lot of user-generated content, this is all for Belgium. And there are two domains, one for the French speakers and one for the Dutch speakers. But from a user perspective, it makes a lot of sense, due to the user-generated content that happens on the French site and happens on the Dutch site, to also cross these over and have both French and Dutch on the French website and both French and Dutch on the Dutch website. So these are two separate domains. And they're completely translated. And we do have the alternate hreflangs in place for all of the architectural links, but also all of the user-generated links. So if you post something Dutch on the Dutch site, it gets copied over to the French site. And from the Dutch site, we have an alternate hreflang pointing towards the French site. Ever since this implementation, which from the user perspective made a lot of sense, there's been a decline in visibility as well as traffic coming into the website. So the fear here is that Google might see this as a problem. And I was wondering what your response is to this. It's hard to say. But in general, the hreflang doesn't affect ranking. So it affects which URL we show when we do show the page in search results. So just by implementing hreflang, essentially, you're telling us which one of these pages should be shown in the search results. It's not that we would demote that page or demote the other page or anything like that. We would just try to pick the most appropriate URL to show. So overall, you shouldn't be seeing a change in traffic there, apart from users actually getting to the version that's more likely the one that they wanted. So that's kind of the starting point. What might be happening, if you at the same time kind of set up the structure together with the hreflang, in the sense that suddenly you have these two language versions of the user-generated content, is that we just have to crawl a lot more URLs on your website. And we have to understand how we should be ranking these URLs separately. And we might be running into a situation where you're essentially kind of diluting the value of your website across too many URLs, where we don't know how important these URLs are individually, because there are just so many of these suddenly. So if you've kind of gone from the one language version to multiple language versions, then that change from one to many is something you might see reflected in the search results with regards to the rankings. If you already had the many-URL setup and you've just added hreflang, then that shouldn't be changing anything there. So in this example, we did have the hreflang set up. But what we didn't have was the transfer of the French content to the Dutch website and the Dutch content to the French website. So, but that does make sense from the dilution point of view, that possibly that's what happened here, since now there are a lot more pages on both of these websites, hundreds of thousands more.
And then there's an interesting thing to add to that. So we are pointing, from an hreflang perspective, saying that there is a French version of this. So if it is a Dutch ad, what you will technically end up with is a French website in terms of structure, but all of the user-generated content is in Dutch. And the same works the other way around, which doesn't matter in Belgium, since the users speak both Dutch and French. And again, from surveys and questioning, this is what they seem to prefer. But would that matter to Google? That we're saying this is a French page, but now the content is mixed. And it is definitely, let's say, 50% Dutch and 50% French. What would probably happen is we would show the translate link in the search results, just to make it possible for the user to see that page in that language. But otherwise, that's fine. I mean, that's something that happens like that. For example, Google Groups does the same thing. We have kind of the UI in different languages. And the user-generated content is however people post it. That's something where, from crawling and indexing, obviously, this adds a lot of complexity, because we have all of these different language versions, kind of the multiplied versions. Otherwise, that's something that we can generally live with. OK. This helps a lot. Thank you so much. Sure. Sorry, John, can I ask something? Sure. What negative things could happen if I do not put a canonical URL in the hreflang? That we just ignore the hreflang. OK, nothing else bad, right? Nothing else bad. OK, in the meanwhile, I read the RFC, and it does not handle that case. So basically, it says that I can put whatever I want. Basically. So OK, thanks. Sure. All right, here's an interesting question. How can I find the guy who banned my website and took months of work? So I'm assuming this is with regards to a web spam manual action. And in general, the web spam team has clear guidelines on what can be taken manual action on and what we won't take manual action on. So that's something where I would try to go through the webmaster guidelines and double check to make sure that your website is really doing things in a way that works well for our guidelines. Sometimes it's obviously not that easy. Sometimes there are things that are a bit out of your control as well. For example, one case we've recently seen is when advertising on a website tends to be malicious, in the sense that maybe they're cloaking to either Google, or maybe they're cloaking to users and sending users to other pages instead of the page that the user visited. Then that's something where you might want to talk with your ad network and make sure that the advertising that they're serving is actually high quality and useful as well. I understand the frustration, that when the web spam team takes manual action it can be a pain. But essentially we work to make sure that the quality of our search results is as high as possible. And sometimes that means you do need to do a bit of extra work to make all of that work. There is a question here with regards to a big collection of hreflang links that I think ends up with: the US page is ranking higher than the UK page in Google UK. It has hreflang en and hreflang x-default. How do these two tags work together? So essentially what we try to do, when we have a correct cluster of hreflang URLs available, is swap out the URLs accordingly based on the best-fitting match for the user.
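The cluster John walks through next can be written out like this (hypothetical URLs; note that the general English page can double as the x-default, since one URL may carry several hreflang annotations):

```html
<link rel="alternate" hreflang="en-US" href="https://example.com/us/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/uk/">
<link rel="alternate" hreflang="en" href="https://example.com/en/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en/">
```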
So if you have one page for English in the US, one page for English in the UK, and one page for just general English, and a user in the UK is searching, then we'll try to show the UK English page in the UK. Whereas if a user in maybe Switzerland is searching in English and you don't have a version for Switzerland English, then we would show the more general English page. And if a user in Switzerland were searching in French, perhaps, then if you don't have a French page, we would take the x-default page. And you can combine these tags for the same URL. You can say this is the URL for UK English, and general English, and the x-default. So that's something that you can combine like that. Is structured data markup still important for the ranking of my website? No, not necessarily. So we use structured data markup primarily to understand the content of the page, with regards to things like rich snippets. And rich snippets are essentially just a kind of richer display in the search results. They don't affect the ranking of your pages. So that's something where just adding structured data markup won't change the rankings of your page. Obviously, in the bigger picture, sometimes having some things marked up on the page makes it easier for us to extract what this page's topic is about, and that makes it easier to show it in the right search results. But it won't change your rankings there. John, just a quick related question on that. We had a structured data count of about 5,500 at the beginning of January. Sort of near the end of January, it suddenly dropped to 2,500. Zero errors on anything. Might that be related to the update in Search Console? Because there was an update in Search Console around that time. Or is this just Google deciding to not recognize the structured data anymore? I don't know if anyone else had a similar problem. I don't know which particular update was happening around then. It says in Search Console that around January 22nd, there was an infrastructure update that might affect data. But that seemed quite a large amount of drop to be due to an infrastructure update. I just wondered if you knew about it, that was all. I don't know, offhand. I know we've made some infrastructure changes there. And at some point, we decided to try to show the more relevant pages in the structured data kind of aggregation report. So that might be something that happened there. But I have no idea what the timing was there. I believe it was more towards the end of last year than in January. OK, I might pop an email on it, then, if that's all right. Sure. Sure. Is there any risk to having a "read more" or page-two link from an AMP page to a full article or next page on a responsive website? Would Google see this as a doorway tactic? No. I think that's perfectly fine. The important part here is really that the AMP page is equivalent to the normal page on your website. So if the normal page has the full content and the AMP page just has like a snippet on top and the read more link goes to the full page, then that would be seen as problematic for us. Whereas if both of these pages have the same block of text on them and the read more link goes to yet another page on your website, then that's perfectly fine. Any chance that color themes will be added to Google Calendar? I don't know anything with regards to changes planned for Google Calendar. So I can't really help you there. I believe there's a help forum for Google Calendar. So I would double check there.
Maybe they have some ideas, or maybe you can leave your feedback there. Is there any specific limit for keyword density in the content? No, not really. So we expect content to be written naturally. So focusing on keyword density is probably not a good use of your time. Focusing too much on keyword density makes it look like your content is really unnatural and makes it hard for users to read. And search engines generally recognize that fairly quickly, and they say, oh, this guy is just trying to keyword stuff their pages, and therefore we will ignore this keyword completely on this website. So instead of focusing on a specific keyword density, I would just make sure that your content is easy to read. One really kind of simple trick that I've seen people recommend is to read your content out loud to someone on the phone. And if you can do that without cringing, without breaking down laughing, then probably you have some naturally written content that will work for search as well. Whereas if you can't read this out loud over the phone without, like, wondering, oh gosh, this is terrible, then probably it's worthwhile to kind of rewrite that in a way that actually works for users. Just a follow-up on that, though. If you're aware that there are some related keywords that you're not using, and maybe you didn't put them in the first pass, presumably there would be an advantage, though, to putting them in a rewrite, wouldn't there? Them being related, let's say something had two different names, for instance. For the most part, we figure that out. So we know about a ton of synonyms. And if you search for one version, we can show you the other version. But OK, but. So that's not something where you need to do the typical SEO light bulbs and lights and light bulb and all of these different variations. That's not something you need to do. The important part is really that you just talk about the topic that you want to talk about in a clear way. So if everyone is searching for lights and all you're talking about is darkness, for example, on your website, because you're writing about the situation without lights, then obviously, people aren't going to find your page for that. Because we don't understand that people searching for lights are actually supposed to go to this page that's talking about darkness, for example. So that's kind of the situation where you should at least talk about the topic that you want people to find your website for. And this is something that seems fairly straightforward, but I see this quite a lot, especially on small business websites, where you'll have fancy pictures of cakes and everything, and you'll talk about how awesome it is to kind of go outside and see the flowers. And everything is about cakes and selling cakes in the photos, but it doesn't talk about buy your cake here. Cake and cakes, yeah. Yeah, so being direct about what it is that you're really trying to offer to people definitely makes sense. But just dropping in all of the different keyword variations doesn't really add any value. What about names? Because one scenario that I've come across is where people have sort of nicknames. So for instance, there's a footballer here in the UK called Wayne Rooney. And he has various nicknames. Have you got those nicknames in already? You can try it out. That's fairly easy to try out in Search. For the most part, I'm pretty sure that if it's a common nickname for an existing name or for a specific person, then probably we've figured that out. That's a good point.
If you just put it into Search and it's coming up anyway, you've got it then. Exactly. Right, thank you, yeah. I am interested in your point of view on the use of AngularJS to deliver just the boilerplate and let the client render the site. You can definitely do that. So if fetch and render in Search Console shows your content normally, then whether or not you're using Angular or React or whatever framework you want essentially doesn't play a big role there. When you're looking at AngularJS sites, I think the more important part that I see people struggle with is that it's a bit harder to actually debug this on your side. Because you can't just use curl to fetch the page and see the full content and kind of double-check that the structured data is there, or whatever it is you're trying to do. You really need to render those pages on your side as well, and then you can look at that. Some of the SEO tools have moved on to allowing rendering as well. I believe Screaming Frog does that. I think Botify does that. Some of the other tools tend to go in that direction as well. So if you're using one of those tools that allows rendering, then I see absolutely no reason why not to go down this route, especially since it can improve the user interface quite a bit and can make the development quite a bit easier. Can Google recognize two different languages on the same website and provide those languages to their respective target groups? Yes. So we try to recognize a primary language per page. So if you have different pages in different languages on a website, that's absolutely fine for us. If you have different languages on the same page, then we will try to recognize the primary language and understand the other languages on that page too. But it's a bit harder for us to say, well, this is really the page for someone searching in French, because we know the page actually has French and English on it. And sometimes this is something that you can't really avoid. So for example, if you have information about vacation homes in Spain and you write about this in English, then all of the addresses and the places you talk about will be Spanish names. So we will automatically recognize there's a lot of English on here, but there's also some Spanish here. And for us, that's something we can generally deal with. But it makes it a bit harder if someone is searching in Spanish for us to say, well, there's a lot of English on here and a bit of Spanish, but maybe a Spanish user would be interested in this as well. It's a lot harder to draw the line there. How to track button clicks in analytics? I don't actually know how to track that in analytics. In search, if you link to something like a tel: and then the number, that's not something that we would track in organic search. John, could I have a quick follow-up also in relation to your previous answer? Sure. So would you then not say that, in the example with the Dutch and the French website in this case, even though you might be showing the mixed content on both platforms, from Google's perspective, they are less likely to show it, since they're not 100% sure which part of the content is French and which part is Dutch, versus something that's pure Dutch and versus something that's pure French? Yeah, that's always kind of tricky if you're swapping out the boilerplate. With the hreflang, that gives us a bit more understanding of what it is that you're actually doing there. So that's, I think, the right approach to take in a case like this.
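On the earlier AngularJS point: since curl only returns the unrendered boilerplate, checking a client-rendered page means actually executing it. A hedged sketch using Playwright, which is just one of several headless-browser options and not a tool mentioned in the Hangout (the URL and checks are hypothetical):

```python
# Render a client-side page headlessly and inspect the resulting DOM,
# roughly what the rendering-capable SEO crawlers mentioned above do.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/")
    page.wait_for_load_state("networkidle")  # let the framework finish rendering
    html = page.content()                    # post-rendering DOM, not raw source
    print("title:", page.title())
    print("has JSON-LD:", "application/ld+json" in html)
    browser.close()
```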
But that language mix is always a bit tricky. Like, in your situation, you know that users in that country will be OK with both language versions; that's definitely not the case in other countries. So that's sometimes a bit tricky. Would it make sense to, because in this case, the Dutch version, which is the first created, is canonicalizing itself. And then it's creating this copy on the French site. And in this case, the French site is also canonicalizing itself. So there's an alternate tag pointing from Dutch to French and French to Dutch. Would it make sense to have the unique version, the one that was created first, be the main canonical, with a cross-domain canonical from the French site pointing back to the Dutch site? In that case, we would kind of lose track of the non-canonical version. So in that case, you'd be saying, well, this is the one that you want to have indexed. And the other one, we'd kind of lose track of. What I would do in a case like this is test that out. And maybe take one section of the site and say, this receives sufficient traffic and is, from a traffic point of view, equivalent to another section of the site. And just try these different tactics out and see: does it work well with the canonical? Does it work well picking one version as a canonical? Which setup actually works best? There's a couple of tests I have in mind. And you say the exact same thing, so that just confirms it. Again, I appreciate it. OK, great. Let me try to run through some of the other questions that were submitted. And then we can try to see if we still have a few minutes left. So I still have this room for a bit longer. Maybe we can run through them. I'm wondering if Google is planning to incorporate an open source way of acquiring HTTPS certificates. So I guess this is similar to the Let's Encrypt setup for certificates. I'm not aware of any plans to provide certificates, especially since Let's Encrypt does this really well. I'm wondering if there's a specific type of HTTPS certificate that would be sufficient for the average website. From Google's point of view, when it comes to search, we just require that it's a certificate that works well in modern browsers. So anything you want to do past kind of this baseline of a certificate that works well in modern browsers is totally up to you. I know some e-commerce sites like that extended validation certificate, kind of that green thing in the beginning of the URL. From Google's point of view, when it comes to search, that's definitely not required. But maybe you want to do that, maybe you don't. On our home page, we're using a 302 redirect to the male/female category pages, which is based on the last seen category stored in a cookie. We assume Google doesn't use cookies. Are we safe? Yes, we don't use cookies. So when we crawl, in general, Googlebot doesn't store and kind of serve cookies again, because we try to crawl in a stateless way, where we look at each URL independently. There's one situation where we sometimes use cookies, and that's when we can't look at the content at all without kind of returning a cookie that we've seen before. And that's really, really rare, and that's definitely not something I would recommend doing. It's more a matter of Googlebot trying to work around an essentially broken setup on a website. HTTPS migration: would it make sense to 301 the old HTTP version to the HTTPS version and then to migrate to www HTTPS? I'm not completely sure what you mean there.
So that's something where I'd double-check maybe in the help forums. But in general, the guideline when making site migrations, from my point of view, is to keep it as simple as possible, because every additional step that you add in there adds additional complexity, which might break at some point. So I try to keep it as simple as possible, as direct as possible, when it comes to those redirects. Let's see. I think. Oh, here we go. I have a site that was starting to rank. We put in e-commerce and converted everything to HTTPS. All the pages are properly 301 redirected. I've seen a huge drop in rank and a bunch of lost key phrases. It's been a few months now. Is there any specific issue with this type of site? Not that I'm aware of, in that we would say we treat a type of site any differently than we would websites overall. So that's something where what I would tend to do is post in the Webmaster Help forums and see what other people are saying about this. Maybe there is something specific, kind of a technical thing, that you can do to improve things. Maybe there are more quality issues that you can work on to improve things. That's really kind of hard to say there. My company produces a fair amount of video data for our clients. I understand Google ranks sites for the volume of new data it publishes and uploads to its sites. I'm trying to understand how to relate these videos, new data, to the company website so that Google will take into account all of the new data it publishes. The data currently resides on Google Drive. Let's see. Ultimately, the question is, is there a way to connect these requests for the videos to the company website and leave the videos on Google Drive, resulting in higher Google rankings due to the videos? Or would the videos have to be stored on the company domain and the client requests to download directly from there? So with regards to videos on the pages, there are two aspects there. On the one hand, if we can recognize that the video is hosted on the page, we can show this page in video search, which is one way to get a bit more traffic there. On the other hand, just having a video won't make your site rank higher. So when it comes to normal organic search results, just having a video, even if we recognize it, doesn't change your site's rankings. So this is something where I would look at it more around: do I want to have these videos shown in video search? And if so, how can I make sure that technically Google is able to pick these up and recognize that these videos are there? From a technical point of view, what you can do is perhaps use a video sitemap to let us know that this video is embedded on this page, and the files are hosted here. Another thing you can do is to make sure that the video embed that you're using is as kind of default as possible, in the sense that we can definitely look at that and pick that up. You can't easily test to see if we can recognize that the video embedding is working well, but you can look in the search results at kind of the video snippet that we show, or you can switch to video mode, and if your pages still show up there, then that's a sign that we're able to understand this connection between the videos and the pages themselves. But again, it doesn't change the rankings in the normal organic search results just by having videos embedded. I'm in the process of moving from HTTP to HTTPS. I want to disallow crawling certain portions of the old site and remain visible in HTTPS.
So what can I do there? So in general, the important part is that we can recognize that this is a site move from the old site to the new site. And if you're blocking us from crawling the old site, we can't recognize the redirects to the new site. So from that point of view, that's kind of where our guidelines come from, saying we should be able to crawl your old site as completely as possible. And on your new site, you can do whatever kind of robots.txt exclusions you want to do. So that's kind of where our recommendations come from there. If you absolutely need to block crawling on the HTTP site for specific parts, then that's obviously not something we can prevent you from doing. But it does make it a bit harder for us to process kind of this site move in that regard. I was having 26,000 404 errors shown in Search Console. I fixed them. Day by day, the number is lowering, but it's still not completely zero. So what's up with this? What can I do to kind of make this count disappear? The important part, perhaps, to keep in mind is that for a lot of these aggregate reports in Search Console, we look at the current information across the website. So if you fixed all of these errors and we can recrawl them and see that they're resolved in the meantime, then you'll see that we'll recrawl a bunch of them in the beginning. So you'll see kind of a drop in the graph. But it's going to take quite a bit of time for that graph to actually go down to zero. So I would assume this is something that might take a couple of months for it actually to settle down into the new position. The good thing here is that 404 errors don't cause problems for your website. We don't think your website is lower quality because of 404 errors. It's a completely normal state for URLs that used to exist that they might not exist anymore in the future. So that's completely fine, not something that I would really worry about. Just to add to that, also in the event that Google thinks there are 110 million URLs in the index, but the amount of 404s is 186 million? That can happen. I mean, essentially for every website, if it's set up properly, we can find an infinite number of 404s, right? It's like if you do a bigger site change and you take out an infinite calendar, for example. Then that's something where we could have known about a ton of URLs that are kind of in this calendar, because we can crawl to the year 9 million. But that doesn't mean that it would be in any way problematic. We have a lot of volatile URLs. But I was just checking, and I thought so. But again, it's nice to hear you say it. Yeah, great. John, on that one, one of the things that does seem to confuse people a bit, or concern them a bit, about Search Console and the crawl errors is that sometimes they're genuine, where there really are still links to a 404 page on your site. But a very, very large number of the crawl errors that are reported are actually historical ones. They might be from a site that hasn't existed for two years, but Google has still cached, effectively, the fact that that site linked to you at some point or other. Do you know if there are any plans at all with Search Console to kind of try and separate that out, so at least people can see live, real 404 errors from sites or pages that still exist and link to their site, as opposed to pages that haven't existed for a very long time that link to their site? Yeah, there are definitely explorations that the team is doing there to try to bubble up the more important 404 errors.
So at the moment, it's shown with, I believe, the priority information there with the crawl errors, where we take into account things like: was it in a sitemap file? Did it receive traffic at some point recently? Is it linked from a bunch of other places? And we use that to kind of bubble up the 404s that we think are more important. So if you go to the 404 error list and you check the first couple, and they're all, like, completely random URLs that you removed a couple of years ago, then you can kind of be sure that there's not something really important hidden below in the list. So that's something we already kind of do there. But I mean, we get this question, like... All the time, exactly. Yeah, so that's something that I know the Search Console team has been looking at as well, to try to make it possible to really recognize which of these errors are actually critical and which of these errors are kind of like, well, we just want to let you know that we found these errors on your website; probably you don't really care about these. Yeah, I mean, there's always questions, because, I mean, Googlebot often goes off on a bit of a jolly, quite legitimately, and decides to kind of recheck 10,000 links, and then suddenly people get emails saying your crawl errors have gone up, or they see it in Search Console. But all Googlebot's doing is crawling historical ones, which don't really mean anything. And you see these kinds of questions come up all the time. So if there were anything that could be done from the Search Console point of view, that would probably help people a lot. Yeah, definitely. OK, thanks. There we go. John, do the 404s use up the crawl budget? Yes, we do use that with regard to crawl budget. But I think the good part there is that really for almost all websites, there is no issue with regards to crawl budget. Because if there are many 404s, does it reduce the crawl budget or something like that? We do take that into account. But the thing with regards to crawl budget and crawling in general is that we try to focus on the URLs that we think are important first anyway. And if we have leftover capacity, then we'll kind of run through the old 404s or kind of double check other things on your website. So it's not the case that it's pushing anything out. And how does it work with new URLs, new pages, which are OK? Yeah, and we recognize those normally when we crawl the website. So that's something we try to prioritize there. OK, good one. Thank you. All right, let's take a break here. Next week I'll be in Munich, at SMX Munich, if anyone is out there. In two weeks, I believe Gary and Maria are at SMX West in California. If anyone is out there, feel free to come and say hi as well. So probably in two weeks, I'll have to shift the hangouts a little bit, because I'll also be in California. But afterwards, we'll try to get them back on the normal time frame as well. All right, so with that, I'd like to thank you all for joining, for all of the many questions and discussions. And hopefully, we'll see each other again in one of the future hangouts. Thank you, John. Thank you. Thanks, thank you. Bye, everyone. Bye.