All right. Welcome, everyone, to today's Google Webmaster Central office-hours hangout. Happy New Year, everyone. My name is John Mueller. I am a webmaster trends analyst here at Google in Switzerland, and part of what we do is talk with webmasters and web publishers like the ones here in the Hangout, like the ones that submitted a bunch of questions already. So as always, if there are any of you who are relatively new to these Hangouts and have a burning question on your mind that you really want to ask, feel free to jump on in and ask away.

Question.

All right. Go for it, Baruch.

So Google, the disavow list, there are sometimes domains or URLs that you can't disavow. In other words, the text is really messed up. One of them is a Russian URL that I'm trying to disavow, but it won't let me, because the text, I don't know what it is, but it's just saying you can't disavow the specific URL. And it's like the letter R in their language is different. It gives me an error. It won't let me disavow the actual domain.

That should actually work. The tool uses UTF-8 files, so if you can generate a UTF-8 file with those letters in it, then that should work. What you can also do is use the Punycode version of the domain name, so the xn-- version with all the normal letters in there. You can use that version in the disavow file, which makes it a little bit easier to maintain, because at least I don't have those characters on my keyboard, so it's something I'd otherwise have to copy and paste and hope that it works right.

OK, I'll try that. I've just never seen that before. I mean, I did go to the WHOIS and generate that and then go back, and I don't know. But I'll try that. Thanks.

Yeah, those Cyrillic characters are sometimes tricky. But the same thing applies to all other non-Latin scripts. So if it's Arabic, if it's right-to-left, or Chinese, Japanese, all of that: if you want to disavow something, then either copy and paste it and save it as a UTF-8 file, or use the rewritten Punycode version of the domain.

Amazing, thank you. I have one more thing I'd like to ask about this. As far as I understood, you are not able to disavow redirects. So is there any way to tell Google that a redirect pointing at your site is not good, that it's not something you set up or paid for? Say somebody has some kind of problem domain that redirects to your site, is there anything you can do? We hope so.

I don't think at the moment there is a way to do that, because the disavow tool is really specific to links, which is where we pass those signals. And a redirect is really rarely something that we see any problems with, where anything bad is happening.

Thanks.

All right, let me run through some of the questions that were submitted. As always, if you have questions in between, feel free to jump on in, and we'll try to make some room for more general questions toward the end as well.

First one here: I want to know if there's a difference in how you choose to add a noindex or nofollow tag, like lowercase or uppercase, with a comma or as separate tags. Which one is recommended? How long would it take for the change to take effect?

From our point of view, any of those work. They're essentially the same thing; none of them is faster or better or slower than any of the others. Uppercase or lowercase is fine; separate tags or one tag with a comma-separated value, that's perfectly fine too. So that's not something where you need to do anything specific.

With regards to the opposite, we sometimes hear that as well, like should I put an index or follow on my pages as a meta tag, and you definitely don't need to do that. We essentially ignore the index and the follow tags because that's the default behavior anyway. So for the robots meta tags, we essentially only add up the negative tags, and any of the positive tags can be ignored because that's the default already. So you really only need to use noindex or nofollow, or none if you want to combine both of those together.
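To make that equivalence concrete, here is a minimal sketch of the variants; the markup below is hypothetical page markup, and any of these forms should be read the same way.

```html
<!-- All of these express noindex plus nofollow; case, commas, or separate tags don't matter -->
<meta name="robots" content="noindex, nofollow">
<meta name="robots" content="NOINDEX,NOFOLLOW">
<meta name="robots" content="none">      <!-- "none" combines noindex and nofollow -->

<!-- Two separate tags have the same combined effect -->
<meta name="robots" content="noindex">
<meta name="robots" content="nofollow">
```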
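And going back to the disavow question at the start: a minimal sketch, using a hypothetical Cyrillic domain, of converting an internationalized domain name to its Punycode (xn--) form so it can sit in a plain disavow file. Python's built-in codec implements the older IDNA 2003 mapping; some newer domains may need the third-party idna package instead.

```python
# Hypothetical example domain; the "domain:" line format is the one the disavow file uses.
idn_domain = "пример.рф"

# Convert each label to its ASCII-compatible (Punycode) form.
punycode_domain = idn_domain.encode("idna").decode("ascii")

print(f"domain:{punycode_domain}")
# -> domain:xn--e1afmkfd.xn--p1ai
```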
We're a digital magazine active in 30 countries, and each of them has a specific third-level subdomain, country.domain.com, with local news by local writers. And it goes on and says: we are serving our website through Akamai as a CDN, and we get local IP addresses. Is that a problem? Is that something they need to change?

No, that's perfectly fine. Lots of sites use CDNs; we can deal with that perfectly fine. With regards to geotargeting, if you're using a generic top-level domain, and in your case you mentioned domain.com, probably as a placeholder, but if it's a .com, then you can set geotargeting yourself, and we will use your preset geotargeting from Search Console instead of anything else that we try to figure out. So a local IP address is definitely not something that you need to have. In the age of CDNs, everyone has a local IP address anyway if they're using a CDN, so that wouldn't really make sense. So a local IP address is not something you need to worry about.

But a lot of people still think that with a local IP they can rank really well, John. I mean, it's still out there. There are so many articles still about it: get a local IP, or do this, or do that.

Yeah. You don't need a local IP address, just to be totally clear. It can help to have an IP address that's close to your users if your server otherwise has a lot of latency, but you can do that with a CDN just as well.

Are relevance and quality interconnected? Like, if I improve the relevance of a specific piece of content for the users, does it automatically improve the quality of my website, or is there no direct relationship between the two?

So this kind of goes into how Google actually ranks things, and a lot of that we can't really tell; we don't talk about those details. But relevance in particular is specific to the user and the user's query. So we see these as separate values that we look at separately, separate kinds of ranking factors. We would say, perhaps for this user, for this query, the relevance is kind of like this, and the quality of the website is kind of like this. And another site might have higher relevance and slightly lower quality. Understanding how the end rankings come together is something that's really tricky, especially if they're conflicting like that: high relevance for the query but lower quality, or lower relevance and a higher-quality website. How do you rank those two? That's something where there isn't really an absolute answer, so it's something that we're always working on to try to improve. And it's not just relevance and quality; we look at a whole bunch of different signals, and we have all of these different graphs. Combining those and figuring out which one of these factors is more important at the moment for this user's query, that's really tricky. And that's kind of what makes search interesting, because this can evolve over time as well.

Do localized words in the URL influence ranking for local websites?
For example, having URLs with French words for a website with French content, will that somehow influence the ranking in France compared to websites with English words?

No. For the most part, that won't influence the ranking of a website. We do look at the URL a little bit to try to understand what the site might be about, but we look at other factors much more. So if you have keywords in the URL, or if you have an ID, or if you have an English word in the URL, that's not really that critical. For users, that can sometimes make a difference, though. If they see your URL in the search results, or written out on a piece of paper, or in a forum, or somewhere else, and it's all written in English, and they're French-speaking users, then they're probably going to go: this looks like an English website, probably I won't click on this, probably I won't try to go to that website, because it looks like something that's not really that interesting to me, maybe in a language that I don't fully understand, or maybe I just don't like English websites at all. So balancing what users are looking for and what makes sense on your site is always something webmasters should focus on, not specifically for SEO, but thinking about your users and how they would feel when they see this URL.

When you sometimes say that the quality algorithms don't check something on a page-by-page basis, like the amount of clicks for a single page, does that mean you might check that on a site-wide level?

Maybe, maybe not. It really depends a lot on the different algorithms and on the different factors, so that's not something where I'd say, well, if I say it like this, then it means exactly this. There are lots of differences there. But especially when you're looking at clicks, that's something where we primarily do that to understand which of these algorithms are working well and which ones we need to improve on.

That's a change from what you used to say. Now you've added "primarily", but I don't know what the secondary uses are.

What did I say? Uh-oh. Okay. I'm not really sure what you're referring to, but...

You said "primarily", so that's what you primarily use it for. That means there's something secondary that you use it for, too. Because otherwise you would have just said that's what you use it for.

Yeah. I don't have any more specific details to add there. That's something where we use a lot of these factors in different ways. So if we primarily use something for this, then that doesn't mean that other aspects are ruled out. And especially when it comes to understanding which algorithms are working well, we try to take into account a lot of different things, because just looking at the raw numbers for any algorithmic change can sometimes be misleading. For example, if you show, I don't know, Pokemon figures to someone who likes Pokemon Go, then they'll always click on those in the search results, even though they might not be really good search results. So you can't blindly look at just one metric isolated from everything else.

How can you differentiate between a website that's a really popular website and a website that's not popular at all but a real authority in its specific niche?

That kind of goes into the previous aspect as well, where we look at these different factors and we try to figure out, based on those, how do we show this for individual queries, for individual users?
And that's something that can differ over time. That can also differ across users, across location, personalization, all of that can come into play. So this is something where there's no single, easy answer for how we differentiate between those two. It's not really something where I have a clear answer that I can give you there.

Can you tell us something new? SEO tips for 2017?

I don't know, something new...

I love it.

I don't know. All I have is, like, the year. You should put the year on your website so that users know, when they click on your pages, that it's a current website.

Timestamps are very important. Timestamps on articles are very important. I mean, I get frustrated when I can't tell how long ago an article was written. So having a timestamp with a date is important, no?

I find it really useful as a user to know how old a piece of content is, but from a search point of view, we look at so many different factors to try to understand the age of a piece of content that just putting a date stamp on there isn't really going to change anything. So from an SEO point of view, probably less of an aspect; from a usability point of view, definitely a good thing, from my point of view at least.

So more markup, then?

Markup, I mean, you can mark that up as well, but it's something where we probably pick it up from other places already. Yeah.

All right: we have some blog content which we currently reverse proxy to appear in a subdirectory of our main site, so example.com/blog. This is seemingly a best practice according to SEO people we've talked to. We'd like to host the blog on a separate subdomain like blog.example.com to avoid cookie-stealing attacks if the blog software is compromised. So what's the best practice here?

From our point of view, you can use either one. You can use a subdirectory, you can use a subdomain; both of those will work. We'll be able to deal with both of those and be able to rank them appropriately. I don't think there's anything from an SEO point of view where I would say this is the way you should do it, or that is the way. Sometimes there are clear technical reasons to do it one way or the other, and that's obviously what you have to work with. So in your case, if you're doing this for security reasons, that's a good reason to do it, and that's definitely something I'd consider a positive change here.

The main thing I'd worry about is really just that you're doing a kind of partial site move, and these things always take a bit of time to settle down. So I would do something like this during a time when you're not really dependent on the search traffic specifically through that section of your website. In this case, you're moving your blog to a subdomain, so maybe that's not as critical. But if you were moving your e-commerce site or your main marketing landing pages to a different subdomain or a different subdirectory, then obviously that's something that could have an effect on your overall site's traffic from search. So that's something I'd consider. And past that, if you're already talking to SEO people, then I'm sure you're following the best practices that we have in our help center with regards to site moves and partial site moves. So I'd double-check that all of that is set up properly, do it at a time that makes sense for you, and just get it done.

So especially if this is a security issue, if this is something that you're doing to improve the quality of your website, the security of your website, then that's something that I would say overrules anything that's maybe potentially an SEO factor where you have like a tenth of a percent more or less. Those issues are much more important than anything from an SEO point of view.
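For reference, a minimal sketch of the reverse-proxy setup described in that question; the hostnames and paths are hypothetical, and the same idea applies whether the blog ends up under /blog/ or on its own subdomain.

```nginx
# Hypothetical nginx config on www.example.com: serve a separately hosted
# blog under /blog/ by proxying those requests to the blog backend.
location /blog/ {
    # The trailing slash on proxy_pass strips the /blog/ prefix before forwarding.
    proxy_pass https://blog-backend.example.net/;
    proxy_set_header Host blog-backend.example.net;
    proxy_set_header X-Forwarded-Host www.example.com;
}
```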
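And on the timestamp and markup exchange earlier, this is one common way article dates are marked up with schema.org structured data; the values are hypothetical, and as noted above Google will often pick the date up from other places anyway.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example headline",
  "datePublished": "2017-01-06T09:00:00+01:00",
  "dateModified": "2017-01-06T10:30:00+01:00"
}
</script>
```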
As an e-commerce affiliate website, I understand we need added value to help our site rankings. We're thinking about displaying price comparison information for renting our product compared with renting the same product from competitors, which we believe is useful for the users. Do you agree this would be classified as added value, and how best should we display this information for Google to recognize its usefulness, so that we get a benefit from what we're doing?

So from Google's point of view, from an SEO point of view, if we can recognize that these pages are unique, then that's what we're looking for. And the indirect effects of providing extra value for users, that's something that you would get from talking to your users and understanding how they respond to this. These kinds of things are probably changes that aren't trivial to make on your website, so I'd maybe set up a small user panel and discuss this change with them, go through different options with them, maybe even do some live A/B testing on your website, to figure out how you can best do it so that users actually appreciate those changes, so that users are willing to send their friends directly to your website rather than to one of the other websites that's offering the same products as an affiliate.

Hi, John, can I just jump in with a quick question?

Sure.

The mobile interstitial penalty, I think, went live today, or the blog post about it got updated. How long do you think before we'll start seeing the effect of that? Will there be some kind of propagation, or will it basically be seen immediately?

Usually that takes a couple of days to propagate across the different data centers. So I would expect that over the next couple of days you'd start seeing these changes.

So major fluctuations?

Sorry?

Major fluctuations?

Major fluctuations, I hope not too many major changes, because we announced it quite a while ago. We double-checked some of the sites that we were going to flag, or where we were going to see changes from this interstitial change, and they did fix their interstitials. So that's kind of a positive change, and hopefully something that helps users actually get to the content a little bit faster.

Hello, John. Looking at internal linking reported in Search Console, would you say that the homepage should be the most linked-to page within your website? If not, how visible should it be?

I don't actually know. I pretty much never look at the internal links feature in Search Console, because that's something that, from my point of view, just kind of works for most websites. I think it was different in the early years of the web, where people built sites on their own and maybe they messed up the internal linking sometimes, because they were doing static HTML pages that lived on their own. But nowadays, if you're using a common CMS, then that CMS is already pretty much crawlable from the beginning, and it's not really something where I'd say you need to worry about the internal linking of all of your pages. So for the most part, I wouldn't worry about this too much.
I assume in most cases the homepage would be the most linked-to page on a site, but I wouldn't fret too much if it's not listed as a top item within the internal links feature in Search Console.

I have a related one here.

OK.

It's somewhat related to this one: what about external links? I discussed this on stage, and was told that it's expected to have most of the links going to the homepage, which doesn't sound quite right to me. But what happens if in a certain case this is not true? Like, you have a few very popular inner pages that have more backlinks than the homepage. Is there anything wrong with this?

Yeah. So I guess the question is, what about external links: is it a problem if I have more external links going to a lower-level page than to the homepage? From our point of view, that's perfectly fine. That's not something you have much control over. Sometimes there are lower-level pages that are just really popular, and everyone is linking to those lower-level pages and they don't link to your homepage. So that can happen. That's not really something I'd worry about.

Thank you.

Hello, John. This is Jesse. Can you hear me?

Yes.

Thanks for inviting me to the Hangout. I'm working on thestar.com, the Toronto Star. And we've been having an issue since about September, which is about five months after our migration to HTTPS. I'm not sure if that plays a role, but a lot of our brand terms, when we're searching, are bringing up thestar.ca, which is a domain that fully 301s and is canonicalized to thestar.com. And I can't figure out why our homepage in particular would be turning up for a search like "the toronto star" with thestar.ca instead of the HTTPS thestar.com, when we've done the 301s and the canonicals. And I checked the Google cache for that page, and it looks like Google sees thestar.com. So I'm just trying to figure out what steps we can take to try and correct that situation, because everything below the homepage, our subsections, the sports page, the GTA page, these kinds of things, is properly showing the HTTPS .com versions.

Yeah, I think you posted in the forum as well a while back, is that possible?

Yeah.

So I looked at that specifically with the team that's working on the internationalization. And from their point of view, that's something that our algorithms are picking up at the moment, so it's not something that you'd be able to directly influence on your side. We're basically understanding that these URLs are equivalent, and we think the .ca version is the one that users would be more interested in seeing in the search results. And from the feedback I've been getting from people, that's not always the case. And in your situation with HTTP and HTTPS, it's probably even less the case that you'd want to see the .ca version in search. So I've been in touch with them to see if there's something that we can do to improve that on our side. I don't have any update at the moment, but I know they're looking into that issue.

As a Toronto resident, I prefer .ca, so that's fine.

Yeah, but it's not on HTTPS.

Yes. So there's nothing we can do on the .ca server, like signals I might be able to look for? It's purely just something that we have to wait for you guys to look into?

Yeah, yeah.

OK. All right, thank you very much.

Sure.

The New York Times is also not on HTTPS, so I don't know. I mean, is that a problem for 2017?

Well, I mean, if you see the non-HTTPS version in search, then you might think, oh, this is not on HTTPS. So you want to have the HTTPS effect there, too. So I totally understand this desire to have that swapped around. I think in your situation, I'd continue to ping the forum thread every now and then to make sure that it doesn't fall off our radar. But it's something that I've been meaning to double-check with the team as well on our side.

OK, thank you very much.
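For context, a minimal sketch of the kind of setup described in that exchange, with hypothetical domains: the alternate domain 301-redirects everything, path included, to the preferred HTTPS domain, and the destination pages carry a matching rel=canonical.

```nginx
# Hypothetical: example.ca is the alternate domain, www.example.com is the preferred one.
server {
    listen 80;
    listen 443 ssl;    # certificate directives omitted for brevity
    server_name example.ca www.example.ca;
    return 301 https://www.example.com$request_uri;
}
```

The pages on www.example.com would then each carry a rel=canonical pointing at themselves, which is essentially the setup the site owner describes already having in place.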
John, just quickly on the HTTPS thing, a bit of a random question, and you may have answered this before, but things change. Would having an HTTPS site versus an HTTP site help with Panda issues? So if Google is looking at a site and thinking that there are some quality issues. The reason I ask is that you often point to Amit Singhal's list of things that the average person, and hence Google, would consider as things that would help you trust a site. And one of them is, would you put your credit card details in, and that kind of stuff. And obviously HTTPS is one of those things, so just out of interest.

I don't think we would use that for Panda.

Thanks.

All right, let me run through some more of the submitted questions, and then we can open up for more general Q&A from your side as well.

I heard you say that structured markup isn't a ranking factor. But how will Google view a website that doesn't have any structured data versus one that has a lot of structured data? Doesn't this give Google more information?

For the most part, we really don't use structured data as a ranking signal. Sometimes it helps us to understand the concepts on a page a little bit better, to understand the relevance based on what the user is searching for and what you have on your pages. So it's not so much that we would rank that site higher, but we would understand it a little bit better and be able to show it for the more appropriate queries, which can indirectly, of course, have a ranking effect as well, in that it would be visible for those queries. But the primary element with regards to structured data is really the visible aspect with regards to rich snippets, rich cards, the structured markup that we would show in the search results. So that's where the primary visible element would be.

Can subfolders get a penalty, for example just /fr/ compared to the whole website? If so, can the messages get lost? We have a website where two geotargeted subfolders have seen big drops in traffic.

Yes, subdirectories and subdomains can have manual actions as well. That's something where we try to be as granular as we can, but we can't always be so granular that we do everything on a per-page basis. So sometimes we can isolate a subdirectory or a subdomain and just say, well, this problem is only visible here. A really common example is hacked content. Sometimes only one subdirectory will be full of hacked content, and we can say, well, we can isolate this here, we can filter that out. We will show that in Search Console under the manual actions section appropriately. So that's something that can happen.

With regards to messages, I'm not so sure how messages get delivered retroactively. So for example, if your site has a manual action now and you verify your site for the first time a year from now, would you still get that message? I don't think you would get that message. I think there's a time frame within which you would be able to collect these old messages, and otherwise you wouldn't. But you would always see the current status in the Search Console manual actions section.
So if you verify a site and you look at the manual actions section, that's where you see the current status of a site, regardless of when that manual action actually took effect. So in this case, if you suspect there's a manual action on a subdomain or a subdirectory, you can verify that in Search Console and double-check the manual actions section. And if there's nothing there, then there's nothing there with regards to manual actions.

Hello, John. Sorry, I just saw a question from Mihai which I had intended to ask. This is regarding thestar.com and thestar.ca. He asks if I've tried to use the site move feature within Google Search Console. I did go through the steps of using that tool without actually hitting the fourth step that says "submit this". But my question is, I'm unaware of that tool ever being used in the past, like years ago, when thestar.ca did have content on it. I don't think anyone would have clicked that button. But is that something that would be risky for me to try? Because I know when I get to the fourth stage, it basically says, would you like to move all of thestar.ca over to thestar.com? And I'm just not sure if there would be any unanticipated results from trying that.

I can't think of anything negative that would happen there. I don't know if it would fix your issue, but I don't think anything negative would happen, because your content isn't actually indexed under the .ca version; it's indexed under the .com version. So that's something where you wouldn't, or at least, from my point of view, you shouldn't, lose anything by doing that. I don't know if it would fix the issue, though. So that's something you could try, and you can cancel it if you notice that something weird is happening. I don't know. With a bigger website, I'm kind of hesitant to just randomly try things out and see if it fixes something.

OK, thank you.

Let's see. Regarding last week's "adding home to breadcrumbs" query, to which you said this has no direct SEO effect, but if you feel it aids the user, then feel free to implement it. Off the back of this comment, my response would be: if such an addition benefits the user, then surely it would have an SEO benefit, would it not?

Let me just double-check. So I think the general question was, should I add a home link to my breadcrumbs on my site? And I think it would still be the case that this would primarily be something that users would see, and not really something that search engines would worry about, because we'd have other home links on those pages in general anyway, and we'd be able to understand which is the homepage of the site. So that's something that I think would really just affect users. And of course, indirectly, if users are happier with your site and they can deal with the UX on your site better, if it's fast, if it has the content that they're looking for, then indirectly you might see an effect with regards to search in that these users might be recommending your site more as well. But there's no direct SEO effect from doing that. So this isn't something that I would do just in the hope of seeing a jump in rankings, for example.

I have a query about a site migration, old domain to new domain, and the requirements and effects of robots exclusion. If a site goes from one domain to another and happens to redirect sitemap.xml and robots.txt as well, I've heard that you shouldn't do that; you should serve a 404 or something else.
Is that a problem? So in our site move documentation, we do recommend not having a robots.txt file on the old domain, so that we can follow all of the redirects to the new one. In practice, sometimes that's easier said than done, because maybe you have everything on the same server and you're serving the robots.txt file anyway for the old domain, or you're redirecting to the new robots.txt file. So that's sometimes harder to actually implement. But having a robots.txt file that returns 404 on the old site helps us to really crawl everything from the old site. So that's something we see as a best practice. I don't know if, from a practical point of view, you would actually see a big change in a site migration by having the old robots.txt file return a 404.

For the sitemap file, we do recommend submitting a sitemap file for the old URLs as well, with a change date based on the date when you set up those redirects. Having a sitemap file like that tells us that we should recrawl those old URLs, and when we recrawl those old URLs, we'll see the redirect to the new URLs and we can process that a little bit faster. So that's a good thing to have there. Whether you have that on the old domain or on the new domain doesn't really matter so much; it's just a sitemap file for the old URLs. And that's something you can remove over time: when you see that we've recrawled everything and migrated your site, then you don't need to keep that sitemap file up anymore for the old URLs.

Why are URLs marked as crawling errors in Search Console, and how long does it take until these URLs won't be displayed anymore? Should 410 URLs be displayed with special pages for users, which tell the user the requested URL was removed and offer links to other pages within the domain?

So this is, I think, a pretty common question, in that we show these as crawl errors in Search Console every time we try to request a URL and it returns an error from your website. We do this with the idea that you look at these errors and try to figure out which ones are actually accidental 404s, URLs that you thought should be returning normal content but are accidentally returning a 404. But having URLs that don't exist, that are broken, that shouldn't work, return crawl errors is perfectly fine. It's a natural part of the web. It's a sign that your website is built properly: we can crawl URLs that don't exist, we get an error for them, and we know we can move on. We don't have to figure out what this means, whether we should drop them or not; it's a clear sign for us. So having 404s is perfectly natural. It's not something you need to suppress, it's not something you need to fix. If these URLs don't exist, they should return a 404.

410 or 404 is essentially the same thing for us. We process 410s a tiny bit faster, but in the long run it's the same thing: these URLs don't exist, so we can ignore them. If you don't have any special setup for 410s, then that's fine; I would just use 404s instead. You don't have to put any special text on these pages. If you're seeing that users are going to those 404 pages, then obviously make sure your 404 page is useful for the user, so that they don't get lost and bounce off to some other site, and so that they can still find something relevant within your website if they happen to stumble across a URL that doesn't work. Yeah, I think that's pretty much it with regards to 404s. We have a couple of blog posts as well on best practices for 404s, so I'd look into those if you're curious about what you can do there.
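To make the sitemap recommendation above concrete, here is a minimal sketch of a temporary sitemap listing the old URLs, with lastmod set to the date the redirects were put in place; the URLs and date are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Old URLs that now redirect to the new domain; listing them nudges a recrawl -->
  <url>
    <loc>https://old-domain.example/some-page</loc>
    <lastmod>2017-01-06</lastmod> <!-- the date the redirect was set up -->
  </url>
  <url>
    <loc>https://old-domain.example/another-page</loc>
    <lastmod>2017-01-06</lastmod>
  </url>
</urlset>
```

Once everything has been recrawled and the move has been processed, this sitemap can be removed again, as described above.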
Could you please advise us on the best way to set up oEmbed widgets on third-party sites so that they can be clearly understood by Google's search bot? We've rendered the page in Search Console and can't see anything wrong with it, as we aren't blocking any resources, but we just want to know if we're missing something else.

So I wasn't quite sure what you meant with oEmbed widgets, but this looks like just a normal JavaScript widget that you can embed on a website and that pulls content from another site. From our point of view, anything that you would do with normal JavaScript-based content applies to this as well, in that we need to be able to crawl the scripts that you're embedding so that we can process them, and we need to be able to crawl the URLs that you're fetching. So if you're using any JSON feeds, if you're using any Ajax requests, then we need to be able to access those feeds, those URLs, as well, so that we can actually render those pages. That means they need to be allowed for crawling, so that we can crawl those URLs, request them, and see what's there. And then, if you're using Fetch and Render within Search Console, you should actually see us rendering those widgets as well, and that's a sign that we'd be able to pick them up and actually use them for search. So that's kind of the end state that you'd be looking at: whether or not they actually show up in Fetch and Render. With regards to anything more that you need to do there, that's pretty much it.

What you wouldn't see is these widgets in the cache view of the page when you're looking at the search results. In the cache view, we generally only show the raw HTML version of a page; we don't show the rendered version. So if you're looking there, you probably wouldn't see them. If you're double-checking the rendered version of the page and that looks OK, then probably things are working OK.

I have some questions regarding hreflang. Our website is translated into 15 languages and we've recently implemented hreflang tags. Soon after implementation, they appeared in Search Console, and now we have 40,000 hreflang tags, out of which 230 have errors. But when I check some of these pages, they seem OK. What's up with that?

So this is something that I'd say is kind of normal, in the sense that with hreflang, we need to be able to confirm it from all sides. And if there's any page within that pair that we haven't recrawled since you added the hreflang links, then you would see this error. So if you're looking at a total of 40,000 hreflang tags and 200-some that have errors, that seems to me like a reasonable ratio, in that probably these just haven't been recrawled yet. And as we recrawl them over time, these errors will probably settle down. Some of this might also have to do with things like flukes during crawling, where maybe we get a 404 instead of the actual page content at some point; we would also flag that in Search Console as an issue in the hreflang report, but that fixes itself as well when we recrawl those pages.

So this is something where I think, from the ratio, you're in a pretty good state. I doubt that for a larger site with a larger number of languages you would be able to get this error count down below, I don't know, maybe a tenth of a percent or so. That's probably really hard, because crawl errors can happen from time to time, and we deal with that normally. It just means it breaks the hreflang link between one language and the other; the other language pairs are still maintained. So we would probably flag that in Search Console, but I wouldn't see this as being too problematic.
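As a reference for the hreflang discussion, a minimal sketch of the annotations involved, with hypothetical URLs; the key point from the answer above is that every page listed has to carry the same set of links back, so each pair can be confirmed from both sides.

```html
<!-- The same group of links appears on https://example.com/en/page
     and on https://example.com/fr/page -->
<link rel="alternate" hreflang="en" href="https://example.com/en/page">
<link rel="alternate" hreflang="fr" href="https://example.com/fr/page">
<link rel="alternate" hreflang="x-default" href="https://example.com/">
```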
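And going back to the oEmbed widget question above, the crawlability requirement mostly comes down to robots.txt on the widget provider's side; a hypothetical sketch, assuming the JSON endpoint the widget calls sits under a path that is otherwise disallowed.

```
# Hypothetical robots.txt on the widget provider's domain. Googlebot has to be
# able to fetch the embed script and the JSON/Ajax endpoints it calls, or the
# widget can't be rendered on pages that embed it.
User-agent: Googlebot
Disallow: /api/
Allow: /api/oembed
```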
Let's see. When it comes to large Amazon affiliate price comparison sites with tens of millions of product pages that utilize a lot of duplicate content, does Google treat duplicate content as one of the major negative signals? In our case, Google completely stopped indexing our price comparison site. We also ended up losing more than 50% of the traffic since the Penguin update. In your opinion, what would be the maximum rate of duplicate content that could be tolerated by Google without any ranking changes or quality signals?

So we don't have a maximum rate of allowed duplicate content. I don't think that would make sense for most situations. From our point of view, we really want to make sure that your site has significant, unique, and compelling content of its own, so that it provides value beyond just aggregating feeds from a variety of different affiliate providers. That's kind of the primary thing that we're looking at there. And when our web spam team looks at a website and says, well, all of this content is essentially just regurgitated content that we have already, then they might choose to say, well, it's not really worth the time and effort to actually crawl and index any of this website, and you might see changes like that. Or when our algorithms look at a site overall and they say, well, overall all of this content is essentially the same as stuff that we've already seen before, then our algorithms might say, well, maybe we can deprioritize the crawling and indexing of this site, because there are gazillions of URLs and very little value overall, so it's not really worth our time to spend too much on this.

And for that, there's no hard threshold where we'd say five pages are good, ten pages are bad. We really try to look at things overall, across the whole website. And if you're aware that your website is in this fringe area of having a lot of duplicate content and really low additional value, then that's something you should work on regardless of any changes that you're seeing in search already. That's already a potential problem that's just waiting to bite you. So really make sure that your website has something that's unique and compelling on its own, so that even without any traffic from search, you would be sure that people would be sending their friends to your website because you have something that the other websites don't have.

John, can I just quickly pick up on something you said? You said the web spam team might look at a site, have a view on it, and then perhaps bias things in a certain way because there's no quality content on there. When they did that, would that actually be a manual penalty, or is that something that they might do in the background that you wouldn't know about? Just to be clear.

No, that would be a manual action.

OK, thanks.
You would see that. I believe in Search Console it's called "pure spam". You would see it like that.

OK. I just wanted to clarify that it wasn't something that they would do that you wouldn't then know about. Whenever the web spam team would do something like that, it would always be flagged to you so that you would know about it.

Exactly. If it's a manual action, then we will tell you about it.

Thanks.

There are probably some manual actions that we would take that we can't directly tell you about. That includes things maybe around copyright or legal issues, where maybe there are aspects involved that make it a lot more complicated to actually flag that. But for the most part, if it's a manual action, if it's based on web spam, then we would be able to tell you about it.

OK. And in those rare cases where they have to do something which they can't tell you about, would they presumably review that regularly? Because if a site did something wrong, presumably it would be quite a major thing that they would have to have done to get that kind of action against them. But say they then stopped it; presumably the web spam team would be checking back at some point to make sure that it had disappeared.

Yeah, yeah.

Thanks, OK.

All right, another question here: a domain that we fully 301 is shown prominently for brand searches instead of our proper HTTPS domain. The 301'd homepage is shown at position number one. I think this is the Toronto website, so I need to double-check on that.

With the move to the mobile index, how will this impact site architecture? Is this change only around content extraction, or will it also be used to build an internal link graph within the site?

Yes, we'll try to actually switch completely to using the mobile version of the site for crawling, indexing, and ranking. That's something that I think won't be that trivial of a change. For a lot of sites, if you're using a responsive setup, then that's already kind of possible; that would be something that probably wouldn't change much for a website. But if you have a separate mobile site, or if you're using dynamic serving to serve different content to mobile users than desktop users would see, then obviously that's something that could play a role there. And I think this is a challenge that a lot of webmasters will face in the coming year, in that a lot of the SEO tools are focused on the desktop version of your site, and if you want to crawl the mobile version of your site, it's sometimes a bit tricky to actually do that. So understanding how the mobile version of the site can be crawled, understanding the content on the mobile site, the internal linking, the embedded content, images, videos, making sure that all of that is really up to the same standard that you would have for the desktop site, that's something that I suspect a lot of people will be working hard on.

When do you decide, like, OK, this is it? I mean, when should we, as SEOs, make the switch? We're already shifting to the mobile world, but when should we just stop focusing on desktop? As soon as the launch happens, should we stop focusing on it?

Yeah, I don't have a launch date. So I don't think this is something that will be happening in the next couple of months.

No, no, I'm not asking about the launch date; I know you're not allowed to say.

I don't think it would make sense to stop focusing completely on desktop for most sites. You'll always have some amount of desktop users going to your site, and as long as you have those desktop users around, it makes sense to actually make sure that your desktop site works really well, too. So it's not that you can just drop desktop completely and only focus on mobile, but rather maybe you'd be shifting subtly, kind of like what we're doing with the mobile index: we're noticing most of our users are actually on mobile, so we need to make sure that the mobile search results are the ones that are reflected well in search.
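On the separate-mobile-site point above, this is the usual annotation pattern for separate mobile URLs, with hypothetical hostnames: the desktop page points to its mobile equivalent, and the mobile page canonicalizes back to the desktop page. Sites using dynamic serving would instead signal the variation with a Vary: User-Agent response header.

```html
<!-- On the desktop page https://www.example.com/page -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.example.com/page">

<!-- On the mobile page https://m.example.com/page -->
<link rel="canonical" href="https://www.example.com/page">
```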
John, both on, I think, with Google.com and then Google Page Insights, those are the... I don't know. Yeah. There's a page to create it for, I think. If you go ahead.

John, thanks for answering that question. It was actually me who put the question up. Long time, no see. The use case is that it's actually a very large national newspaper, a very large site, millions of pages. And they're looking to make a mobile app, and they're trying to refine their mobile app to simplify it. But the issue is that it's maybe difficult to message to them that this could have implications, because they could lose substantial traffic if they simplify something and their navigation is much more simplified, where they're only showing the latest 30 stories, rather than showing thousands of stories as on the desktop site. And that's why I put the question up. And then the follow-up question really is not so much when do you think this is going to come to fruition, but are you guys going to message the implications of this? Because it sounds like it could have quite substantial implications if you are running a separate site, as you mentioned.

Yeah, so these are all things that we're still working on. At the moment, we have a really small set of sites where we're trying to see what the implications are, and we're trying to see how things settle down. Can we get all of the images? Can we get the structured data? Can we get the videos from those as well? These are all things where we're trying to figure out where the problems will be, what we can do on our side to make sure that you don't have to do as much of anything to handle that, and where our limits are, where we have to say, well, we need to get in touch with the webmasters when we see this kind of scenario come up and let them know ahead of time that this is going to be a problem, and give them some feedback, maybe some tools even, regarding how to fix that.

Do you expect that there will be a period where you guys are actually giving some best practice around this before you actually launch?

I definitely think so, yeah.

OK, cool. Thank you.

So I'm working on a set of FAQs to capture the questions that we've been getting so far, so that people have a bit more information about that. And past that, I think it really depends on how we see the experiments running and where we see the limits of what we can do on our side to work around the issues that a lot of mobile sites have.

OK. John, on that subject, how are you dealing with issues like the way that people use mobile is different from desktop? So it's one thing to generalize and say, you know, we've reached a point where far more people are searching on mobile than they are on desktop. But, for example, when you analyze an e-commerce site, you'll find that mobile will convert way worse than desktop will. And I appreciate you need to take attribution into account, and often people will initially browse on mobile and they might buy on desktop later, that type of thing.
But from our point of view, I mean, we have far more desktop users and tablet users than we do mobile, and we have far better conversion on desktop. How is that kind of site going to be taken into account? Because you're also saying, oh, we're now going to optimize for mobile. But that might mean that a lot of sites will get mobile traffic, but the wrong sort of traffic, if you see what I mean.

I don't think they should get the wrong sort of traffic. I don't think the type of traffic should be changing there. The tricky part, I think, would be if you have different content on mobile than you have on desktop. So in a case like you mentioned, maybe you have lots of long-form text on desktop because that converts well on desktop, and on mobile you have the short-form text, or the short forms to fill things out on mobile. And if we switch to the mobile index, then maybe we wouldn't have all of this long-form text content, which could be something that we might miss for ranking and indexing. And those are the kinds of things that we're looking at at the moment: how do these sites actually differ, and what would the effects be for our search results? Because obviously, from our point of view, it's not optimal either if we don't show a good site for queries where we think it would actually be relevant for the user.

Yeah. I mean, you could argue that a desktop site will generally provide more information, and that information might actually be useful to Google when it comes to indexing those pages and understanding what they do. So it sounds like you've still got quite a lot of questions to answer before you do this.

Yeah. I mean, on the one hand, that's definitely the case. But if we see that most of our users are on mobile and they click through to the mobile version of your site, and your mobile site has less content, then they would see the lesser content anyway. We would almost be misleading them by showing them the desktop content in search, which they wouldn't see on their mobile devices.

And I did get that. But I wasn't talking about it in that way; I wasn't trying to manipulate in that kind of way. It was more that Google, presumably, looks at sites and looks at pages, and outside of purely what someone searches for, Google is trying to establish what it is that those sites are actually doing. And so the extra information that you would find on a desktop site might actually be quite useful to Google to understand what that page does. I'm not talking about queries, actually, just simply understanding what it is that the site's actually doing.

Definitely. I mean, these are all things that we have to figure out, and we have to learn which mix is the best one. And even if, I don't know, in the worst extreme case, it might be that, well, this idea isn't ready yet, the web isn't ready for this kind of shift yet, so maybe we have to shelve it and look at it a year later or whenever. But these are things that we'd like to experiment with and make sure that the metrics come out well, that it works for users and it works for us as well. So you'll definitely be hearing quite a bit from us around all of these things.

Can I just ask for a confirmation here, if I can? We're using responsive design. That means there's nothing to worry about, or anything like that? Basically the same content.

Yeah. So I think responsive design is pretty much the optimal situation, because regardless of which device we use to crawl the site, we will see the same thing.
We'll get the same content, the same structured data, the same videos, the same images, the same internal links. It's kind of all there. So that's kind of the optimal situation for us. That's not completely true. I mean, a good responsive site will adapt to the size of the screen. So for example, as something gets a bit smaller, there might be some text that gets smaller, or there might even be some text that disappears. So presumably, even if you have a responsive site, there still will potentially be some things that you need to have a think about. Yeah. I mean, for the most part, we can pull that from the HTML anyway. If it's responsive, then it's just a matter of the CSS kind of like hiding it or showing it in different ways. And that's something where it's still in the HTML. We still kind of have a chance of getting that content. Sorry. Yeah. Whereas if it's a different site, then that's obviously something we might not even see at all. I understand. But even with responsive, so I mean, I'm possibly getting this completely wrong, so forgive me. But at the moment, for example, if you have things that are hidden, I mean HTML, but they're hidden, you roll over them, they appear. They're kind of standard JavaScript, CSS type hides. That in many cases, Google will give them a lower significance, or will ignore them completely. But then of course, once that site gets looked at with a mobile-first index, then Google will have to look at those things and say, actually, I might not have to give them a lower priority anymore. So there must be some changes around that kind of space, presumably. I think you and Gary have mentioned that sort of thing before. Yeah, with regards to kind of hidden, invisible content, we are looking at that a bit differently when it comes to mobile. OK, so responsive sites might see a slight difference, but we'll probably learn a bit more about this maybe when you publish your FAQs. Definitely, yeah. Thanks. All right. Yeah, it looks like it's finally the year of mobile. So with that, I need to take a break here. It's been great having you all here again. I have the next hangout set up for Friday in English and Thursday in German. I think it's Friday the 13th, right? So that would be interesting. Yeah, thank you. Morning, thank you. Yeah. 13th is the new launch of Penguin Part 3. I have nothing special planned. So hopefully no crazy launches, nothing for you all to worry about, sorry. Or maybe that's a good thing. So with that, let's take a break here. Thanks again for joining, everyone. Thanks again for all of the questions and the discussion. And see you in the next time. Bye, everyone. Thank you for answering our questions.