All right, welcome, everyone, to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I am a Webmaster Trends Analyst here at Google in Switzerland. And part of what we do are these Office Hours Hangouts, where webmasters and publishers can jump in, ask any questions that they're interested in around web search, and we can try to bring the answers. A bunch of questions were submitted already. But if any of you want to get started, feel free to jump on in. Yes, John, can I start? Sure. So in a previous Hangout, we discussed a situation with a 302 hijack. And the loophole seems very serious. I sent you an email with all the details, but unfortunately, I think that maybe it went to your spam folder because of the links, et cetera. So, as I said at the time, it's about a serious loophole called the Google 302 hijack, or Google proxy hack. And it appears that it affects not only me, but many other people. It's about content being mirrored, not duplicated through copying or something like that, but mirrored in iframes or some other kind of frames. And Google is fooled into treating this content as the original content. The cached version of this new domain shows my original domain name. And the content is cloaked, so I cannot send any kind of DMCA takedown complaints. The problem is that this new entity starts to rank for some reason. And as I said, the cache shows the original domain name, which is even more confusing. I have found three tools that can do that. And basically, they cost below maybe $50 or something like that. So anybody can download and use these tools in order to achieve something like that. And in our case, it's about a website that's 900 pages, and all the rest of the victims that I've seen have a potentially similar situation. So I have tried almost everything, but from everything that I read online, it's something that we have no control over. Matt Cutts said in 2011 that apparently this loophole was fixed so nobody could use it anymore. But these tools are sold online, and apparently they are used. And the offender in this case has hundreds of domains at his disposal. And in phase two, he uses that content: he actually shows it to visitors on other entities, mixes it with his own content and monetization, and that content ranks, and those entities do receive traffic. If you share your email with me again, maybe I got it wrong for some reason, I can give you these tools. So my idea is maybe you can look at what these tools are doing, reverse engineer them, and close that loophole. It's not about me personally, that example. It's about closing something that can potentially be abused by people, and in this example, is being used successfully by the offender. Sure. I'll copy my email into the chat. Feel free to send me some of the details there. I'll take a look at that with the team. In general, these kinds of things are fairly old, and usually they shouldn't be causing any problems. What sometimes happens is, if you look at the cache page, it will show the canonical that we actually chose, which would be the one that we would actually be showing in search. So it sounds like some of that might be working as it should be, and some of that might be confusing. So, happy to take a look at that with the team again. OK, I can also open up a thread in the Google forums, please. Yes, cool. All right, so I should have lots of information there. That sounds awesome. Cool. All right, let's jump in. Let's see, where do we go?
Is unnecessary white space above the fold, pushing content down, a problem, possibly worthy of an algorithmic penalty? The white space is caused by the site design, not intended to push the content down. What screen size does Google use to determine above the fold for mobile and desktop? White space like that is absolutely no problem. For us, it's more of a problem if above the fold is all advertising for other sites, where you actually go to that page and you think, well, why did I end up here? None of the content that I was looking for is actually visible on this page. So that's more of an issue, not white space or design elements like a big header image, those kinds of things. Generally, those aren't a problem for us. What is Google's view on site-wide sidebar links? Is this good or bad for SEO, especially with sites that have a lot of pages? That's perfectly fine. That's a UX element that you can use on a web page. And if it works for your web page, if you've seen that users are able to navigate your site better that way, that's perfectly fine. It definitely doesn't cause a problem from our side. As with all of these things, it's worth doing some A/B testing for your users and focusing on what works well for your users, rather than kind of changing your site's navigation just to match what Googlebot might be interested in. Because generally, if it works well for users, then that's something that's really important for us, too. All right. And a long question from Chris about Android Headlines. We took a look at this site a few times. I think we touched upon it in one of the previous Hangouts. And I believe you have a forum thread as well about this, where other people are chiming in. In general, I don't see anything really problematic here. Essentially, these are search algorithm changes as they happen over time. And if you're seeing changes that have happened over, I don't know, the last year or so, then generally, those are just normal organic search changes as they would always happen. And that's something where there is nothing specific we'd be able to point out and say, this line in your HTML is a problem for us, or this is a bug in our code. It's more a matter of us kind of ranking the site as we think it's generally relevant. And some of that might also play in with regards to how we think the site overall is relevant. So especially, I don't know if this is something that you're doing, but a lot of these kind of phone technology sites seem to be doing it: if you're aggregating content from a lot of other sites, if you're just rewriting news that has appeared on other sites, that's something our algorithms might not consider to be the highest quality content. So the more you can really focus on making sure that, across the board, the content on the website is of the highest quality possible, the more likely we'll be able to kind of show that a little bit more visibly in search. The other thing that I ran across when going through these escalations with some of the other people here is that, for a large part, it comes across as your site being kind of as good as a lot of the other sites. So I think that's a good step. But on the other hand, at the same time, it's kind of just as good as all of the others. From a user point of view, why do we especially need to have your site in there as well?
If we go to the search engineering teams and say, well, the 10 search results we're showing now are pretty good, but here is this other one that's just as good, they don't really have any incentive to say, OK, we'll swap out those search results and use this one, because it's just as good as the other ones. So the more you can really kind of take a step back and try to find an angle that significantly sets your site apart from all of the others, so that when we go to the engineering team, we can say, well, these 10 results are pretty good, but this one is the one that should definitely be number one for this kind of query. Then that's something they can take into account and say, oh, yeah, you're right. This is really clearly the best out there, and we need to make sure that it's on top. We'll take some time to figure out what is actually happening here in our algorithms, so that we treat the site, and the content that it's producing, appropriately. And those are things, of course, when you're in a niche like this that is very competitive, that are very hard to do, but it's not impossible. So I'd really recommend trying to take a step back and thinking about what you can do to significantly take your site to the next level, which might be new content that you produce, new angles that you look into, maybe less content that's already out on other sites, all of these things to really make sure that your site is really unique and not just similar to the other ones. In "Links to your site", does Google take the links that are in the href or the destination page? Sometimes I see that a comment is also added in the links to your site. I don't know about comments, but generally, for these features, we try to take the canonical of the origin page and the canonical of the destination page, and we show those as the ones in "Links to your site". So those are the pairs that we would use for features like this, where we say, well, this link goes from here to here: we would take the canonical that is chosen for the source and the canonical of the destination, and use those to highlight that in these tools. Hey, John, sorry to interrupt. What's the order we should go through the questions? Because my question was right in between all of these ones, and I was wondering when you will get to it. I have a secret ranking algorithm. That's it. All right, it's not just Google Search then. Yeah, let's see how far we can go. And otherwise, towards the end, we usually have time to go through more questions that come up live. Why do search results sometimes show the publication date and time instead of the last modification date and time? Wouldn't it be useful for the user doing it the other way around? That's something that we sometimes argue about with the dates team. But I see there are good arguments both ways. And in our algorithms, we don't always pick one or the other as the one that we would show there. So sometimes we feel that the original date makes sense to show, and sometimes it makes sense to show the last modification date, because we know that something significantly changed on this page that affects what the user is looking for. So I think there are arguments that could be made for both directions. And that's kind of why we try to be a bit flexible there with the algorithms. What are some ways to measure customer intent? I don't know. I don't have any good, short, clip-worthy advice with regards to how to measure customer intent. That sounds like a pretty big topic.
So it's probably worth digging into some literature and other sites that have been researching this in the past. I got a message in my Search Console that my website has been enabled for mobile-first indexing. Not sure if it's related or not, but my website disappeared completely from the search results for most search phrases. I can't find any error. My website is responsive and usable on mobile phones. How can I find out what happened to my website's rankings? That should be more of a coincidence rather than related. But if you want, you can definitely send me your site's URL, and I can double-check to see if there's anything from a technical point of view that might be stuck there. All right, where can I send it to? Oh, you're here. That's awesome. OK, let me just pop my email address into the chat. Thank you very much. Cool. It's not that it dropped a few positions in the rankings, it just completely disappeared on most of the keywords. And yeah, my website, the domain is like 12 years old, and it used to be one of the top players on the market in this area. So that's why it's so strange that it just vanished somehow completely. Cool. Yeah, I'm happy to take a look. I don't know if I'll have anything specific for you. Maybe these are just normal changes as they happen. Yeah, I would expect in this kind of situation that there would be some errors or something in the Search Console, right? Because it's not a matter of just dropping a few positions, it's just completely gone. I don't know. Yeah, if you're still here towards the end, maybe we can take a quick look through it. All right, thank you. Cool. John, sorry, is it OK if I ask my first question? Would that be OK? All right, let's go for it. Great. You know, it's to do with the CSS side of stuff. I think it's obviously been said a few times that with mobile, you might have things that are temporarily hidden. You might have, I mean, a hamburger icon or something to expand stuff out. And that Google is going to be intelligent enough to know what is temporarily hidden on mobile, so you're still ranked for those things, albeit maybe not weighted quite as much when it's hidden to start with. I mean, I think I've asked the question in the past about how the algorithm will know the difference between content that is hidden but will appear when you click on something that displays it, versus it being abused by people that just stuff huge amounts of content underneath that, and it never expands. And I think your answer at the time was that's what the engineers are working on. But I've got quite a specific question around that to do with e-commerce sites, if that's OK. I mean, I run an e-commerce site. But I think what some people do is this: when you have a description for a product on mobile, you often don't want to display every single part of that description. I don't just mean the text. You might have dimensions. You might have some specifications. And often what you find is all this stuff is initially collapsed, and then you will click the title and it will expand, and that's how you see it. But there kind of tend to be two ways, or there are more ways than that, but there are two ways that this is often done. One is that you click it and it will expand it out, just standard JavaScript, kind of a display:none type of thing. Excuse me. And the other way is that, because people don't want to have that hidden by default, they will put that content at the bottom of the page.
And then what actually happens is you will click the title and it just does an anchor type of thing. It will scroll down to the bottom of the page. So they kind of effectively achieve the same thing. But I think the latter is worse for the user, because the idea of having kind of all the content at the bottom of the page and scrolling down the page doesn't feel right, but at least the content is there by default to start with. Whereas the first option, where it actually expands out, makes more sense for the user, to me, but it does mean from a Google point of view that the content isn't actually displayed when the page is first rendered. So my question for you is, of those two options, is there a particular one that you would say is better? I think from Google's point of view, both of those would be equivalent. That's something where you probably want to do something like A/B testing for your users, to figure out whether your assumption that scrolling down is really problematic holds, or maybe it's not that problematic. I don't know. I don't have any insight into that kind of thing, but from a Google point of view, both of those would be equivalent. So Google wouldn't, for example, slightly devalue the content based on the fact that something was hidden to start with? That wouldn't affect things in any way? No. OK, brilliant. In terms of the user side, that's fine. My question was more whether a page that had it initially hidden would kind of suffer slightly compared to a page that actually had it displayed, or had it at the bottom of the page. No. If it's in the HTML, in the sense that when the page is loaded, it's there, even if it's not visible, then that's fine for us. Fantastic, brilliant. Thank you very much. Cool. All right. My website has been around for a number of years. We recently revamped the site. Through this process, we updated all the page dates. The issue is many sites have copied our content over the years. When searching in Google for certain phrases from our site, we're seeing scraped content ranking above us. Can these instances of copied content result in a Google algorithmic penalty for our site? No, no. So you wouldn't see a negative effect on your site just because other people have copied your content. Obviously, it's tricky when copied content is ranking similarly or even above your website, but you wouldn't see a negative effect just because someone has copied content from your website. What tools do you recommend to find copied content? In general, content is copied all the time on the web, and our systems are pretty used to it, and we can generally deal with that. So unless you're explicitly seeing problems with regards to that copied content, then I wouldn't necessarily spend too much time focusing on it and trying to dig it up, because you'll always find various places where content is copied. Be that purely from a technical point of view, in that someone is maybe temporarily keeping a copy of your page, or keeping a copy of your page and referring to that, or maybe it's copied for technical reasons, in that you have things on kind of a staging site, those kinds of things. All of these different kinds of copied content are really common on the web, and if you try to dig in and find all of those versions, you can usually just keep digging and digging forever, and it's not really that productive. So instead of digging and digging forever to try to find all variations of all of the copied content, I'd recommend just focusing on the ones where you're actually seeing problems with it.
And if you're not seeing problems with regards to the copied content, then I would generally just let it be and kind of move on. You have other things that are more important, that are actually visible, with regards to your website. With regards to scrapers ranking above your website, sometimes that's a sign that overall we have trouble understanding the quality of your website. In particular, if a website has content in different variations of quality, where you can say, well, here's some really fantastic content, but here's a lot of really kind of iffy or maybe problematic content, then it can be difficult for our algorithms to figure out where we should be showing this website across the board. So that might be one thing to look at there. The other thing might also be that there are issues on our side, where we're trying to do the right thing, but for whatever reason, especially if you just recently revamped your website, maybe we're having trouble understanding your website, maybe we're having trouble understanding which site this content actually belongs to. And for things like that, I'd recommend maybe going to the Webmaster Help Forum and posting some of the details that you're seeing. So ideally things like fairly generic queries where you're seeing that a page that essentially is just copying your content is ranking well above your own website. And then people there can take a look at the different pages that are involved and give you a little bit of advice with regards to what you might be able to do there. That might be technical things. That might be quality things that you could do. That might be things like reporting it to Google in specific ways. It could even be that maybe there's something that you could do from a legal point of view, where you could do something like a DMCA complaint and take care of it that way. But there are various things that could be involved there. So I'd really recommend trying to get some input from people who've gone through this process before. And like I mentioned in the beginning, pretty much every website has copies of its content out there somewhere. So a lot of people have experience running into this frustration that their content is copied, and with different approaches that they've taken to try to resolve that. Can I use a canonical tag for my web pages that are active in other countries? I mean, the content is already live on our website, and I want to use a canonical tag on some pages of my website that are active in other countries. So I'm not quite sure how you mean that. In general, for content that you have specifically written for other countries, I'd recommend using something like the hreflang link between those pages. And when you use the hreflang link between pages, ideally you would keep the canonical tag on that specific version. So the hreflang link element tells us that this page here, maybe in English for the US, is equivalent to this page here in English for the UK. And because we understand this equivalence and we can trust that link between those two pages, we can show the appropriate version in the search results. And for that, it's important that we are able to index both of those versions independently. So if you use a canonical tag, the US English one should point at the US English one, and the UK English one should point at the UK English page. And that way, we understand that this is the canonical US English page and this is the canonical UK English page, and we have the hreflang link between those.
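To make that setup concrete, here is a minimal sketch of the kind of markup being described, using hypothetical example URLs. Each country version keeps a self-referential canonical, and the hreflang annotations link the alternates in both directions:

```html
<!-- On https://example.com/en-us/page (US English version) -->
<link rel="canonical" href="https://example.com/en-us/page">
<link rel="alternate" hreflang="en-us" href="https://example.com/en-us/page">
<link rel="alternate" hreflang="en-gb" href="https://example.com/en-gb/page">

<!-- On https://example.com/en-gb/page (UK English version) -->
<link rel="canonical" href="https://example.com/en-gb/page">
<link rel="alternate" hreflang="en-us" href="https://example.com/en-us/page">
<link rel="alternate" hreflang="en-gb" href="https://example.com/en-gb/page">
```

The key point is that the canonical on each version points at itself, so both versions stay independently indexable, while the hreflang pairs tell Google they're equivalents that can be swapped by country.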
And then we can swap those out on the fly. So that's kind of the approach that I'd recommend there. We have some really good Help Center content on hreflang setups, so I'd definitely take a look at that as well. Sometimes getting started with an international website can be really tricky. So if this is something that is really critical to you, I would also recommend getting some help from maybe some consultants who have gone through this process before, who can give you some tips with regards to technical setups that you could be doing, or even whether or not it makes sense to set up an international website in your specific case at all. I see our site ranking when I display 100 results per page, but when I switch to the default 10 results per page, it's not ranking. Why do sites rank or not rank based on the number of results? This is just happening on desktop. Just changing the number of results to display changes whether we get an impression or not. So with regards to the last part of the question there: we count this as an impression when we show the site in the search results. So if a user opens a search results page and doesn't see your site, even though your site might otherwise be ranking on that page, if the user doesn't see it, we don't count it as an impression. So it has to actually be visible. With regards to 100 results per page or 10 results per page, I don't know exactly what might be happening there. But I suspect this is something that settles down over time, where we're maybe not quite sure which way we should be showing your site in the search results, and over time that will generally settle down. What also plays a role in all of this is that we do experiments all the time with regards to our search results, essentially A/B testing, like you would on your website as well. And it's very possible that when looking at things in one way and looking at things in a different way, maybe you're seeing different experiments. And that could be resulting in this temporarily visible or non-visible type of thing that you're seeing there. I have a client whose parent pages have links to different subpages, which are all different in nature. For example, a parent page can be everything about London, and the subpages could be about hotels, restaurants, et cetera. Should we open these subpages in new tabs, or should we just open them all in the same tab? That's totally up to you. From a search point of view, that doesn't change anything. Would it affect the bounce rate of the parent page? Please point out any other impact that I might be missing here. Bounce rate essentially depends on how you calculate the bounce rate for your site. People calculate bounce rates in slightly different ways. I don't know, for example, what happens when a user opens one page and spends 20 minutes reading that page and then closes it again. Is that a bounce, or did they get the information that they need? Essentially, what you measure as a bounce is totally up to you, and that's something that you can use as a metric to determine if your pages are doing what you would expect them to do. And for some pages, it's perfectly normal that a user goes in, and after 20, 30 seconds, they have the information that they need, and they leave. That's perfectly fine. Other times, a user might go to a website and click around a lot and still not get the information that they're looking for, which is a sign that maybe something is wrong.
So from our point of view, the bounce rate there is something that is more relevant for you, to figure out where you can improve things on your site, than for us from an SEO point of view. So the bounce rate in Google Analytics doesn't technically hurt you? We don't use the information from Google Analytics at all for search. So whatever you're collecting there is totally up to you, and if it's useful for you, great; it's not something that you need to worry about for search ranking. Cool. Thank you. Thank you. Cool. An author bio link, is that really something that Google would consider as a parameter for ranking? Not that I'm aware of. So we used to have a program called authorship, where you could link to your author profile, and we'd be able to show the information that we know about you. But as far as I know, that hasn't been used for quite some time. So that's essentially something that you can use on your website. Oftentimes, it makes it a little bit easier for users to understand the context of whatever you've written there. If you have a link to an author profile that shows you really know what you're talking about, you go to conferences to talk about this, you've written about this in various places on the web, then that sometimes gives them a little bit more trust in your content. Or maybe they just want to contact you and give you some tips. I don't know, ask for a link, probably. But all of these things might be reasons why you might have an author profile page that you've linked from your content. We recently launched AMP at the product page level, and we're wondering if you have any tips to accelerate the crawling and indexing other than XML sitemaps. We haven't implemented hreflang on the AMP pages, but we have it on our non-AMP pages. So generally, there are two things that happen with regards to AMP. On the one hand, when we reprocess the traditional page that has the link rel="amphtml" on there, we'll automatically go and crawl the linked AMP HTML page as well. So normal crawling of your normal pages will automatically trigger the AMP crawling for those pages as well. So the sitemap file is actually a really, really good way to tell us about this. Especially if you can tell us the modification date, because with the modification date, we can recognize that something significant changed on this page, and we should go check it out. And that might be just that link rel="amphtml" that you added to those pages, which could be a good reason to update the modification date. So if we crawl your normal web pages, then we'll find that link, and we'll go and crawl the AMP HTML page as well. If the AMP page is already indexed, then when we show it in the search results, we generally also trigger a re-crawl after that, to make sure that we've updated the AMP cache on our side. Finally, what also plays into this a little bit is that we try to reduce the problems that we cause from crawling on your server. So in particular, if we can tell that your server is getting bogged down by our crawling, then we'll try to back off a little bit. We don't want to cause any problems with your server, so we'll reduce the amount of crawling that we do. And if that's the case for your website, and you've just added AMP across all of your pages, then that obviously means we're even slower at picking up all of those AMP pages, because we have to crawl the normal web page, and then we have to crawl the AMP page. And that's already kind of two URLs instead of one URL.
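As a rough sketch of the two discovery mechanisms just described, with hypothetical URLs: the canonical page carries the amphtml link that triggers crawling of the AMP version, and the sitemap entry's lastmod date signals that something significant changed on the page:

```html
<!-- On the canonical page, e.g. https://example.com/product/123/ -->
<link rel="amphtml" href="https://example.com/product/123/amp/">
```

```xml
<!-- Corresponding sitemap entry; the lastmod could be the date
     the amphtml link was added to the page -->
<url>
  <loc>https://example.com/product/123/</loc>
  <lastmod>2018-09-14</lastmod>
</url>
```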
And if your server is limiting us in the number of pages that we can crawl per day, then maybe that makes it harder for us. So it just makes it take a lot longer than it would need to otherwise. One way you can kind of estimate whether this is the case is to go into Search Console, into the Crawl Stats, to see the number of pages that we crawl per day. That gives you a bit of an idea, where you can guess whether this is about reasonable or not. And another thing you can see there is the response time, so the time it took on average for a page to respond. That's not the same as the time it takes to load a page in the browser. So things like PageSpeed Insights don't necessarily play into that, but rather just the pure individual requests that we make to the server. We say, I want this HTML page, or I want this one image, or I want this one PDF file. And on average, the time that it takes to get that information back. And it's hard to give any kind of guideline on what number you should be aiming for there. But generally speaking, the sites I see that are easy to crawl tend to have response times there of maybe 100 milliseconds to 500 milliseconds, something like that. If you're seeing times that are over 1,000 milliseconds, so that's over a second per file, not even to load the page, then that would really be a sign that your server is really kind of slow, and probably that's one of the aspects that's limiting us from crawling as much as we otherwise could. I have a question, since we are talking about pages and crawling. Sure. I recently joined one of my clients as their SEO. So they have 826,000 desktop pages. And they have no sitemap for now. And they have 93,000 mobile-friendly pages and 443 AMP pages. So I wonder, what's a good strategy for them going forward? I would say with a site of that size, it definitely makes sense to look into sitemaps, and to generally get a better understanding of the pages that are currently being crawled and the pages that are actually indexable or not. So something like looking at your server log files to see which URLs are being crawled can be really insightful. And taking all of that to try to create an audit of the pages that are actually relevant for the website. And based on that, you can make a sitemap file. You can say, well, these are the pages I really care about. These are the pages I don't need to have crawled. With that, you can also think about canonicalization, redirects, internal linking, to make sure that those important pages are all findable and easily crawlable. So I think, especially with a site of that size, it's worth trying to get a really good understanding of the URLs themselves. OK, in the last few of your videos, I heard a lot about AMP, Accelerated Mobile Pages. So since we have just 443, should we be working on that part also? Maybe. I don't know. The thing is that the client, my boss, they never did any SEO. I'm the first person there. So it's like, no sitemap, they don't know, they don't care about it. They were too busy with their business and everything. I think that's a good position to be in as well. So what I would do is really try to understand the pages that they currently have. And then based on that, you can make a more educated decision on, does it make sense to move a significant part of those to AMP, or is the current setup OK? That's something where you really need to kind of take stock first and see: this is what we have. Making these pages AMP is a lot of work.
Whereas making those pages AMP is pretty easy; we can just install a plugin, maybe. Do I do it here? Do I do it there? Do I just wait, because maybe we will change to a different infrastructure, or maybe these pages are actually bad and they don't need to have an AMP page? I think you shouldn't just blindly move to a new technology because it's out there, but rather think about the bigger picture of where it fits in with your website. OK, thank you. It would be easier if we could just say everything must be AMP. But nobody has time to do everything. So you always have to figure out, where does it fit in? Is it really something that makes sense for your website or not? Yeah, there was another question. Google recently introduced speakable markup. But how do we know, with the speakable, like for home devices, like Alexa, where you give a command and then the device will play something for you, is there any way to find out the volume of those commands people are giving to their home devices, like Alexa, for example? I don't think we have any stats on that. Yeah, because if we are going to do something or spend time on implementing it, we need to know if somebody is actually asking the relevant questions. Yeah, I don't have anything on that at the moment. I think there are two aspects there. On the one hand, it would be useful to know how many people are doing it overall. And on the other hand, it would be interesting to see how many people are seeing content from your website like that. And I know that's something that we always push for. When we talk with product teams and they come up with new features, we always tell them, like, if people are going to implement this, they're going to want to know what the return was for them. So they're definitely aware of this, but I'm not aware of anything specific that's planned there. I think it's also still very early days for a lot of these things. So making changes across a website just specifically for that, in the hope that you get a lot of traffic, I think is a big bet. Some of these new technologies pick up and get really popular, and suddenly everyone is using them. And sometimes they just last for a couple of months, and then people have moved on to a new variation. So I think there's always an aspect of trust, or a bet almost, with regards to whether you think that for your content this makes sense. And similarly, is this something that you could easily implement, where you could say, I don't know if it will catch on, but it's like half an hour of work for me, I'll just try it out. I think that's an approach worth taking, too. Thank you. Can I jump in with just a quick question? Sure. It's OK, it's still around indexation, kind of the same topic. I'm just going to send you a quick link in the chat. That's the first example. You'll notice the link in question. We've run a site: command, and it shows it's not indexed. And then the second URL I've just sent over. Let me just do that again. We run a general query in Google, and it's showing it is indexed. So my question is really, this particular URL, is it indexed or is it not? In Search Console, it shows it's a duplicate and it's not the canonical. But I can't really understand why it's ranking for a general phrase if the site: command shows it's not being indexed, so it's not in the database. In the second example, just after the local listings, the first organic result, that's the exact same URL that we've run the site: command on. Cool. And I have no idea.
Ta-da-da-da-da. So if I search for it, I get a slightly different one. So one thing I see when I look at the cache page is that it shows the /en page, whereas I think you linked to the /en-us page. It should. That particular result. Yeah, so /en-us is the example. So we are using a slight modifier in the URL. In the site: command, it's the /en-us version. But in this example, in the live example, I've got the /en-us as well. So there is an /en, a /en that exists, but in both of those cases, in the examples that I've given, it's the /en-us for both. It should be the /en-us for both. I'm looking at them both now. OK. So I don't know whether or not your search is changing the results slightly. I'm not really sure. Yeah. So what might be happening is that we have an hreflang setup, I guess. Is that possible? It looks like something hreflang-like. But what can happen in some cases is we understand the relationship between the different versions of these URLs, and we understand that they're pretty much the same, or maybe exactly the same. So we'll deduplicate them, and we'll still use the hreflang to show the other URL. So if you search for the content, you might see that URL. But if you check whether that URL is actually indexed, something will say, well, it's a duplicate, we don't have it indexed. So that's something that, from our point of view, is kind of working as expected, even though it's sometimes pretty confusing. So in both examples, in both the site: command and the general search query, we're using the same search engine, which is the US, google.com. So I would have thought that for both, it would return the same URL. It wouldn't. If we're using the google.com site: command, I wouldn't expect it there, because it knows and understands the hreflang tags, for example. So if anything, it shouldn't really have to swap in anything or swap out anything, because it's explicitly defined in the URL. So that's what throws me the most. If it was the reverse, I'd understand. But because it's not showing the US version, and yet we're saying that that version is being used in an American search engine, it seems like it doesn't make that much sense. I think what we have indexed is just the /en version. So just the /en. And we swap it out for the /en-us version. And that's why the site: query doesn't show that, because it's not the one that's indexed. And in the search results themselves, we can show that, because we understand the link between the hreflang versions. So I don't know, but if we did a site: query with just /en, then it would show. Yeah, it would be there. That's interesting then, because I would have thought that if it's showing on the site: query, then it's indexed. Are you saying that it would not show that? Are you saying it would show the URL even though it's not indexed, because that's what it's doing? Yeah, the site: query wouldn't show it. I mean, the site: query is somewhat of an artificial query, so I wouldn't necessarily use it for diagnostic purposes. I'd use the URL Inspection tool, if anything, to see if it's really indexed, or if it's seen as a duplicate or not. URL Inspection. In Search Console? Oh, right, yeah, yeah, OK. The new tool. If you go to the new Search Console, right on top, you can paste the URL in and it'll tell you. You would use that method to see if it's indexed, rather than the site: command. Exactly, yeah. I mean, the site: command is nice because you can send it to anyone to try it out.
But the URL Inspection tool is really the one that's looking at the current version of the URL, the currently indexed URL, or, well, maybe not indexed, or duplicate, or whatever, of that URL. OK, so you're saying basically use that tool because it's more accurate, whereas this might not necessarily operate as we'd expect if we use the site: command. OK, but from your standpoint, the /en-us version has been crawled but not indexed. But you will place it in a general search query potentially because of the hreflang tags. Exactly, yeah. OK, interesting. OK, that was just my question. I'm sorry it took longer than I thought to get it across. But OK, I'll stop there. I mean, this is quite confusing, because you'd think if it's not indexed, it would never show up in the search results. But especially with hreflang, it could show up. Yeah, you're saying that it will show up through hreflang even without being indexed. OK, that is interesting. I never knew that, but there you go. That's why we're here. Thanks, John. I appreciate it. Cool. All right, let me run through some more of the questions that were submitted. Maybe we'll make it to the bottom, or maybe not. I have a bit more time afterwards, too. We have over 40,000 pages with medical information going back to 2001. Many of these are outdated. Would you recommend that we remove them altogether, or add a canonical tag to a newer article on the same topic? I think both of those options are possible. So if you think they really shouldn't be online anymore, then remove them. If you'd like to point to a newer version, then maybe put a banner on those pages and a rel=canonical pointing to the newer version, or even just redirect directly to the newer version. All of those are options that could make sense there. What should we implement in our website content in order to get featured in the Google answer box or Google featured snippets? Any suggestions? In general, I wouldn't explicitly aim for some of these search features, because they're still fairly new and they can change fairly quickly. So if you're doing something just for one of these search features, then keep in mind that it may change over time as well. Whereas if you're doing things that make sense for your website for the long run, then generally speaking, that's something where you kind of know that you'll have value out of it for the long run. So it's almost like a strategic question: do I bet on this feature sticking around long enough for me to have a positive return, and will Google even feature my content there? Or should I continue working on my website for the long run, maybe not see the short-term gains, but more long-term gains? That's kind of a strategic decision you could make. We have a forum, and lots of people just delete pages after the discussion is complete, and they turn into 404 pages. What should we do? Should we redirect them? Should we just disable the delete option and make it an archive? The forum is used by a million users monthly for legal information. I think this is something that you need to figure out with your users, primarily. From our point of view, it can get a bit confusing if we index pages and then, after a short period of time, they're no longer there. But it's normal on the web for content to disappear. So it's not that you need to do anything from an SEO point of view just with regards to those 404 pages. I feel it's more a matter of what you could do to get value out of the content that people are posting to your website for the long run.
So is there something that you could pull out? Is there a way for you to get more, maybe, evergreen content out of these forum threads, even if people delete the individual entries? How do you balance the desire of people to keep their legal problems private versus your desire, maybe, to be found as a source of legal information? That's something that I think is not that straightforward. I see that in our webmaster forums as well, all the time: people will come in, ask a question, the community will get quite involved, and they'll try to find solutions and different approaches to solving a problem, spend a lot of time on that. And in the end, the original poster decides, well, I don't want this problem to be public anymore, I'll delete my question. And then the whole thread is gone. From my point of view, that's ultimately their decision. But I know for the community, that's always frustrating. They spent a lot of time, and then suddenly all of the work that they put in to try to help this person solve this problem is gone. So it's something where you always kind of have to find a balance. Our site is a progressive web app with a near flawless Lighthouse score, massive social presence, high quality content. All our on-page SEO is optimized. It's built in React and universally rendered, both on the client and the server side. So from my understanding, Google shouldn't have issues crawling the content. The site has been featured in publications, and we've even been approached by some teams from Google as a part of the Assistant beta program. Yet search engine traffic is almost non-existent. What can we be doing differently? So I took a quick look at the site, and I agree, it looks nice. It's really fast. But in general, I think this is one of those situations where, from a technical point of view, you might be doing a lot of things really well. But that doesn't necessarily mean that your pages are always relevant for users when they're searching for something. So the technical side of things is one thing. The general quality side of things across the website, and kind of the match between what you're doing and what users are looking for, that's something completely different. And some sites are really fantastic from a technical point of view, and not so good with regards to quality. Some sites are really fantastic from the quality point of view, and their technical side is really terrible. And our algorithms try to figure out, what is the right approach here? What should we be showing to users? Should we be showing things that are closer to what they're looking for, even if they're technically not that great? Or should we just be showing something that's technically really great, but doesn't really answer their question? And I think for the most part, we would tend towards trying to show a good answer, even from a less technically fantastic website. However, if you have a great answer and a technically fantastic website, that's always a win as well. And that's something where you generally see that users come back and go to your website again if there's something there that is really easy for them to look into, if they get a lot of information there, if they find that they're able to browse through your content very quickly because you have a fantastic, fast website. Then that's always an additional value that you bring into that.
But you really need to make sure that from the quality point of view, from the content point of view, you're really on top of the game, especially in a niche like the one you're in, where you have quotes on a website. These quotes are available for millions of websites to use; there are quote databases that you can download and just copy and paste into your website. And you really need to make sure that you're providing something that's significant, unique, and compelling, not just the same thing with a different technical back end and front end attached to it. So that's something where I'd recommend not purely, blindly focusing on the technical side, but really looking at the quality as well, and thinking about what you could be doing slightly differently to make sure that your website is really one that answers the user's questions. And especially when it comes to quotes and kind of this big database of content, it's also important that you have the basics of SEO done fairly well. Things like titles and descriptions. That's one thing I noticed when I looked across your website, across some of the pages that we have indexed for your site: sometimes the titles and descriptions are essentially just filling in the blanks with a simple, I don't know, script that's copying and pasting the individual words from one part of the page to another part. And those kinds of search results don't really look that compelling. So even basic things like titles and descriptions can make a big difference. But especially if you're in a niche that is very competitive, you need to be more than just the same as the others. You need to be more than just technically really fantastic. We've identified over a thousand instances of copied content, and we believe that because of this our website suffered an algorithmic penalty. The DMCA tool only accepts one URL at a time. Is there a way that we can bulk upload the URLs? As far as I know, you need to submit them individually. But kind of as I mentioned before, it's worth thinking about which of these URLs are actually problems, and which of these URLs might just be findable if you search explicitly for them, but don't actually show up above your website in the search results. So in most cases, it's not like a thousand URLs are ranking above your website for these individual queries. For city-to-city queries, sometimes people want to see the distance, and sometimes they want to see the time taken. Does Google take the data from other sources to determine results for these queries, or can only authority pages rank? For some city pairs I see bus websites ranking, for some other city pairs, flight pages ranking, and for some pairs I see distance-type informational pages ranking. Tell me which pages will rank and I'll make them, essentially. I think this is not something where we have our algorithms hard-coded to say, if someone is looking to get from one city to another city, then we will always show them these specific types of pages. But rather, we try to figure out what makes sense for the user, and that's something that can vary depending on what you're looking at. So that's not something where there's a simple, straightforward answer that you can always use and that will always make your pages rank for these types of queries. So, loosely on that then, with hreflang tags: does that come in after you've done the ranking weighting? So someone searches for a query,
Google understands what the page is about, how relevant it is, gives it a weighting, and then does it swap out the URLs? Because I think that can even happen in that process without both of these pages being indexed, right? Yeah. Okay, that's interesting. It's way at the end. So essentially we put together the normal search results, and then we double-check to see whether we have different URLs that we should be swapping in there that would be a better fit for the user. We do the same thing for mobile and desktop as well. On mobile, it's a lot more common that the mobile version isn't actually indexed, but we know that it exists, so we can swap out the URL. Okay, but then in terms of content relevance, you'd have to be looking in that language for that query, right? So would that then override the hreflang tags anyway, if that has been indexed? It depends on the user. You would look at relevance in one language, then push that forward, and then swap out the URLs for the right language, right? Yeah, it really depends on the query. And that's something I find sometimes a bit frustrating, because it always throws me off, especially with technical terms, where kind of the technical term is the same thing in different languages. I'll search for the technical term, and I expect to see English pages, kind of the original documentation on that, and it always shows me German pages, because my browser is set to German. So it's something where I'm searching for the word in English, but actually it's a technical phrase that's the same as the phrase used in German. So it's kind of normal that we would swap these out. City names are also really common like that, or kind of brands, product names. That happens a lot there, too, where it's the same word, but it's used in different languages. Okay, cool. Thank you, John, again. Let's see, we have three more questions. So let's jump through these. I've seen new errors popping up in my structured data console for breadcrumbs, stating that there's no value for position, and Google says it's required. But my structured data is a list, so why do you need the position? We need the position because, from a schema.org point of view, as far as I know, we don't have any sense of kind of order within the page. So these elements are essentially completely independent. It's different, I believe, with the breadcrumb markup in microdata, or whatever the other one was, where we can pick up the structure based on where it is within the elements on the page. But with schema.org, we need to have information about the position, so that we can pick that up as a clean breadcrumb; there's a small markup sketch of this just below, after the next answer. I have a client that recently switched from an old to a new domain, in 2015. Well, okay, maybe not so recently. And the previous SEO company built a lot of questionable links to the old domain. There are thousands of links still reported by the backlink checker tools, and I would like to disavow them, since the old domain is 301 redirected to the new one. However, the old links pointing at the site are not showing in Search Console. Could this be because there is no site to crawl? So what is probably happening is we're showing the old links on the new domain, because if you're redirecting the old domain to the new one, then we would generally pick the new URLs as canonical. So from our point of view, those links would go from the source page to the new canonical of the page that they used to be linking to. So within "Links to your site" for the new domain, you would probably see those links there as well.
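Coming back to the breadcrumb question from a moment ago, here's a minimal BreadcrumbList sketch in JSON-LD, with hypothetical names and URLs. Since JSON-LD markup carries no inherent on-page order, each ListItem needs an explicit position property:

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Travel", "item": "https://example.com/travel/" },
    { "@type": "ListItem", "position": 2, "name": "London", "item": "https://example.com/travel/london/" }
  ]
}
```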
And then the speakable, I think we talked about that as well. Will there be a direct link to the website with the answer in the Chrome Omnibox? I don't know; I can't speak for the Chrome team. I imagine that would make sense, so that you could go there and take a look as well. But I don't know what the plans are there. I know that the Chrome team focuses quite a bit on having a clean UI. So that's probably a tough balance for them, with regards to how much room to dedicate to showing a long URL in addition to the answer. I don't know what the approach there will be. If you have strong feelings on this, I would strongly recommend going to the Chrome forum and letting the team know about that. So the Omnibox, as they call it, or whatever they say it's powered by, will now show you answers directly in the address bar. So how will that kind of affect search? Already, when it starts to fill in things, you can kind of lead users into a query. And with the results right there, what's stopping them from not even pressing Enter once they've got the result? And then how do... I don't know. Good question. I think that's something that's worth looking into as well. I know the Chrome team is not focused on preventing people from visiting websites. It's a browser, essentially. So it's in their best interest to make it easy for people to go to these websites to get more information. So that's something where, if they find the right balance with regards to what kind of answers would make sense to show like that, I think that could be pretty neat. But at the same time, it might also be that they run into situations where the answer that they show immediately in Chrome is either not correct, or users might want to have kind of more detailed information, or it's too limited, in that just providing this answer is not enough to satisfy what the user is looking for. And I would assume that's something that the Chrome team will be fine-tuning over the long run. But I actually haven't had a chance to play around with this feature, so I don't really know what it looks like. The one place I have seen these kinds of immediate answers is if you enter simple math equations, it'll give you an answer, which I find really useful. But I don't think that's something where we would point at websites anyway. It's like, I don't know, this long number divided by another number: what website content could be really relevant there? I don't know. So is it kind of like how the Apple Watch is going to be displaying the web on there, in that it might just be very, very generic queries, or maybe very specific links that people are sending? I don't know, that's a good question. But the nice thing about Chrome is they have their preview versions available, so that anyone can download them. I believe for the most part, you can run them in parallel to your normal version. So you can probably try this out and see what it's like. And if you find a lot of interesting queries that you think should be handled differently, then give them feedback, or write a blog post about it and kind of drive the discussion on that. I think that's a pretty fantastic direction there; whether or not it catches on, whether or not it kind of finds the right balance there, I don't know, we will see. I have a short and quick question. We never did SEO or anything for decades; the site has been around for, like, 30 years now. So there are tens of thousands of backlinks to our website. And there are so many small, crappy websites.
I heard a lot from you and others that there are websites which are, like, directories, or which are spammy websites, maybe penalized by Google in some way. So what should we do about those websites? Should we just ignore them and act like we don't know anything, rather than reporting them? For the most part, unless you're aware of kind of regular activity that a previous SEO or someone in the past has been doing for your website with regards to links, I wouldn't worry about that. That's something where, if a website has been around for a long time, then it has links from all kinds of crazy places. And even if you look at those pages and you go, like, I don't know why this website is linking to our site, or it looks like someone did a typo, or dropped a bunch of URLs into a forum and included links to our site as well, these are things that we see on the web all the time. So unless you're really aware of kind of dedicated activity, of someone going out and buying a lot of links and really doing a lot of things that are against our webmaster guidelines, then I wouldn't really worry about this. It's interesting to take a look from time to time, and maybe you can spot things that you should have cleaned up in the past. But for the most part, this mix of weird web pages everywhere linking to other web pages, I think that's just a normal part of the web. It's not something that you'd need to go through with a comb to clean out all of the slightly unnatural ones that you run across. I have a quick question on that matter. So we have a client with a disavow file that we are preparing for them now. And the new Search Console is actually reporting very few links compared to all the third-party tools. So the third-party tools are giving us an extreme amount of links, and it will take a lot of time to audit each one of them. Would you recommend using just the data from Search Console? Is that just a sample? Because we have a debate here in the company. Some of us think that these are actually the links that matter. I'm on the other side; I think that this is just a sample, because apparently you don't want to give us all of that data. So what's your stance on that? And from another point of view: we can afford to buy these third-party tools for the link audits, but a small business, if it needs to sign up for all the three or four major companies that do these link reports, that's a lot of money. So it doesn't make much sense for a small bakery, for example, to do a link audit by paying these big companies. So would you recommend using just the Search Console links for the disavow, or spending the time to audit everything that we have data for? For the most part, the Search Console links would be enough for a disavow. I think... yeah, so for normal websites, even if you've been doing something sneaky with links in the past, the Search Console links are enough for a disavow, and for a manual spam action as well. If you're a really large website and you have a really big problem with regards to links, then sometimes third-party tools make it a lot easier to find all of those links, to aggregate them, or to test them and see which ones of these are already noindex or nofollow, and which ones are really problematic. So if you recognize that you had an issue with directory submissions, a third-party tool might go out and check the titles of these pages and say, oh, here's, I don't know, 1,000 URLs that have "directory" in the title; therefore, maybe these are ones that you should put in your disavow file.
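For reference, the disavow file itself is just a plain-text list, one URL or domain per line, with # marking comments; a hypothetical sketch based on the directory-submission example:

```
# Questionable directory submissions identified in the link audit
domain:spammy-web-directory.example
domain:paid-links.example
# A single problematic page rather than a whole domain
https://forum.example.org/thread/123#post-9
```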
So for really large sites, or if you're doing this on a regular basis for a lot of clients, these third-party tools can certainly make sense. There are some really great tools out there. But on a normal basis, for a normal site, the Search Console links should really be enough. Any tools you can recommend? I hear a lot of things about the more well-known link tools, and they sound pretty good to me. I don't use any of them, so I can't speak from first-hand experience. There are lots of really great people behind these tools. The technology behind them is sometimes pretty complex, and I think they're doing a pretty good job with setting these up and getting that data. I imagine all of these tools will have a slightly different focus, because they use different ways of getting that information, but that's fine. Depending on what you're trying to do, you might need different approaches. Or it might be enough to say, well, I understand this tool really well and I can work with it, and it makes me productive. So even if another tool has maybe, I don't know, 10% more links or 10% different links, you still have what you need and you're able to be productive, so maybe you stick with one of them.

Do you use Lighthouse data from the extension, other than the PageSpeed score? Everything else that's shown there, do you use that as a ranking factor, or is it just there for the user experience report? It's mostly for the user experience report. We use different metrics for the mobile speed ranking: on the one hand, some calculated lab metrics, like the Lighthouse tests would be, and on the other hand, also some real-world metrics, like the Chrome User Experience Report has as well. So what we use for search and what we have in the tools doesn't match completely. Part of that is also just because we want to have a little bit of flexibility to adjust what we think makes sense for search, and that doesn't necessarily map exactly to what we have in any one of the tools.
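That lab-versus-field split is visible in the public PageSpeed Insights v5 API, which returns both kinds of data in a single response: the calculated Lighthouse run under "lighthouseResult" and real-user Chrome User Experience Report data under "loadingExperience". A minimal sketch, assuming the public endpoint and a placeholder test URL; note that the field data block can be absent for pages without enough real-user traffic.

```python
# A minimal sketch of the lab-versus-field distinction via the
# PageSpeed Insights v5 API. The tested URL is a placeholder.
import json
import urllib.parse
import urllib.request

api = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
params = urllib.parse.urlencode({"url": "https://example.com", "strategy": "mobile"})

with urllib.request.urlopen(f"{api}?{params}") as resp:
    data = json.load(resp)

# Lab metric from the calculated Lighthouse run
lab_fcp = data["lighthouseResult"]["audits"]["first-contentful-paint"]["displayValue"]

# Field metric aggregated from real Chrome users, if available
field = data.get("loadingExperience", {}).get("metrics", {})
field_fcp = field.get("FIRST_CONTENTFUL_PAINT_MS", {}).get("percentile")

print("Lab FCP:", lab_fcp)
print("Field FCP p75 (ms):", field_fcp)
```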
John. All right, yeah, one last question. Can you hear me? Let's do it. Yes. No, I can't hear you anymore. They muted themselves. Oh no. Okay. Heard something. Yeah. Oh no. One last try. Nothing? There you go. Can you hear me? Yes. Okay. So something I thought about was the Panda guidelines that you have. Have you thought about integrating those with the general webmaster guidelines, kind of to expand the quality guidelines a little bit? I don't know. That sounds like something we might want to look at. Yeah. This seems to be such a critical part of how we can understand objectively how to improve, because we're dealing with a lot of subjective approaches to what people want, right, whereas you guys approach it from an objective viewpoint. So having those integrated together and kept updated would help us. I look at the last update, and I hear a lot of people, a lot of big sites, saying about the core update, hey, we have great content, great links, but we're not showing up so much anymore. A lot of sites may have dropped by half or more. And I understand some of it is a relevance factor, right? Maybe the site shouldn't be ranking for these terms over here; that's not really good for users. But I think in some cases, maybe the site could be doing something better. And, as you said, the web is changing, but people want to be able to keep up with what's changing, so some sort of guidelines that are kept up to date would help. I mean, has anything really changed in terms of how you perceive good content since those guidelines were written years ago? Probably. I guess the one aspect that I'm thinking of at the moment is the quality rater guidelines, which aren't directly ranking-factor related, but there's a lot of good information in there on what we think people could be watching out for with regards to good websites, or maybe websites where the quality is not so good. But maybe that's really something that we should be tying in together with the webmaster guidelines, so that it's easier to find these additional types of information, and so that it's not just for those of you who already know about this, but also for those who are stumbling across it for the first time. That's a good point. But was that core update rolled out to everything, or is it just kind of a subsection of the internet right now? No, if it's a core update, it applies across all of our search results. Okay, I'll save my other questions for next time, it's late, thank you. Cool, all right. Yeah, we're kind of over time, so let's take a break here. Thank you all for coming. Thanks for all of the questions that were submitted, and all of the questions here from the Hangout. It's been really insightful; I'm always learning something new. And those of you who want to send me some more information, feel free to go ahead. I don't know if I'll be able to respond to everything, but I do forward these on to the teams that are involved, so someone will be taking a look at the details there too. I'll set up the next Hangouts probably later today. So if there's anything else on your mind, feel free to jump in and drop those questions there. Or, of course, come and join us in the Webmaster Help Forum, where folks are always welcome to drop by, ask questions, answer questions as well, and kind of discuss these topics too. All right, with that, let's take a break here. Thanks again for coming, and I wish you all a great weekend. Bye. Bye. Yeah, we appreciate the extra time. Thanks, see you. Have a good weekend.