Hi everyone. My name is Jin, and I'm going to talk a little bit about Googlebot and web hosting. As many of you probably know, before Google can index a page and serve it to users, we have to crawl the page and render it. This is done by a system known as Googlebot. Googlebot does a lot of things. For example, we follow outlinks to discover new content, we predict duplicate content behind different URLs so we can save crawl bandwidth, and we re-crawl URLs to keep our index fresh. But today I'm going to focus on a few specific themes. First, I will share some of our observations on how web hosting has evolved over the years. Next, I will talk a little bit about robots.txt fetches and also our crawling rate.

This graph shows HTTP server popularity over the years. Popularity here is defined as the percentage of websites that are served by a particular HTTP server. We can see that Apache started out being very popular: about 10 years ago, more than 70% of websites were served by Apache. But this number has been dropping over time, and now Apache serves about 35% of websites. On the other hand, Nginx has become really popular. Ten years ago it was serving very few websites; now it serves about 40% of websites. Microsoft's IIS started at about 22% of sites; now the number is about 15%. There are a few other HTTP servers, but I'm not going into the details. I should note that this kind of data is also externally available: there are external sources you can find on HTTP server market share. Their numbers will differ a little from what we see here, but we find that the general trend is consistent.

This graph shows, from Google's point of view, the relative traffic between HTTP and HTTPS. Ten years ago, almost all of our crawling traffic was HTTP. But in the last several years, HTTPS has really picked up. Today, about 75% of our crawling traffic is HTTPS; HTTP only accounts for about 25% of the traffic.
Now, on the one hand, we see that the web has become much more secure. But on the other hand, we can see there is still quite some room for HTTPS to grow.

This graph shows the average URL download time seen by Googlebot. About 10 years ago, downloading a web page took about 800 milliseconds on average. Today, the number is about 500 milliseconds. Of course, these numbers are specific to Googlebot, but the general trend seems to indicate that networks are getting faster and servers are getting faster. That's good news, because in the same amount of time, Googlebot can download more web pages.

Now, let's switch gears and talk a little bit about robots.txt, which has been mentioned in multiple previous talks. Many of you probably know that robots.txt is a way for webmasters to specify access to their site. Here you see an example robots.txt snippet. This one shows which URLs Googlebot can crawl from this site and which URLs it is not allowed to crawl. Although robots.txt has been used in the industry for more than 25 years, there has never really been a standard for it. This creates issues, because the same robots.txt may be interpreted differently by different search engines. That's why, in the past year, my team and the Webmasters team here at Google worked together, and also worked with Bing, to propose a draft standard to the IETF. Once this becomes a real standard, hopefully there will be no more ambiguity in how robots.txt is interpreted.

Now, when Googlebot is crawling the web, before we crawl any single URL, we have to check it against the robots.txt. In order to do this, we have to fetch the robots.txt itself, and this fetch can fail. When we try to fetch a robots.txt, if we get 200 OK, that's great, because we have the robots.txt and we know how to check whether a URL is allowed.
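To make this per-URL check concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt snippet and the `example.com` URLs below are hypothetical illustrations, not the actual example shown in the talk:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: Googlebot may crawl everything
# except URLs under the /nosearch/ directory.
robots_txt = """\
User-agent: Googlebot
Disallow: /nosearch/
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Before fetching a URL, a crawler checks it against the parsed rules.
print(parser.can_fetch("Googlebot", "https://example.com/page.html"))   # allowed
print(parser.can_fetch("Googlebot", "https://example.com/nosearch/x"))  # disallowed
```

In practice a crawler would first download the site's actual `/robots.txt` (for example with `RobotFileParser.set_url` and `read`) rather than parse an inline string.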
If we get a 404 Not Found, that's also good, because it means there are no restrictions on crawling the URLs on that site. Sometimes a web server is temporarily overloaded and returns 5xx responses. That's fine as long as the response is transient, because we will retry the fetch later. But if the website returns 5xx every single time, or if for some other reason we cannot fetch the robots.txt at all, then it becomes more serious: we don't have the robots.txt, so we don't know whether a URL is allowed or not. To be safe, we don't crawl any URL from that website.

This graph shows the distribution of robots.txt fetch statuses on a typical day. We can see that about 69% of the time we get either 200 OK or 404 Not Found. This is great. About 5% of the time we get 5xx. This is fine as long as it's transient. For the remaining 26% of the time, we get some other kind of error. This could be because the site is simply down, or sometimes a website times out when we try to download the robots.txt, or it closes the connection before sending anything back. These cases are bad. So if your website is already returning 200 OK or 404, that's good. If it's returning 5xx, we recommend that you only do so temporarily, when the server is overloaded. For the other cases, we recommend that webmasters either create a robots.txt, if you want to control access to different parts of your site, or simply return 404. This will make the life of a crawler much simpler.

Lastly, let's talk a little bit about Google's crawling rate. When Googlebot is crawling the internet, we try very hard not to overload your server. That's why we maintain a crawl rate for every website out there, and we have sophisticated ways to adjust our crawl rate to meet both our internal demand and the server's capacity. But our crawl rate is not perfect, and the notion of being overloaded can be different for different websites.
That's why Search Console has a crawl rate tool. You can go there and set a crawl rate limit that caps how fast we crawl. There is a Webmasters Help article explaining how you can do this. But we should caution that setting a custom crawl rate does not trigger more crawling. Sometimes webmasters have this misunderstanding: they want Googlebot to crawl more, so they set a higher number. This does not work. It will not trigger Googlebot to crawl more URLs, because the actual crawl rate is determined by many factors. In fact, setting a custom crawl rate may accidentally reduce your crawl. Recently, a news website in an Asian country contacted one of our Webmaster team members at midnight to complain about a drop in their crawl rate. It turned out they had set a custom crawl rate a few days earlier. For these kinds of reasons, our recommendation on Googlebot's crawl rate is: please leave it to Google, unless your website is being overloaded.

That's all I have today. I hope you found this interesting. If you have any questions, feel free to reach us through the Webmaster forums. Thank you.