Uncrawled URLs in search results

31,246

Uploaded by on Oct 5, 2009

Matt Cutts explains why a page that is disallowed in robots.txt may still appear in Google's search results.

Category:

Science & Technology

License:

Standard YouTube License

All Comments (15)

  • But what if we remove the page with the URL removal tool, and people keep linking to those specific URLs afterwards?

  • This one is very informative. I am learning a lot.

  • @RickettsFish:

    Combining crawling with indexing / serving directives

    Robots meta tags and X-Robots-Tag headers are discovered when a URL is crawled. If a page is disallowed from crawling through the robots.txt file, then any information about indexing or serving directives will not be found and will therefore be ignored. If indexing or serving directives must be followed, the URLs containing those directives cannot be disallowed from crawling.

  • @RickettsFish

    Then, this is my guess: if it's the case that many other websites link to your blocked website, Google may have to do some very basic crawling of your website, just to decide (via presence of "noindex" meta) if it should be listed or not on SERPs.

    But then, I'm just guessing.

  • @RickettsFish Good question, seems like a catch-22 situation.

    As I understand it, if you use robots.txt to block crawlers, your website won't get crawled. And it *may* not get indexed at all, unless other websites link to it. In that case, Google will show it in its index. So far, that's what Matt clearly explained in this video.

  • This was very helpful, but I have a question. If we use noindex, you made it sound like we need to allow Google to crawl the page by not blocking it via robots.txt. Otherwise, if we blocked it, Googlebot wouldn't read the noindex tag. Is that right?

  • Also, sites blocked by robots.txt don't have a cached version.

  • Thanks for this tip. Does it still apply today?

  • That tip is so good for me. Thanks, Matt! :)

  • I think it's better to keep it that way, Matt, since many bloggers can 'ferry' that anchor text, as you said, especially in your Nissan example. Many small entrepreneurs (vendors) related to the industry are able to benefit from it. Like mine, Nissan Impul.
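
Several comments above ask whether a noindex tag can work on a page that robots.txt blocks. The quoted guidance says it cannot, because the tag is only discovered when the URL is crawled. Here is a minimal sketch of that check using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the helper name can_see_noindex are all hypothetical and chosen only for illustration. It models a crawler that obeys robots.txt, not Google's actual implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_see_noindex(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url.

    Only a fetched page can reveal a robots meta tag or an X-Robots-Tag
    header, so a disallowed URL can never communicate "noindex".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one under the disallowed path, one outside it.
blocked = can_see_noindex(
    ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"
)
allowed = can_see_noindex(
    ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html"
)

print(blocked)  # False: never fetched, so any noindex on it goes unseen
print(allowed)  # True: fetched, so a noindex on it can be honored
```

Under these assumptions, the way to keep the first page out of the results is to drop its Disallow rule and put the noindex directive on the page itself. If the rule stays, the URL can still be listed without being crawled when other sites link to it, as the video explains.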
