All Comments (46)

  • how can I make google give me an iphone4s ? :((

  • Nice Googlebot! This is very relevant to all users. Well done!

  • If the pages are generated by the search field of the website, then they are linked to somewhat.

  • Lame!

  • "googlebot is very broke and don't have a credit card" lmao best saying ever!

  • Not so new!

  • Googlebot maybe broke, but Google Inc certainly isn't. :)

  • Another case... The user can search for something in your "search box". Find an interesting result, copy the URL and post on Twitter.

  • You guys are good

  • All major browsers like FF, Safari, Chrome, etc. send all URLs surfed to Google. So this whole video is wrong, as he didn't mention the main thing.

  • And what about The Google Toolbar acting as spyware submitting reports of websites visited for indexing purposes?

  • Matt, which are those texts you submit? I have an ecommerce site, and the spider has crawled the XAMPP etc. folders..

  • How come when you do a search and get like 300,000 results, only the first 20-30 links are relevant and the rest have no connection to the search query? Hmmm?

  • @tubester4567 Simply because the search results are sorted by their relevance

  • I've also had the experience that crawlers started visiting my non-linked pages after I had sent an *email with the direct url*

  • But I have seen many search results that had been generated by searching words not just simple drop down.

  • What about the issue raised by @jimboot, where he says that he has pages indexed that are blocked by robots.txt?

  • @NMITYou If you add robots.txt after the listing has occurred, it may be a while before the pages drop out of the index. And many people have errors in their robots.txt

  • That was longer than normal

  • Excuse me Matt, but did you just say that disallowing a directory in robots.txt prevents Google from crawling the content? Normally this does not help at all if there are other external links pointing to URLs in the directory. Only a "noindex" in the header is an absolute guarantee that Google will not index.

    Can you please clarify this?

  • @SEOLEX - Use robots.txt to prevent Google from crawling it. Note that in general, even if a URL is disallowed by robots.txt we may still index the page if we find its URL on another site. However, Google won't index the page if it's blocked in robots.txt and there's an active removal request for the page. Alternatively, you can use a noindex meta tag. When we see this tag on a page, Google will completely drop the page from our search results, even if other pages link to it.

  • @mikegarde I know. But to my ears Matt actually proposed robots.txt exclusion as a solution to the problem. And I find this a wrong answer. Even if excluded by robots.txt, the URLs without any obvious links to them will/might show up in the SERP. With a "noindex" they won't.

  • @SEOLEX If you really don't want content seen, why put it on the web? Robots.txt is not a perfect solution, but Google does respect it - many other spiders do not. Or you could use password protection?

  • @heenan73 I don't know. Please understand that I am not the one asking the question in Matt's video. I just wondered why Matt said that robots.txt will prevent the content from being visible in Google since that is not how robots.txt works ;-)

  • @SEOLEX robots.txt does work as stated; it cannot prevent a site being picked up if it has other links; but the point remains, if you don't want it listed, don't FTP it - and don't get incoming links. Why is everything always Google's fault? YOU manage your content; Google obeys the rules .... you have to as well. Remember the original question was "pages that don't have any links" - in that context, Matt was 100% right - you seem to be rewriting the question.

  • @heenan73 What you say makes no sense to me. I'm not the one asking and not the one wishing to FTP content that should not be in the index... I'm only asking why Matt states that robots.txt can prevent it when in other videos he states the opposite. I think you've got me all wrong?

  • @SEOLEX Read the question: "How is Google finding pages which don't have any links to them?" in THAT situation, robots.txt can work; robots.txt cannot work *if there are other links* - but it CAN work where there are none. Which is what he said.

  • @SEOLEX You're right; however, I believe that when Matt mentioned robots.txt he was more or less referring to URLs Googlebot will discover by filling in forms itself and not from other sites.

  • the sitemap could have those pages added... so Googlebot knows about them

  • This video is indeed quieter than the average.

  • Googlebot is broke from giving all the pagerank to spammers :{

    Matt we need you to fix this!

  • @TechieGeek1

    agreed

  • Thumbs up to support broke Googlebot

  • Can you record these to be a little louder please?!

  • @calebtheredwood Turn up the volume.

  • Don't Google Toolbar and Chrome report visited URLs back to Google for crawling as well?

  • Can we get links to the form submission blog post & paper? Thanks!

  • Very good question, but I really wanted to know from Matt if there are any other ways for Googlebot to discover and index new pages besides link crawling. For example, does Googlebot use the data from Google Chrome browsers too to discover new webpages?

  • @Robbertbiz Right! Or from the Google Toolbar


  • Great video Matt
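
The robots.txt versus "noindex" exchange between @SEOLEX, @mikegarde and @heenan73 can be made concrete with a rough sketch. It only models the two signals a site owner controls: whether robots.txt permits crawling a URL, and whether a page carries a noindex meta tag. It does not model what Google actually does with them. The robots.txt contents, the example.com URLs and the helper names (is_crawl_allowed, has_noindex) are assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows one directory for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _RobotsMetaParser(HTMLParser):
    """Records whether a robots/googlebot meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a noindex robots meta tag."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


if __name__ == "__main__":
    blocked = "http://example.com/private/page.html"
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", blocked))
    print(has_noindex('<meta name="robots" content="noindex">'))
```

One consequence of the distinction, going by the reply to @SEOLEX above: a crawler that obeys the Disallow rule never fetches the page, so it never sees a noindex tag placed on it. That fits the point made in the thread that a disallowed URL can still be indexed when it is linked from another site, while a crawlable page with noindex gets dropped.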
