Web scraping using Google Docs - Xpath

Uploaded on Aug 6, 2010

Scrape any website using XPath in Google Docs. Web scraping is easy to do in Google Docs spreadsheets using simple XPath statements.

Learn how to make amazing tools with Google docs - easy read and one of the top guides on the web: http://www.distilled.net/blog/distilled/guide-to-google-docs-importxml/

This is a short tutorial; if you need more information, visit W3Schools for XPath syntax: http://www.w3schools.com/css/css_syntax.asp
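
For anyone who wants to try an XPath statement outside a spreadsheet first, here is a minimal sketch in Python. It is not the method shown in the video: the sample page, the `extract` helper, and the example.com URLs are all made up for illustration, and the standard library's ElementTree only understands a small subset of XPath and needs well-formed markup, so real pages may need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (an assumption for illustration only).
SAMPLE_PAGE = """<html>
  <body>
    <h1>Example listing</h1>
    <a href="http://example.com/one">First</a>
    <div><a href="http://example.com/two">Second</a></div>
  </body>
</html>"""


def extract(markup, path, attribute=None):
    """Return attribute values or text for every node matching `path`.

    `path` uses ElementTree's limited XPath subset, so the spreadsheet-style
    "//a" is written as ".//a" here.
    """
    root = ET.fromstring(markup)
    nodes = root.findall(path)
    if attribute is not None:
        # Keep only the nodes that actually carry the requested attribute.
        return [n.get(attribute) for n in nodes if n.get(attribute) is not None]
    # Otherwise collect the visible text inside each matching node.
    return ["".join(n.itertext()).strip() for n in nodes]


links = extract(SAMPLE_PAGE, ".//a", attribute="href")
headings = extract(SAMPLE_PAGE, ".//h1")
print(links)
print(headings)
```

The same idea carries over to the spreadsheet: one path statement selects every matching element on the page, and you get back one value per match.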

Uploader Comments (testresearch099)

  • I'm going to answer some of these questions right here; I'm a bit too lazy to make another video, to be honest.

    @nigel016 Yes. All you need to do is include a different URL. Right now, Google can support up to 50 different URL calls.

    @earley15 You'll need to experiment with the HTML elements. I bet if you try "//p", you'll get most of the text on a page.

    @

    Guys, if you're really stuck, I'll make another video, or you can just ask me questions.

  • @earley15 The next video will be coming soon, I'll show the text portion. Right now, scraping images isn't possible using this platform - will get back to you if anything changes.

    @meenu2511 You're welcome.
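
The "//p" suggestion above can be tried on a small sample before pointing it at a real page. This is only a sketch under assumptions: the sample markup and the `paragraph_text` helper are hypothetical, and Python's ElementTree stands in for the spreadsheet, so the path is written ".//p" and the markup has to be well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup (an assumption, not a page from the video).
SAMPLE_PAGE = """<html><body>
  <p>First paragraph with <b>bold</b> text.</p>
  <div><p>Second paragraph.</p></div>
  <img src="logo.png" />
</body></html>"""


def paragraph_text(markup):
    """Return the text of every <p> element, including nested inline tags."""
    root = ET.fromstring(markup)
    paragraphs = []
    for p in root.findall(".//p"):
        # itertext() walks nested tags such as <b>; split/join tidies whitespace.
        paragraphs.append(" ".join("".join(p.itertext()).split()))
    return paragraphs


paragraphs = paragraph_text(SAMPLE_PAGE)
print(paragraphs)
```

Note that the path picks up paragraphs at any depth, which is why it tends to return most of the body text on a page.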

All Comments (19)

  • Nice video, short but sweet! This could be used for all sorts of things. Personally though, I prefer PHP and preg_match for my scraping needs!

  • Didn't Google have another application that did something very similar to this automatically? I've been racking my brain, but can't remember the name of the app/product.

  • @testresearch099 How can I do this in Excel? Google Docs doesn't support more than 50 XML tags.

  • What character is being used for the "add also"? I?

  • I'm new and need to ask this: what exactly am I doing with this? If I'm new to SEO, what good would this do me? Sorry, I'm trying to figure this out. Thanks.

  • @jolt472 Hey, I'm actually Testresearch099 - I've had this problem as well and haven't managed to solve it. In your case, I've never had a problem before. It would help if you could tell me the URL - send me a message if you like, be happy to *try* and help.

  • Hi,

    I found your video very useful, but I am having a problem where the data on the web page I want to scrape takes a few seconds to load, so the script isn't able to capture that data.

    I'm wondering if there is a delay() or sleep() function that can be used to load the page and then delay the data scrape for approx. 5 seconds?

    Thanks!

  • Anyone figured out how to scrape meta data using GDocs?

  • @testresearch099 Thanks for getting back to me and for taking the time to put the videos together. It makes computing a lot simpler seeing it in a movie rather than reading text. Many thanks.
