Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce

549 views

Uploaded on Apr 22, 2010

Ken Krugler, the founder of Bixo Labs, describes the Public Terabyte Dataset project, a large-scale web crawl that uses SimpleDB, Hadoop, Cascading, and Bixo in Amazon's Elastic MapReduce (EMR) cloud.

See the slides from this presentation on the Yahoo! Developer Network's Hadoop Blog: http://bit.ly/dpq67T

Category:

Science & Technology

License:

Standard YouTube License
