Hadoop

Uploaded on Mar 26, 2011

http://www.nearinfinity.com

Scott Leberknight presents on Hadoop.

Hadoop is an open source framework, maintained by the Apache Software Foundation, for building fault-tolerant, distributed applications that process vast amounts of data in parallel across a cluster of commodity servers. Hadoop consists of two primary components: the Hadoop Distributed Filesystem (HDFS) and a MapReduce framework. HDFS is a distributed filesystem that efficiently stores very large files across a cluster in a fault-tolerant manner. MapReduce is a framework that divides data processing into two distinct phases, mapping and reducing, so that a problem can be decomposed and run in parallel across many machines, speeding up data transformation and aggregation. In this talk we'll look at both HDFS and the MapReduce framework. We'll also look at one specific Hadoop subproject, Hive, which provides a data warehousing capability on top of Hadoop and lets developers and analysts query their data stored in HDFS using SQL-like queries.
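To make the two phases concrete, here is a minimal single-process sketch of the MapReduce idea in plain Python, using word counting as the example problem. It is not Hadoop's API and is not taken from the talk: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the grouping step that a real cluster performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (done by the framework between phases)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reducer: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an iterable of lines."""
    # Each line is mapped independently, which is what allows the map
    # phase to be spread over many machines in a real cluster.
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    # Each key is reduced independently, so reducers can also run in parallel.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["the quick fox", "the lazy dog", "The fox"]
    print(word_count(sample))
```

Because no mapper call depends on another line and no reducer call depends on another key, both phases can be partitioned freely, which is the property that lets a framework such as Hadoop distribute the work.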

All Comments (5)

  • Would have been nice if you could batch some of the poor voice spots. But nice presentation!

  • Awesome video, loved the presentation and the ease with which it's presented.

  • Tough crowd. Great presentation.

  • thank you for this video. it is very informative.

  • Very good, concise, helpful, super.
