Integrating Hadoop Batch Data Processing into your App





Published on Sep 15, 2012

Some data is better generated in batch, for example: user session data, ad click-through rates, sales conversions, and recommendations.

Unfortunately, to generate these useful pieces of information you have to process gigabytes, or maybe even terabytes, of log and database data files.

Enter Hadoop MapReduce, a linearly scalable way to process vast amounts of data with relatively trivial code.
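To give a feel for how small that code can be, here is a minimal sketch of the map and reduce steps for one of the examples above, ad click-through rates, written in plain Ruby and run in-process rather than on a cluster. The log format (`<ad_id>`, a tab, then `impression` or `click`) and the names `map_line`, `reduce_ad`, and `run_job` are assumptions made up for illustration; they are not taken from the talk or from any Hadoop API.

```ruby
# Assumed (hypothetical) log line format: "<ad_id>\t<event>",
# where event is either "impression" or "click".

# Map step: turn one log line into a [key, value] pair.
# The value is a two-element count: [impressions, clicks].
# Lines that do not match the assumed format are skipped (nil).
def map_line(line)
  ad_id, event = line.chomp.split("\t")
  return nil if ad_id.nil? || event.nil?

  case event
  when "impression" then [ad_id, [1, 0]]
  when "click"      then [ad_id, [0, 1]]
  end
end

# Reduce step: receive every value emitted for one ad and
# collapse them into a single click-through rate.
def reduce_ad(ad_id, values)
  impressions = values.sum { |v| v[0] }
  clicks      = values.sum { |v| v[1] }
  ctr = impressions.zero? ? 0.0 : clicks.to_f / impressions
  [ad_id, ctr]
end

# Local stand-in for the framework: map every line, group the
# pairs by key (the "shuffle"), then reduce each group.
def run_job(lines)
  pairs   = lines.map { |line| map_line(line) }.compact
  grouped = pairs.group_by(&:first)
  grouped.map { |ad_id, kvs| reduce_ad(ad_id, kvs.map(&:last)) }.to_h
end
```

Calling `run_job` with an array of log lines returns a hash from ad id to click-through rate. On a real cluster the grouping step is done by the framework across many machines, which is where the scalability comes from; only the two small functions are yours to write.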

Getting started with Hadoop can be a total time sink. To prevent this, I'll try my best to lead you in the right direction and go over the following points:

1) MapReduce and Hadoop 101
2) Getting Data into Hadoop
3) Running a MapReduce job in Ruby (!)
4) An overview of Ruby MR helper frameworks
5) Using SQL for ad-hoc data queries via Hive
6) Integrating these jobs into your Ruby app framework (scheduling, starting jobs, getting the data out again)

Plus the bonus:
7) How to use this setup to automate A/B testing and totally blow your mind.
