Nick Pentreath - Large Scale Data Processing

Published on Apr 21, 2015

This talk will give an overview of the MapReduce approach to large-scale data processing popularised by Apache Hadoop, and then move on to introduce the Apache Spark project and the basics of working with Spark. Apache Spark is a fast and general engine for large-scale, distributed data processing. It offers high-level APIs in Java, Scala and Python as well as a rich set of libraries including a SQL engine, stream processing, machine learning, and graph analytics. Spark is currently one of the most exciting and fastest-growing Apache open source projects.

Nick is a cofounder of Graphflow, a big data and machine learning company focused on recommendations and customer intelligence. Nick has a background in financial markets, machine learning and software development. He has worked at Goldman Sachs and as a research scientist at online ad targeting startup Cognitive Match in London, and led the Data Science and Analytics team at Mxit, Africa’s largest social network. He is passionate about combining commercial focus with machine learning and cutting-edge technology to build intelligent systems that learn from data to add value to the bottom line.
