data.bythebay.io: Matt Dowle, Parallel and distributed big joins in H2O

206 views

Published on Jul 15, 2016

Matt has taken the radix join as implemented in R's data.table and parallelized and distributed it in H2O. He will describe how the algorithm works, provide benchmarks and highlight advantages/disadvantages. H2O is open source on GitHub and is accessible from R and Python using the h2o package on CRAN and PyPI.

The Scalæ By the Bay 2016 conference (http://scala.bythebay.io) is held on November 11-13, 2016 at Twitter, San Francisco, to share the best practices in building data pipelines with three tracks:

* Functional and Type-safe Programming
* Reactive Microservices and Streaming Architectures
* Data Pipelines for Machine Learning and AI
