Loading...

Best Practices for Using Alluxio with Apache Spark - Cheng Chang & Haoyuan Li

915 views

Loading...

Loading...

Transcript

The interactive transcript could not be loaded.

Loading...

Rating is available when the video has been rented.
This feature is not available right now. Please try again later.
Published on Jun 12, 2017

Alluxio, formerly Tachyon, is a memory speed virtual distributed storage system that leverages memory for storing data and accelerating access to data in different storage systems. Many organizations and deployments use Alluxio with Apache Spark, and some of them scale out to over petabytes of data. Alluxio can enable Spark to be even more effective, in both on-premise deployments and public cloud deployments. Alluxio bridges Spark applications with various storage systems and further accelerates data intensive applications. This session will briefly introduce Alluxio and present different ways that Alluxio can help Spark jobs. Get best practices for using Alluxio with Spark, including RDDs and DataFrames, as well as on-premise deployments and public cloud deployments.

Loading...

When autoplay is enabled, a suggested video will automatically play next.

Up next


to add this to Watch Later

Add to

Loading playlists...