Sparkling Pandas - using Apache Spark to scale Pandas - Holden Karau and Juliet Hougland

7,205 views

Published on Sep 16, 2014

Pandas is a fast and expressive library for data analysis, but it doesn’t naturally scale to datasets larger than memory. PySpark, the Python API for Apache Spark, is designed to scale to huge amounts of data but lacks the natural expressiveness of Pandas. We will introduce Sparkling Pandas, a new library that brings together the best features of Pandas and PySpark: expressiveness, speed, and scalability.
