H2O's distributed DeepLearning, GradientBoosting and ElasticNet on 8-node EC2 cluster

2,992 views

Published on Jun 12, 2015

Open-Source Machine Learning on Big Data with H2O: Free open-source download at http://h2o.ai/download

We parse a dataset with 116M rows and 31 columns, then build three different models (Elastic Net Logistic Regression, Deep Learning and Gradient Boosting), all in less than 10 minutes. Everything runs distributed across the cluster. The only dependency is Java.

Software: H2O 3.0 (open-source) by H2O.ai http://h2o.ai/
Hardware: 8-node cluster on Amazon EC2, c3.2xlarge, 8 cores (Xeon E5-2680 v2) and 15 GB RAM per node (12 GB for H2O)
Dataset: Airlines 1987-2007, 12 GB CSV, 116M rows, 31 columns, 725 predictors
Models: Elastic Net Logistic Regression, Deep Learning and Gradient Boosting
Presenter: Arno Candel, PhD, Chief Architect, H2O.ai
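
For a rough sense of scale, the figures above can be combined into cluster totals. This is only back-of-the-envelope arithmetic on the numbers listed in this description; the even split of rows across nodes is an assumption for illustration, not something stated here.

```python
# Figures taken from the hardware and dataset lines above.
NODES = 8
CORES_PER_NODE = 8
H2O_GB_PER_NODE = 12
CSV_GB = 12
ROWS = 116_000_000
COLS = 31

total_cores = NODES * CORES_PER_NODE        # cores across the cluster
total_h2o_gb = NODES * H2O_GB_PER_NODE      # memory given to H2O across the cluster
memory_to_csv_ratio = total_h2o_gb / CSV_GB # H2O memory relative to raw CSV size
rows_per_node = ROWS // NODES               # ASSUMPTION: rows spread evenly over nodes
total_cells = ROWS * COLS                   # values parsed from the CSV

print(f"cores: {total_cores}, H2O memory: {total_h2o_gb} GB")
print(f"H2O memory / CSV size: {memory_to_csv_ratio:.0f}x")
print(f"rows per node (if even): {rows_per_node:,}")
print(f"cells parsed: {total_cells:,}")
```

In other words, the cluster offers 64 cores and 96 GB of H2O memory for a 12 GB CSV, which comes to about 3.6 billion parsed values.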

Note: Models are built using the Flow GUI and are not tuned for accuracy. H2O comes with R and Python client packages, as well as native integration with Java and Scala. H2O runs standalone or on top of Hadoop and Spark, and works with HDFS, YARN, Mesos, etc.
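
The video uses the Flow GUI, but a similar workflow could be scripted with the Python client. The following is a minimal sketch, not the code used in the video: it assumes a recent `h2o` Python package and a binary response column, and the function names, file path and hyperparameter values are hypothetical choices for illustration.

```python
def predictor_columns(all_columns, response, ignored=()):
    """Return every column that is neither the response nor explicitly ignored."""
    skip = {response, *ignored}
    return [c for c in all_columns if c not in skip]


def train_three_models(csv_path, response, ignored=()):
    """Parse a CSV and train the three model types named in the description."""
    # Imported lazily so the helper above is usable without h2o installed.
    import h2o
    from h2o.estimators import (
        H2ODeepLearningEstimator,
        H2OGeneralizedLinearEstimator,
        H2OGradientBoostingEstimator,
    )

    h2o.init()  # connects to a running H2O or starts a local one
    frame = h2o.import_file(path=csv_path)        # distributed parse
    frame[response] = frame[response].asfactor()  # treat the response as categorical
    x = predictor_columns(frame.columns, response, ignored)

    # Hyperparameters below are illustrative assumptions, not tuned values.
    models = {
        "elastic_net": H2OGeneralizedLinearEstimator(
            family="binomial", alpha=0.5, lambda_search=True
        ),
        "deep_learning": H2ODeepLearningEstimator(hidden=[50, 50], epochs=1),
        "gbm": H2OGradientBoostingEstimator(ntrees=50, max_depth=5),
    }
    for model in models.values():
        model.train(x=x, y=response, training_frame=frame)
    return models
```

A call might look like `train_three_models("airlines.csv", "IsDepDelayed")`, where both the path and the column name are placeholders to be replaced with the actual file and response column.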

More info at http://h2o.ai. Join the movement: open-source machine learning software from H2O.ai is available on GitHub at https://github.com/h2oai

Do you like this? Check out more talks on open source machine learning software at: http://www.slideshare.net/0xdata
