This is George Gilbert, we're on the ground at Spark Summit 2015. We are with Sri Ambati, who's CEO and founder of H2O. They are one of the premier machine learning vendors. Sri, we've heard a lot about different companies offering machine learning services and algorithms. Tell us what the different design centers are and where you fit.

H2O is the fastest scalable machine learning on top of Spark. And with Sparkling Water, what we've done is bring machine learning and data science to the developers in the data science world. Basically, software is now beginning to encompass even data science. And as we bring powerful data science and machine learning tools into software, we're able to build smarter applications. So the entire internet, which is being rebuilt, will be built with smarter applications that are more AI-based, that can take data, learn from data, and improve on the business logic on the fly. And that kind of real-time aspect is what we bring to Spark and into Spark Streaming. And so we can build these fast, scalable models and deploy those models as scoring engines into the end customer's applications.

Okay, so to clarify, are you making the use of machine learning algorithms simpler for developers who want to embed them in their applications, or are they more scalable and fast once they've been operationalized and integrated into the applications that are going to run?

So it's a combination, and it's a very good question. First we started with the training of the models. Data scientists train on large data sets, and historically that has been encumbered by doing a lot of sampling and trying to find the right dimensions in your data, which you don't need to do with us. You scale across many dimensions, across large numbers of rows, across multiple machines, and still do it in seconds, not hours and days. So no sampling.
So you can then throw a lot of data into H2O, and out come a lot of interesting models and interesting characteristics of your data. Now the second piece is that people spend a lot of time cleaning data, in feature engineering, in cleansing, and trying to build the right joins and curate the right tables. I think that's where we work very closely with our own libraries as well as pipelines in Spark. So customers can build Spark pipelines and then push really cleaner data into H2O or use MLlib, which is a local library; H2O is a fast, scalable, enterprise-grade library for machine learning. And once the models are built, they then produce a nano-fast scoring engine. In H2O, your R-driven or notebook-driven modeling experience gets transformed into a real Java scoring engine that can be deployed into production inside the applications, either as code or as a REST API. So we speed up both sides by rewriting the guts of the core implementation.

So to be clear, you'll work on the large, sort of unwashed data and extract structure, features, and characteristics from that. Then that goes through a Spark Streaming-based pipeline, which could have Spark ML on it or could have more of your algorithms. And out of that comes a production-ready model.

Absolutely. And so it's the end-to-end: how do you get a comprehensive machine learning platform, a comprehensive data science platform? That's what's at stake, because eventually data is everywhere; it's table stakes. People are collecting lots of data. What you do with the data is what H2O is focused on, and it's what Spark Summit has been focused on. And as bringers of open source machine learning libraries and open source containers for smarter applications, we have come together to build a very comprehensive platform. It's called Sparkling Water.
So Sparkling Water has been in the works for the last year and a half, and over the last six months it has gone into production in some places. A lot of new contributions are happening there, where you can take it and build applications, whether it's fighting crime with deep learning or using deep learning to apply much more accurate models in real time. So Sparkling Water is kind of the best of breed of open source that's happening in big data.

Okay.

Sparkling Water, and they're finally together in ways that customers, developers, and data scientists can use.

Okay. Great. George Gilbert on the ground at Spark Summit 2015, to be continued with Sri Ambati at a future date, where we will dive deeper into this most interesting technology.