This is George Gilbert. We're on the ground at Spark Summit 2015. We're with Kumar Sreekanti, who's CEO of BlueData. Kumar, good to have you.

Thank you.

So what BlueData does, in a nutshell, is offer Hadoop, and not just Hadoop but big data generally, as a service on your premises. It combines the best of the cloud and on-premises. So why don't you tell us a little more about how it works?

What we realized is that there's a lot of innovation in big data, but very little around the infrastructure space. So we're building infrastructure software that gives you a cloud-like experience around big data applications. We are focusing on Hadoop and Spark, those being among the most requested applications from customers. You can configure different versions of Hadoop, unmodified, running on the same hardware. You can also run Spark natively on your premises.

Okay. We know from innumerable customers and their struggles with the operational side of Hadoop, and potentially Spark, that there are a lot of knobs to tune. How do you fix that?

Hadoop has many, many configuration options, and to bring Hadoop and Spark to the mainstream, we make all of those configuration options available in BlueData. But BlueData also comes with the pre-packaged distributions that are available in open source. You can create a cluster with literally five mouse clicks and run your jobs. I believe in what is called fail-fast experimentation: you can literally create a 100-node cluster in 10 minutes, run your experiments, then go back and modify the parameters. We preserve everything. And there is an advanced option for people to go in and set whatever parameters they need.

Okay, so let's start on the Hadoop side: which distros do you work with?

Today we support Cloudera and Hortonworks, all versions of Cloudera, all versions of Hortonworks. We're also working with other vendors, including ODP, BigInsights, and other versions.
For us, the way we architected it, it's very simple to add distributions. We also support open-source Spark, whatever the latest version is, 1.1, 1.4 I think, if I remember.

Okay. Now with Spark, you were mentioning to me earlier, there's a new way of managing the underlying hardware, the resources. Spark has Mesos, it can run on YARN, and it has its own sort of new file system that's memory-oriented, Tachyon. Tell us how you work with those.

Spark can run inside Hadoop with YARN as the scheduler, and that is one of the ways the existing Hadoop vendors are incorporating Spark. We actually run Spark natively, because Spark does not need any of the Hadoop baggage per se, though it supports HDFS as one of the options. One of the things we built into BlueData is support for all the different file systems: we separate compute from storage, so we support not only HDFS but also NFS, object stores, and other data sources. Spark can run natively, and we provide all the data that is required for Spark to run. And Spark can run independent of Hadoop, on the same hardware, at the same time, with the BlueData software.

Just to be clear, when you say it runs independent, that means multi-tenant, where some of the CPU and memory resources are dedicated to Hadoop jobs and Hadoop infrastructure, and some are dedicated to Spark.

Very well said, exactly. For us the currency is vCPUs.

To be clear, vCPUs: virtual CPUs.

Yes. So for example, let's say you have a 100-node physical cluster, and each node has 16 cores. That gives you 1,600 cores. You can take them and divide them up and say: 400 cores go to run Hadoop cluster one, which is running Hadoop 2.0 from Hortonworks. We can allocate another 400 cores to Spark, and they run independently while we manage the resources underneath.
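The vCPU bookkeeping described above can be sketched in a few lines of Python. This is purely illustrative, assuming the numbers from the interview (100 nodes, 16 cores each); the names `allocate` and `allocations` are hypothetical, not the BlueData API.

```python
# Illustrative sketch of carving a physical cluster's cores into
# per-tenant vCPU pools, as described in the interview.
TOTAL_NODES = 100       # physical nodes in the example
CORES_PER_NODE = 16     # cores per node
total_vcpus = TOTAL_NODES * CORES_PER_NODE  # 1,600 vCPUs

allocations = {}  # logical cluster name -> reserved vCPUs

def allocate(cluster_name, vcpus):
    """Reserve vCPUs for a logical cluster if capacity allows."""
    used = sum(allocations.values())
    if used + vcpus > total_vcpus:
        raise ValueError("not enough free vCPUs")
    allocations[cluster_name] = vcpus

# 400 vCPUs for a Hortonworks Hadoop 2.0 cluster, and 400 for a
# native Spark cluster, running side by side on the same hardware.
allocate("hadoop-hdp-2.0", 400)
allocate("spark-native", 400)

free = total_vcpus - sum(allocations.values())
print(free)  # 800 vCPUs left for other tenants
```

The point of the model is that the two clusters draw from one shared pool, so neither tenant needs dedicated physical machines.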
You can actually pause clusters, and you can move cluster resources around. All our compute clusters are stateless, and they can be saved because they run on virtual machines.

Okay, so to be clear then, would it be fair to compare you to something like a VMware for Hadoop and Spark, except that VMware is too heavy, where you don't want an operating system on every core or every server?

Exactly. And we just announced at Hadoop Summit last week that we now have BlueData software based on containers. So we support both containers and virtual machines. We want to give that choice to the users, but the user experience will be exactly the same irrespective of whether it is on containers or on virtual machines. And just to close your previous question: yes, think of us as an Amazon EMR-like experience on premises, or like a VMware for big data. What VMware has done for traditional applications, we are doing for big data.

Okay, then just the last question: simplifying deployment and operations is a big deal. Is there a way to measure the savings, in terms of admins per node or admins per 100-node cluster?

We have done TCO analysis. It's very difficult to do a reasonable analysis on the people side, so we actually did the analysis on capex, and we have seen as much as 70% savings with BlueData. What we have not done is operational expense, because it is very subjective; certain people say they have 100 people, certain people don't. But by reducing the number of nodes, the number of switches, and all the hardware required, and by making it very easy to use, you can reduce operating expenses very easily, and we can actually demonstrate that.

Okay. Kumar, just wanted to say thank you. We're on the ground at Spark Summit 2015. This is George Gilbert, and we will see you later.