Jeff Frick here. We're on the ground at Spark Summit 2015 in San Francisco. Spark is the hottest technology going right now. There's a ton of people here, a ton of good vibes. I think it's the third Spark Summit, and we wanted to come up and give you a taste. So we're joined here on our next segment by Sachin Guy. Did I get it right?

Yep.

All right. He's the Spark Practice and Innovation Labs leader at Impetus, so welcome.

Thank you. Good to be here.

Absolutely. So Impetus was just on the Cube at Hadoop Summit last week. What's the big delta between Spark and Hadoop? Are they competitive? Are they complementary? What's your take?

So it's quite an exciting and quite dynamic space right now. Hadoop has been here for quite a few years, and Spark has been bringing the unified platform advantage as well as the in-memory processing advantage. So a lot of customer interest has been generated around the Spark space right now. And we have been doing quite a few things here in Spark. We are doing research and development, building a deep machine learning engine on top of Spark. We have been building IPs. So one of the products we are demonstrating here, StreamAnalytix, has a Spark streaming engine that we are building on top of that. We are also doing a lot of technology exploration, because Spark is quite a dynamic space. It keeps on innovating. So there's DataFrames, which came in March. Now there's ML pipelines. So we keep exploring what is happening in the open source space. Also we are doing a lot of consulting around the space. So quite a busy thing.

Busy time.

Yeah. So we were at Hadoop Summit talking about the big data warehouse. Right now we are trying to build a big data warehouse with Spark capabilities for the customer.

Talk about some of the customer experiences that, you know, you are out in the field, you said you are doing some exploration.
What are some of the applications that you are seeing that people can do now that they couldn't do before?

So one of the things Hadoop was primarily famous for earlier was handling a lot of data volume. There could be a lot of variety of data. But the velocity part, at the time of consuming the data in applications, was missing. The in-memory processing was missing. So Spark brings that advantage across. And it opens a lot of floodgates for applications for the customer. So they want to do a lot of SQL type of exploration on this. They want to do machine learning. They want to do graph processing, advanced data science kind of operations. And of course the streaming part of it. Streaming is pretty hard. So all of this, if they want to do it, Spark offers that unified advantage. You don't have to have five kinds of tools to do that. You can do it with one platform. That brings a lot of flexibility and a unified advantage to the customer. So most of our customers actually start off with some kind of POC on why they would want to use Spark versus what they were doing in a Hadoop environment, a Hive or an Impala kind of environment. They usually start with that analysis. But then when they discover Spark, they actually feel that unified advantage. And that's what we are building our products on. That's where we are building our streaming engine. We are building our machine learning pipelines on that. And the use cases which have been coming from telecom customers and financial customers are centering around things like: could we do fraud detection? Could we handle the machine data coming in? So recently one of our manufacturing customers actually came up with a requirement to set up an entire data hub. And a lot of those data hub components would be Spark.

And are they taking out some of that other technology, or is this all new workloads that they're building apps on Spark?
So there are two kinds of use cases coming up right now. One is of course migrating the old workloads. So the workloads they could probably do in a database kind of environment, whether they can do them in a Spark kind of environment now.

Okay.

The second kind is the new workloads. The new workloads are the advanced data science algorithms they would like to implement. It's deep learning. So deep learning is a very exciting space to be in right now.

Right.

And we have built a huge library of algorithms around it. This is something which we could not do with the traditional tools. You were constrained in the kind of data that you could handle. And with Hadoop also, people were constrained in terms of processing power. So Spark offers that 100x kind of advantage in terms of processing also. So now those deep learning algorithms which we have built actually give you a great advantage in exploring those advanced data science use cases.

Exciting times, huh?

Lot of things coming up, lot of queries from the customer, lot of products that can be built in this space. So that's why the research team in Impetus Innovation Labs is actually trying to be at the cutting edge of the cutting edge.

The cutting edge of the cutting edge. Careful. It's dangerous out there on the cutting edge of the cutting edge.

It is kind of exciting more than dangerous.

Okay, good. All right. Excellent. Well, Sachin, thanks for stopping by. Got to get in the hook here, but good times and sounds like you said you're on the cutting edge of the cutting edge. So I'm sure you'll have a lot more fun than danger, right?

Sure. It's going to be fun. And it's going to be a lot of good return on investment that we are going to bring to the customers over here.

Awesome. Well, thanks again for stopping by. I'm Jeff Frick. We're on the ground at Spark Summit 2015. You're watching the Cube. Thanks for watching.