Live from Midtown Manhattan, theCUBE's live coverage of Big Data NYC, a SiliconANGLE Wikibon production, made possible by Hortonworks, we do Hadoop, and WANdisco, Hadoop made invincible. And now your co-hosts, John Furrier and Dave Vellante. Okay, we're back here at Big Data NYC. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. This is day three of wall-to-wall coverage of Big Data NYC, the big event in Manhattan, in the Big Apple, doing big data, Hadoop World and the Strata Conference all here, all the players, all the entrepreneurs, all the CEOs, guys making it happen, building a business. This is SiliconANGLE with the exclusive coverage of Big Data NYC. Our next guest is Jagane Sundar, CTO and VP of Engineering of Big Data at WANdisco. Jagane, great to see you again. And we have SK, Hadoop and big data product management at Pivotal. Two companies that are really lighting it up here, welcome to theCUBE, guys. Two companies that are lighting it up here in New York, obviously the buzz is heavy around WANdisco. You guys have done some great work here in New York City. And Pivotal's Pivotal, an amazing customer event, which unfortunately we missed because we were doing interviews up until like eight o'clock on the Monday night. But you had a great event at the New York Stock Exchange, heard some great reviews on the floor with customers, they're a customer of yours. So first of all, guys, let's talk about Pivotal and WANdisco, talk about the relationship and talk about what you guys are working on together. Thank you, John. We have a deep partnership. And as you know already, we've switched away from the distribution of Hadoop business to the Non-Stop Hadoop business, where we build a disaster recovery and high availability solution for all components of the Hadoop ecosystem on top of existing distributions. And one of the distributions of great significance, of course, is the Pivotal distribution.
Our partnership is twofold: one, we run on top of the distribution, and two, their application HAWQ runs on top of our Non-Stop Hadoop infrastructure. And there are great benefits to that, we can go into more detail about that later on in this conversation. Let's talk about Go Pivotal, which is their Twitter handle. I like to say Go Pivotal because they're really changing the game. They spun out from VMware, basically the biggest startup you've seen in a while. A lot of people were like, okay, that's interesting, but Paul Maritz is a great leader, a visionary, knows his chops. What's going on? What are you guys doing in the relationship? And how does that fit into Greenplum, specifically the Hadoop piece? Yes. So what we did is we're kind of changing the landscape of Hadoop with the introduction of HAWQ. We announced it last April, and it's been very well received in the market, right? So with HAWQ, what we did basically is took the MPP database, the Greenplum database, and made it run on top of Hadoop. So with that, you get the industry-leading optimizer and SQL, ANSI SQL compatibility, ANSI SQL 92, 99, 2003, OLAP extensions and stuff like that. So it's been very well received so far in the market. So the customers we talked to, they have a specific problem to solve on top of Hadoop, and HAWQ perfectly fits the bill. The big data platform is the rage. I mean, I would say that there's two big threads here at Hadoop World, Big Data NYC, and that is the data platform, data operating system, all the nuances that go into making that a platform, and then business analytics, advanced analytics, things that are happening on top of it. What specifically do you guys see in that data platform to bring that to the enterprise? Because that's really the conversation that people want to talk about: hey, I have an enterprise. We want to start scaling. 2012 was the year of kick the tires, small POCs. This year was bigger pilots.
Next year is broad-scale adoption. And then that second wave of followers comes in and follows that same cycle. So growth is big, right? So everyone wants to know, what is that platform? What is that enterprise platform? What's available today? What's coming? What gives that non-stop operation for enterprises? They need that availability. Can you guys talk about that? Yeah, absolutely. I mean, if you think about today's enterprise customers, they're thinking about it from different angles. You have storage guys looking at it from the storage angle: how do I reduce spend on storage? So HDFS is a very good storage protocol. And at Pivotal, we believe HDFS is going to be a standard protocol. And it's going to be there forever. On top of it, you need services to make use of the storage that HDFS provides. So that's where HAWQ and other services come into the picture. So once you have your data in HDFS, which is highly available, and you can cheaply build an HDFS cluster, you can run other services, whether it is OLAP services or OLTP services, on top of it. And that's something enterprise customers are looking for today. So what's your take on that? So HDFS, of course, is well proven in the world of internet companies like Yahoo and Facebook. Its scale is well known. Its low cost of storage is well known. The piece that enterprises are looking for, in addition to these features, is continuous availability, for example. And what we- That's the non-stop message, right? Absolutely, that ties in exactly with our message. We provide protection against failure of single nodes within a data center, entire racks within a data center, or entire data centers themselves. Your applications will be non-stop and there will be no interruption to your data availability. That's something that enterprises have time and again asked us for.
That's the difference between internet-scale operations at Yahoo or companies like that, which may do things like analytics once a day, versus some enterprise application use cases where things need to be up all the time. And that's the difference in our product. We provide this non-stop availability. Of course, we have three, five, or seven NameNodes servicing a single Hadoop cluster. By that, we can distribute the load across these three, which has other benefits as well. I mean, it kind of smells and feels like the early days of internetworking, at a larger software scale. Do you guys kind of see that parallel? And try to dissect it for the users out there that aren't in the weeds of the technology, to kind of extract out the value. Because a lot of those enterprises, quite frankly, have Red Hat, they have NetApp drives, they have EMC drives. And then they look at it from storage angles and data coverage. So now this new world is upon them. So simplify it. Is it because MapReduce is now one element? There are now new elements. So what's the big secret aha this year for you guys in this relationship? So for us, when we talk to customers, we kind of hear three themes of use cases. One is the big data lake, where you want to store any kind of data, whether it's structured data, unstructured data, or semi-structured data. Then you think about processing the data that is stored in the infrastructure. Then the other one is big data apps: I want to make use of the data and create apps and plug into my existing enterprise applications. The third one, which is very interesting, is ETL offload. So I'm doing my ETL processing in my traditional EDW systems. Instead of doing my ETL processing in those expensive systems, I want to offload that to the HDFS platform and make use of the services available there. These are some of the major themes we see, and we are driving towards addressing those needs.
So SK, I've got to ask you, EMC, your parent company, is well known for high availability. Why do you need WANdisco? We are working with WANdisco on solving a specific problem in terms of expanding Hadoop's capability, making sure HA is available for large use cases. So I had to ask the question. We love EMC, we've been following EMC at every EMC World since 2010, when we first launched theCUBE, but it's not as easy as saying they have it. So I wanted to ask specifically. Absolutely, we have total respect for EMC as well, and Pivotal. We think it's a spin-off that's going to work really well. They have great technology in the name of SRDF that was invented for making Microsoft Exchange run across short-distance distributed data centers. But it replicates at the block level, at the disk block level. So it's not aware of the file systems that run on top of it, and it's not resilient to long latencies. The problem with that is, if something like Hurricane Sandy hits, you've got both New Jersey data centers and New York data centers taken out. So your disaster recovery solution just failed. Our solution is not dependent on distance. You can separate your data centers by a world's length. You can have a data center in Asia and one in Europe and another one in America. And distribute the load too, it's not just availability. Absolutely, they're all active. That's a big part of it, right? Exactly, they're all active. And to go back to your earlier question, MapReduce is taken for granted now. You need more interesting applications such as HAWQ. The thing that people don't understand is HAWQ puts more load, as any more intensive application would, on the NameNode. Well, we've got a solution for that. You've got three, five or seven NameNodes servicing the Hadoop cluster. So you really can balance the load across that. And that's the case for other technologies that compete with HAWQ as well. And that's not necessarily a bad thing.
More load means more pressure. More pressure points means more volume. So essentially what you're doing is relieving that, you know, the stress, if you will, on the NameNode, and making it less vulnerable. Of course, automated, we talked about that yesterday, about the manual reboots of the NameNode, which is, again, a whole other conversation. Correct. Well, that's great. Well, my observation on EMC is, one, they let their companies do what they want. And then if it comes to a certain scale, they'll buy WANdisco. So, you know, don't settle for a low price. Well, above my pay grade. Of course, I just, you know, I just had to get that in. No, that's how EMC operates. VMware was the same way. VMware, you know, Dave and I talk about this all the time. You know, VMware was left alone. And they did deals with competitors of EMC. So it's like, you know, hey, whatever. And then ultimately, you know, the federation. Well, that's all in the conversation. Okay, so back to the conversation. So you guys have a relationship with NYSE. We've interviewed them on theCUBE before and they have an interesting solution around distance. They have, what's the product they have? They have a- It's Pivotal Data Dispatcher. Yeah, have you guys jointly developed that with NYSE? That's right. How does that fit in? Does this fit into that at all? Yes, so that fits very well into the enterprise market where you have a large amount of data distributed into different clusters of services, right? So now you want to bring the data and provide access to the data to certain users for a certain period of time. So PDD enables that kind of control over the data distributed across the enterprise. And it's been very well received by our customers and we see tremendous interest in using that. Do you sell that or do they sell it? We sell it. But they don't sell it. It's a joint agreement with us. So it's a joint venture? Yeah. But joint sales or not?
It's sold by us through Pivotal. Great. Okay, great. Any other news on the partnership you guys want to share with the folks? No, I think it's a great partnership. And we both have brilliant strategists, Scott Yara on their side and David Richards on our side, driving this. So we expect nothing. Scott's great. He was on theCUBE at our first Strata, which we did three years ago. Great to have him on theCUBE. You guys and Greenplum, especially HAWQ, are doing great. The data platforms are here, right? They're starting to solidify. Can you guys just quickly talk about how you guys are seeing the ground firm up? Where's the ground firming up, where people are starting to really build on? Where are the soft spots that are hardening up in this marketplace? We'll start with you guys. So we see enterprises again trying to use HDFS as a common storage substrate for OLTP and OLAP applications. Recently we announced GemFire XD, which is our in-memory data grid that's running on top of Hadoop. And it uses HDFS for storage. It's essentially an OLTP system on top of Hadoop. So we are kind of expanding our portfolio to address the needs of the enterprise market that's trying to use Hadoop as a common data substrate. So that's where we are going at this point. Guys, thanks for coming. SK, appreciate it. Jagane, great to see you again. VP of engineering, great partnership, great buzz, you guys. Again, congratulations, WANdisco. Really lighting up the show, like I said earlier. And obviously Pivotal, making moves. You guys are well known. Everyone's aware of you, and you're a force to be reckoned with. Congratulations on all the success you guys have had and the new opportunity in front of you guys. Just a great vision, I like that separation. Always been a big fan of when that happened. And again, like Dave and I talked about, when Paul Maritz in 2010 at VMworld, which was our first VMworld with theCUBE, he laid out the architecture now.
Might have two separate companies with the same game and the stack is filling up nicely. Evolution in this business is great. Big data is hot. This is Big Data NYC with WANdisco and Pivotal here inside theCUBE, live from New York City. We'll be right back with our next guest. All right. From Midtown Manhattan, breaking analysis from theCUBE. Going to Disneyland. I mean, these guys are great. I think this is a revolutionary form. Up till a few years ago, I'd never seen this in my entire career. These guys are great interviewers. They're spot on, they're sharp, they're fun to work with. And they just ask great questions. So it's a real pleasure to be on theCUBE. It's really great. theCUBE is a live mobile studio. We bring it to events, and when we say we extract the signal from the noise, what we do is we get the absolute best guests that are at those events, we bring them inside theCUBE and we talk to them, we have a conversation. We really want to make it fun, exciting, but more importantly, extract the data from the guests and extract that metadata and share it with the world. So people can use that information to better themselves, better their companies. More importantly, connect with other people to do more business, to find out more about the technology. And for us, this is the future. I watch many of theCUBE interviews when you're handling other events. And it's the combination of both enjoyable and insightful. And what I like is the interactive banter back and forth, plus the fact that when I think about some of the conversations we have, they're not only deep, they're not only rich, but the audience themselves will really come to benefit from those conversations. When organizations bring theCUBE to an event, it just brings a whole new dimension. It adds a texture of not only independence, but also explodes content from their community into a much, much broader community. We tend to reach about 10 times the audience that's live at an event.
So we're a big data-driven organization. We have a data science team that allows us to see not only what's trending broadly with the public, but what's trending in very specific areas in our specialty in tech. That allows us to vector our analysis and relevance from our research and journalist team into everything that we do as a media company. And really the benefit of theCUBE is a place for conversations, for people to connect with each other and to learn about things. And it's a revolution in media. We look at the technology and the people behind it as tech athletes. Those are the folks making the companies, making the technology, really creating the new value in this modern era. And it's fun, it's exciting, and more importantly, it's very social. theCUBE does an excellent job of taking this very, very broad platform and format and giving visibility to a very broad audience on each of the different key aspects of the technology. And it's a great environment for the broader community who couldn't be here today to have visibility into what we're doing, what each of the tracks are, and what are sort of the core trends that are associated inside of theCUBE, given a very balanced view from multiple dimensions around it. And I think that's invaluable for the community. You always think your view is right until you hear a different perspective. So you're always interested: give me some neutral perspective. Help me see it in a different light, right? And maybe ask a hard question or two that I might not have considered. You know, in that sense, right, that independent voice, the ability to have, you know, sort of an independent, audited perspective of the world, it's always just good. So these guys bring an incredible wealth of knowledge from their own careers. They've been into a lot of different things in the industry. And they're independent, you know, they're able to bring different points of view.
And you know, sometimes they have.