Live from the Fairmont Hotel in San Jose, California, it's theCUBE at Big Data SV 2015.

Okay, welcome back everyone. You are watching theCUBE live in Silicon Valley. This is Big Data SV, our event where we are covering all the action in the big data space. The Strata + Hadoop World conference is going on in conjunction with Big Data SV, and we're pleased to be here. I'm John Furrier, the founder of SiliconANGLE, joined by my co-host Jeff Kelly, chief data analyst at Wikibon.org. Our next guest is Tendu Yogurtcu, the GM of Big Data for Syncsort. Welcome to theCUBE.

Thank you for having me.

So Syncsort, you guys have been on a bunch of times and we love talking with you. One, the team's fantastic, but two, you're in an area that is constantly in the action, almost out of the action: the mainframe business. It just doesn't go away. Every big bank pretty much runs a mainframe these days, and IBM is certainly kicking back up the mainframe mojo with the System z announcement, which theCUBE covered in New York City at Jazz at Lincoln Center, bringing big iron compute back to the foreground. So if compute is almost free and storage is almost free, you've got to put it somewhere, and you need more power. So we're seeing a movement back to the mainframe, where really big systems need to do the work. With Linux, you certainly have DB2 and a bunch of other software on the IBM mainframes that they sell, but in reality Linux is now supported, and there are multiple silicon innovations going on with IBM and others. So what's Syncsort doing there to stay in the mix, to help those customers who have mainframes and the ones that may want to buy new mainframes? Like I said, big financial institutions are buying them; Jeff Kelly's research points to that. And certainly in big data, you need to move the data around. So what's going on with you guys?

When you look at the mainframe market, there are two different workloads you see.
One is the transactional workloads, which are very critical and continue to run on the mainframe. You can relate IBM's z13 announcement to that, because those critical workloads are continuing to run on the mainframe. And then you have the batch workloads that have historically been created on the mainframe and that are becoming more and more of a cost for the organization, because they are priced by MIPS. Even though Linux on the mainframe creates some efficiencies, it's still money, still cost for the organization. So we see a shift, because Hadoop is so disruptive: it really creates scalable, affordable storage and an affordable data processing platform. We see those workloads shifting off the mainframe. And we are not at a disadvantage like some of the mainframe-only players you mentioned yesterday; we don't have a business model built just for the mainframe. We have a portfolio of products covering the Unix and Linux world, and our big data product line with Hadoop and cloud. So we have a very good opportunity to fill the gap between big iron and big data, and our focus has been on those workloads: moving the batch workloads from mainframe to big data.
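Moving mainframe batch data into Hadoop typically starts with translating EBCDIC-encoded, fixed-width records into text a cluster can process. A minimal Python sketch of that first step (the record layout here is hypothetical; real offload tools such as Syncsort's handle COBOL copybooks, packed decimals, and far more):

```python
# Hypothetical fixed-width mainframe record: 10-byte account id, 20-byte name.
# Python's built-in "cp037" codec is EBCDIC code page 037.
record = "0012345678JOHN DOE            ".encode("cp037")

def decode_record(raw: bytes) -> dict:
    """Decode an EBCDIC (code page 037) fixed-width record into text fields."""
    text = raw.decode("cp037")
    return {
        "account_id": text[:10],
        "name": text[10:30].rstrip(),  # trim the fixed-width padding
    }

print(decode_record(record))
```

The same idea scales out: each mapper in the cluster decodes a slice of the file, which is why the encoding translation is usually the very first stage of an offload pipeline.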
So explain to the folks out there, Tendu: the mainframe has been this industry that's been carried along from the legacy of the glass house, the big iron IBM days you just mentioned. Now there are new things going on that have mainframe-like characteristics, and so much mainframe legacy in the software components. But the notion of a mainframe, and some say the cloud is the mainframe now, which is an argument we always have and we think is true, you distribute computing in the cloud, Amazon is the mainframe, you can look at it any way you want. That being said, if I'm a big bank, for instance, and I'm running full trading desks, I've got to have a modern mainframe as well as support the legacy. So what is the difference between the old and the new? Focus on what the packaging is with this new mainframe. You mentioned Linux, you mentioned Hadoop, you mentioned batch; these are all concepts from the mainframe. What concepts carry over into the new modern era, and what does it look like?

We launched our Hadoop big data product in the second half of 2013, and in 2014 we were in production deployment in very, very large enterprises globally. What we have seen from our customers, and the customers have been a really great source of innovation for us, is that the use cases being discussed span a spectrum: from operational efficiency, where organizations are trying to sustain or reduce the cost of the existing traditional data warehouse architecture, to the other end, where you have new go-to-market opportunities with new data-driven, more transformative applications. That spectrum is there. However, we see pretty much everybody, 80% of our customers, starting from the operational efficiency end, because it's the low-hanging fruit. Having said that, low-hanging fruit doesn't mean it's an easy job to do.
There are several challenges around that, because once you are trying to offload the data warehouse workload, you have to prove the value, that it is going to help sustain or reduce the cost. That's really the main driver for that use case. So the challenges we have seen are, number one: I don't even know what the existing workloads are exactly doing. I don't understand the business logic, because the person who wrote those SQL scripts left a while ago. Nobody understands the 7,000 lines of BTEQ script that somebody wrote. It would be great if somebody could understand that and recreate it on the Hadoop platform. So that's the number one challenge. The second challenge is that in 2014, that first proof-of-value project going into production happened, and it created an opportunity to budget for 2015, where they can grow the cluster and serve more lines of business moving forward. That's really critical. However, there is such a rapidly evolving, rapidly improving stack of technologies. You see additional projects in the Apache Hadoop ecosystem appearing every two or three months, which is great, because that optimization has to happen for the platform to be stable. However, it's challenging for the organizations. So we started seeing our customers getting concerned: look, I'm just trying to get this production deployment stabilized and running on Hadoop MapReduce, and now everybody is talking about Tez. Tell me which one is going to win, and how I should future-proof my processes. So that's the number two challenge. And that creates a big opportunity for companies like us. We have been a big believer in the open source Apache Hadoop project and we have been contributing to it, partnering with Cloudera, Hortonworks, MapR, all the major Hadoop vendors.
And it's an opportunity for us because all the Hadoop vendors had to focus on stabilizing the platform, on making it a real platform: from a MapReduce batch processing framework on a distributed file system to a platform for big data where multiple workloads can run. Their focus has been there. It's an opportunity for us. Last week we launched our new big data edition, and it really targets those two problems, with an execution layer that completely decouples the user experience, how the data pipeline is created, from the compute framework. It's designed with that flexibility. The execution layer decides what should be created as a map job, what should be created as a reduce job, and what should run on an edge node or in the cluster. Whether you are running it on your laptop on Windows, Unix, or Linux, or on Hadoop today and on Spark tomorrow, is completely abstracted from the user. So future-proofing that environment is very exciting for us.

Do you think that's been holding back the market a lot? Is it confusion and concern about, I don't know where this is going to go? As you mentioned, the innovation is happening so fast in the Hadoop community, which is great, but it's almost ironic that the level of innovation is actually stalling the market a little bit, because people are now saying, well, I don't know what's going to be next, and I don't want to build out an application or a platform that's going to be obsolete because I didn't build it in a way that allows it to adapt to these new things coming down the line, things I have no idea about now but that could be very important just two or three months from now.

I wouldn't say it's stalling the market, because I think we see the adoption increasing. Certainly, perhaps the adoption is not as fast as we would like.

Because the value is still there.
Exactly, the value is still there, and I think the major challenge is really finding the skill sets and having a proof-of-value project that the internal stakeholders can buy into. That has been the common challenge, and we do work with our customers to help with those projects. That's why companies offering end-to-end solutions for those use cases are critical for accelerating Hadoop adoption. We have this product, SILQ. It's a tool for visualizing SQL scripts, the legacy workloads, and a new release of it can automatically generate jobs that offload that workload to Hadoop. Those kinds of end-to-end solutions are critical for that adoption.

So talk a little bit about Syncsort's strategy to grow the big data part of the business. You made some acquisitions; you acquired William Data Systems recently. Talk about the role of acquisitions and organic growth, as well as the partnerships, which you touched on and which are also critical. What's your strategy to grow this business?

Our strategy is basically to continue investing, and investing on many fronts. First, continue investing in the open source project, making it enterprise-ready and partnering with the open source Hadoop vendors; we are a big believer in the Apache project and will continue investing in it. The second investment is organic innovation: for example, the DMX-h Hadoop product, the launch of the SILQ project, and the mainframe offload initiative we just launched with Cloudera and Cognizant. And the third investment is around acquisitions, and the William Data Systems acquisition is very complementary to that. We ultimately want to make all data accessible in the new big data platform, the Hadoop platform. All data includes social feeds like tweets in JSON. It includes clickstream data in web logs.
It includes legacy data from relational databases, whether on Unix systems or on the mainframe. William Data Systems is a great example of that, because they have network telemetry data and security data on mainframes. If you are re-architecting the mainframe data warehouse on the Hadoop platform, this data is critical, because you see common use cases like customer churn analysis and platform security. Having that security and network telemetry data available on Hadoop, or available as part of our Splunk partnership, feeding into Splunk, is very critical and complementary to that. Make data accessible, and also offer end-to-end solutions so users can future-proof their big data processes for emerging compute frameworks.

So we talked a little bit about needing to take a platform approach, and I've got to ask you about the big news this week: the announcement of the Open Data Platform. There are pretty strong opinions on both sides, pro and con. Where do you come down on that? Does it need a new consortium, such as the Open Data Platform, to standardize around core Hadoop? Or do you think the open source community can handle that job just fine?

There is already an open source Hadoop: the Apache Hadoop project. However, I think it's too early to comment, because we have yet to see what will come out of it. The claim is to accelerate Hadoop adoption, and if it helps with that, great. Otherwise it just kind of creates another one. As long as it does not fork from Apache Hadoop and does not become an obstacle to the innovation that happens in the Apache Hadoop project, that's fine. I would like to see a little bit more in terms of what kinds of initiatives they are going to bring. Hortonworks, a couple of weeks ago, announced that they will be partnering with customers and having customers get involved with data governance, for example.
And data governance is certainly a challenge, especially in financial services or healthcare and life sciences. So if an initiative comes out of the Open Data Platform that's going to solve certain challenges in that adoption, we'll have to see.

Yeah, it's very early days, but it's interesting how quickly opinions formed around it. So I've got to ask you about IBM. Do you think they have a System z winner here? When you looked at that announcement, what did you think about the overall mainframe direction?

The overall mainframe direction: I think the mainframe is interesting because it's so powerful for transactional workloads. And mainframe users are open to anything that creates efficiency and reduces the pricing, the charges they are paying by MIPS; they are quite open-minded. So I think the z13 is great for certain types of workloads, with massive transaction volumes being processed per day, and IBM has quite ambitious goals around that. It's great for those workloads. IBM's strategy on Hadoop, having BigInsights as their own version of Hadoop instead of, for example, the partnership route it is now taking with Hortonworks, has probably been a more challenging overall strategy. But I still think, despite the fact that the z13 is very powerful and suitable for certain workloads, there will be quite a lot of offloading happening from the mainframe to more affordable platforms like Hadoop.

Tendu, thanks for coming on theCUBE. Really appreciate it. I want to ask you one final question. For the folks out there, when should they call you guys at Syncsort? Do they just say, hey, I have a mainframe, and call you? Or, I want to sunset certain things and bring in the new? When do you engage with new customers and existing customers?
For both new customers and existing customers: you call us when you have expensive workloads in your data pipeline that you want to offload from the traditional data warehouse or mainframe to Hadoop, and you want that pipeline to be future-proof, so you don't have to worry about whether it's going to run on MapReduce or Spark or Tez, on-premises or in the cloud. If you want the user experience completely decoupled while still taking advantage of running natively on the platform, which is the case for us, then you call us.

Okay, we are here live in Silicon Valley inside theCUBE, at Big Data SV in conjunction with the Strata + Hadoop World conference. This is theCUBE, with all the big data week action happening here at the Fairmont Hotel. I'm John Furrier with Jeff Kelly. We'll be right back after this short break.