 Live from San Jose, California, it's theCUBE. Covering Big Data Silicon Valley 2017. Welcome back everyone. Live in Silicon Valley, it's theCUBE's coverage of Big Data SV, our event in Silicon Valley in conjunction with our Big Data NYC for New York City. Every year, twice a year, we get our event going around Strata Hadoop in conjunction with those guys. I'm John Furrier with SiliconANGLE, with George Gilbert, our Wikibon analyst. Our next guest is Josh Rogers, the CEO of SyncSort. He's been on many times, a CUBE alumni, from the firm that acquired Trillium, which we talked about yesterday. Welcome back to theCUBE. Good to see you. Good to see you. How are you? So SyncSort is just one of those companies that's really interesting. And we were talking about this. I want to get your thoughts on this because I'm not sure if it was in the plan or not, or if it was just really genius moves by you guys on the management side, but you locked down the legacy business, legacy environments like the mainframe, and then transformed into a modern data company. Was that part of the plan or kind of on purpose by accident? Or what's the... Part of the plan. I mean, so you think about what we've been doing for the last 40 years. We had specific capabilities around managing data at scale and around helping customers process that data to get more value out of it through analytics. And we've just continually moved through the various kind of generations of technology to apply that same discipline in new environments. And big data has frankly been a terrific opportunity for us to apply that same technical talent and DNA in that new environment. So it's kind of been running the same game plan actually for the next few years. You guys have good execution, but I think one of the things we were pointing out, and this is one of those things where, certainly, I live in Palo Alto, in Silicon Valley. We love innovation. 
We love all the shiny new toys, but you get tempted to go after something really compelling, cool, and relevant, and then go, whoa, I forgot about locking down some of the legacy data stuff. And then you're kind of working backward. You guys took a different approach. You're going into the trends from a solid foundation. That's a different execution approach. And like you said, by design, so that's working. Yeah, it's definitely working. And I think it's also kind of focused on an element that maybe is under-reported, which is that a lot of these legacy systems aren't going away. And so one of the big challenges... And they're the system of record, by the way. Right. Large enterprises have this: how do I integrate those legacy environments with these next-generation environments? To do that, you have to have expertise on both sides. And so one of the things I think we've done a good job of is developing that big data expertise and then turning around and saying, we can solve that challenge for you. And obviously the big iron to big data solutions we bring to market are a perfect example of that. But there are additional solutions that we can provide customers and we'll talk more about those in the future. Talk about the Trillium acquisition. I want to just take a minute to describe that you guys bought a company called Trillium. What is it? I mean, just take a minute to explain what it is and why is it relevant? Yeah, sure. So Trillium is a really special company. They are the independent leader in data quality and have been for many years. They've been in the top right of the Gartner Magic Quadrant for more than a decade. And really when you look at large, complex, global enterprises, they're kind of the gold standard in data quality. And when I say data quality, what I mean is an ability to take a data set, understand the issues of that data set and then establish business rules to improve the quality of that data so you can actually trust that data. 
So obviously that's relevant as a near adjacency to the data movement and transformation that SyncSort's been known for for so long. What's interesting about it is as you think about the development and the maturity of big data environments, specifically Hadoop, people have a desire to obviously do analytics on that data and implicit in that is an ability to trust that data. And the way you get there is being able to apply profiling and quality rules in that environment. And that's an underserved market today. So when we thought about the Trillium acquisition, it was partly, hey, this is a great firm that has so much respect in the space and so much powerful capability and market-leading data quality talent, but also we have an ability to apply it in this next-generation environment much like we did in the ETL and data movement space. And I think that the industry is at a point where people, enterprises are realizing I'm going to need to apply the same data management disciplines to make use of my data in my next-generation analytics environment that I did in my data warehouse environment. Obviously, there's different technologies involved, there's different types of data involved, but those disciplines don't go away and being able to improve the quality and being able to kind of build integrity in your data sets is critical and Trillium has best-in-market capabilities in that respect. So Josh, you were telling us earlier about sort of the strategy of knocking down the pins one by one as it's become clear that we sort of first took the archive from the data warehouse, then offloaded ETL, and now progressively more of the business intelligence. Besides data quality, what are some of the other functions you opted to do? So there's the whole notion of metadata management and that's incredibly important to support a number of key business initiatives that people are going to leverage. 
There's different styles of movement of data, so a thing that you'll hear a lot about is change data capture, right? So if I'm moving data sets from source systems into my Hadoop environment, I can move the whole set, but how do I move the incremental changes on an ongoing basis at the speed of business? There's notions of master data management, right? So how do I make sure that I understand and have a kind of gold standard of reference data that I can use to drive my analytics capabilities? And then of course there's all the analytics that people want to do both in terms of visualization and predictive analytics. But you can think about all these as various engines that I need to apply to the data to get maximum value. And it's not that these engines aren't important anymore; it's that I can now apply them in a different environment that gives me a lot more flexibility, a lot more scale, better cost structure, and an ability to kind of harness broader data sets. And so that's really our strategy: bring those engines to this new environment. There's two ways to do that. One is build it from scratch, which is kind of a long process to get it right when you think about complex global large enterprise requirements. The other is to take existing, tested, proven, best-in-market engines and integrate them deeply in this environment. And that's the strategy we've taken. We think that offers a much faster time to value for customers to be able to maximize their investments in this next-generation analytics infrastructure. So who shares that vision and sort of where are we in the race? Look, I think we're fairly unique in taking that approach. There are certainly other large platform players. They have a broad set of capability and I think they're working on how do I kind of take that architecture and make it relevant and it ends up kind of creating a kind of code generation approach. I think that approach has limitations. 
And I think if you think about taking the core engine and integrating it deeply within the Hadoop ecosystem and the Hadoop capabilities, you get a faster time to market and a more manageable solution going forward. And also one that kind of future-proofs you from underlying changes that we'll continue to see in the Hadoop components, or the big data components, I guess is a better articulation. Josh, what's the take on the show this year and the trends? Obviously we've been talking about machine learning and AI, you've seen that. Sure. As you guys look at your execution plan, what's the landscape happening out there in the show this year? I mean, we've started to see more business outcome conversations. But with machine learning and AI, it's really putting pressure on the companies, and certainly IoT and the cloud growth as a forcing function. Do you see the same thing? What are your thoughts? So machine learning's a really powerful capability and I think as it relates to the data integration kind of space, there's a lot of benefit to be had. Think about quality. If I have to establish a set of business rules to improve the quality of my data, wouldn't it be great if those little rules could learn as they actually process data sets and see how they change over time? So there's really interesting opportunities there. We're seeing a lot of adoption of cloud. More and more customers are looking at how do I live in a world where I've got a piece of my operations on premise, I've got a piece of my operations in cloud, manage those together and gradually probably shift more into the cloud over time. So we're doing a lot of work in that space. You know, there's some basic fundamental recognitions that have happened, which is, you know, if I stand up a Hadoop cluster, I am going to have to buy a series of tools to, you know, get value out of the data in that cluster. 
That's a good step forward from my perspective because this notion of I'm going to stand up a team offshore and they're just going to build all these things... Cost of ownership goes through the roof. Yeah, so I think, you know, the industry's moved past this concept of, you know, I make an investment in Hadoop but I don't need additional solutions. Well, it highlights something that we were talking about at Google Next last week about Enterprise Ready. And I want to get your thoughts because you guys have a lot of experience, and it's something that's, again, in your wheelhouse. How you guys have attacked the market has been pretty impressive and, you know, not obvious, and on paper it looks pretty boring but you're doing great. I mean, you've done the right strategy. It works. Yeah, that's the... You know, mainframe, locking in the mainframe system of record. We've talked about this on theCUBE. There's a lot of videos going back three years. But Enterprise Ready is a term now that's forcing people, even the best, like Google, to be like looking in the mirror and saying, wait a minute, we have a blind spot. Best tech doesn't always win. You've got table stakes, you've got SLAs, you've got, you mentioned data quality, one piece of bad data that should have been cleaned can really kind of screw up something. So what are your thoughts on Enterprise Ready right now? Yeah, so I think that people are recognizing that, you know, to get a payoff on a lot of these investments in next-generation analytic infrastructure, they're going to need to be able to run mission-critical workloads there and take on mission-critical kind of business initiatives and prove out the value. And to do that, you have to be able to manage the environment, achieve the uptimes, have the reliability and resiliency that, you know, quite frankly, we've been delivering for 40 years. 
And so I think that's another kind of point in our value proposition that frankly seems to be somewhat unique, which is, hey, we've been doing this for thousands of customers, you know, the most sophisticated. What are the ones that are going to be fatal flaws for people that they don't pay attention to? Well, security is huge. I think the, you know, manageability, right? So look, if I have to upgrade 25 components in my Hadoop cluster to get to the next version and I need to upgrade all the tools, like I've got to have a way to do that that allows me to, you know, not only get to the next level of capability that the vendors are providing, but also to do that in a way that doesn't maybe bring down all these mission-critical workloads that have to be 24 by seven. So those pieces are really important, and having, you know, both the experience and understanding of what that means, and also being able to invest the engineering resources to deliver... And don't forget the sales force, you've got to have the DNA and the people on the streets. Absolutely. Josh, thanks for coming on theCUBE. Really appreciate it. Great insight. You guys have a, you know, just to give you a compliment. Great strategy. And again, good execution on your side. And as you guys are in new territory, every time we talk to you, you're interested in something new every time. So great to see you. SyncSort here inside theCUBE, always back and sharing commentary on what's going on in the marketplace. AI and machine learning, with the table stakes and the enterprise security and whatnot, still critical for execution. And again, IoT is really the forcing function for you to get a focus on the data. Thanks so much. I'm John Furrier, with George Gilbert. We'll be back with more live coverage after this break.