 Live from Midtown Manhattan, it's theCUBE. Covering Big Data, New York City, 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. Okay, welcome back everyone. We're here live in New York City for theCUBE's coverage of Big Data NYC. Our event, we've been running for five years, been covering the Big Data space for eight years since 2010 when it was Hadoop World, Strata Conference, Strata Hadoop, Strata Data, soon to be called Strata AI. It's theCUBE, we've been theCUBE all eight years here live in New York. I'm John Furrier. Our next guest is Murthy Mathaprakasam, who's the director of product marketing at Informatica. CUBE alumni, has been on many times. We cover Informatica World every year. Great to see you, thanks for coming by. And coming in. Great to see you. You guys know data, so there's not a lot of recycling of what's going on in the data because we've been talking about it all week. Total transformation. But the undercurrent has been, well, first of all, it's a lot of AI, AI of this. And you guys have the Claire product and got a lot of things there. But outside of the AI, a lot of the undertone is cloud, cloud, cloud. Governance, governance, governance. Those are the two kind of drivers I've been seeing as the forces this week: a lot of people trying to get their act together on those two fronts. And you can kind of see the scabs on the industry. Some people haven't been paying attention and they're weak in the area. Cloud is absolutely going to be driving the big data world because data's horizontal. Cloud's the power source to that. You guys have been on that. What's your thoughts? What are the drivers and currents? First of all, do you agree with what I'm saying and what else did I miss? I mean, security's obviously in there, but. Absolutely. No, so I think you're exactly right on. So obviously, governance, security's a big deal. Largely being driven by the GDPR regulation that's happening in Europe. 
But I mean, every company today is global, so everybody's essentially affected by it. So I think data up till now has always been a kind of opportunistic thing that there's a couple of guys in the organization who are looking at it as, oh, let's do some experimentation. Do something interesting here. Now it's becoming government mandate. And so I think there's a lot of organizations who are, like to your point, getting their act together, and that's driving a lot of demand for data management products. So now people say, well, if I got to get my act together, I don't have to hire armies of people to do it. Let me look for automated machine learning based ways of doing it so that they can actually deliver on the audit reports that they need to deliver on, ensure the compliance that they need to ensure, but do it in a very scalable way. I've been kind of joking all week. I've kind of had this meme in my head, so I've been pounding on it all week, calling it a tool shed problem. The tool shed problem is everyone's got these tools. They throw them into the tool shed. They bought a hammer, and the company that sold them the hammer's trying to turn into a lawnmower. So they can't mow your lawn with a hammer, and it's not going to work. And so these tools are great, but it defines work, what you do. But the platforming issue is a huge one, and you're starting to see people who took that view. You guys were one of them, because in a platform-centric view with tools that are enabled to be highly productive, you don't have to worry about new things like a governance policy, a GDPR that might pop up, or the next Equifax that's around the corner. There's probably two or three of them going on right now. So that's an impact. The data, who uses it, how it's used, and who's at fault or whatever. So how does a company deal with that? And machine learning has proven to be a great horse. There are a lot of people who are riding right now. You guys are doing it. 
How does a customer deal with that tsunami of potential threats, architecture challenges? What's your solution? How do you talk about that? Well, I think machine learning up till now has been seen as a kind of nice to have, and I think very quickly it's going to become a must-have, because, exactly like you're saying, it really is a tsunami. I mean, you can see people who are nervous about the fact that, I mean, there's different estimates. It's like 40% growth in data assets for most organizations every year. So you can try to get around this somehow with one of these kludgy tools or something, but at some point, something's going to break. Either you just run out of manpower, you can't train the manpower, people start leaving, whatever the operational challenges are, it just isn't going to scale. Machine learning is the only approach, it is absolutely the only approach that actually ensures that you can maintain data for these kind of defensive reasons, like you're saying, the security and compliance, but also the kind of offensive opportunistic reasons and do it scalably, because there's just no other way, mathematically speaking. When the data is growing 40% a year, just throwing a bunch of tools at it just doesn't work. Yeah, I would even just amplify and look right in the camera, saying if you're not on machine learning, you're out of business. That's a straight up obvious trend, because that's a precursor to AI, real AI. All right, let's get down to data management. So people are throwing around data management like it's like, oh yeah, we got some data management. There are challenges with that. You guys have been there from day one, but now if you take it out into the future, how do you guys provide data management in a totally cloud world, where now the customer certainly has public and private or on premise, but they might have multi-cloud. So now it becomes a land grab for the data layer. How do you guys play in that? 
Well, I think it's a great opportunity for these kind of middleware platforms that actually do span multiple clouds, that can span the internal environment. So I'll give you an example. Yesterday we actually had a customer speaking at Strata here, and he was talking about how, for him, the cloud is really just a natural extension of what they're already doing, because they already have a sophisticated data practice. This is a large financial services organization, and he's saying, well, now the data isn't all inside. Some of it's outside. We've got partners who've got data outside. How do we get to that data? Clearly the cloud is the path for doing that. So the fact that the cloud is a natural extension of what a lot of organizations were already doing internally means they don't want to have a completely different approach to the data management. They want to have a consistent, simple, systematic, repeatable approach to the data management that spans, as you said, on premise and the cloud. And that's what I think is the opportunity of a very mature and sophisticated platform, because you're not rewriting and replatforming for every new, you know, is it AWS? Is it Azure? Is it something on premise? You just want something that works and shields you from that underlying infrastructure. So I'm going to put my skeptic hat on for a second and challenge you on this because this I think is fundamental. Whether it's real or not, it's perceived, maybe in the mind, back of the mind of the CXO or whatever the CDO, whoever's enabled to make these big calls. If they're the keys of the kingdom, Informatica, I'm going to get locked in. So this is a deep fear. People wake up with nightmares in the enterprise because they've seen lock-in before. How do you explain to a customer that you're going to be an enabling opportunity for them, not a lock-in that forecloses future benefits? Especially when I have an unknown scenario called multi-cloud. 
I mean, no one's really doing multi-cloud. Let's face it. I mean, I have multiple clouds, stuff on it, but no one's- Not intentionally. Sometimes you've got lines of business doing things, but absolutely. No one's really moving workloads dynamically between clouds in real time. Maybe a few people doing some hacks, but for the most part it's not a standard practice. But they want it to be. Absolutely. So that's the future from today to there. How do you preserve that position with the customer where you say, hey, we're going to add value, but we're not going to lock you in? So the whole premise, again, of, I mean, this goes back to classic three-tier models of how you think about technology stacks, right? There's an infrastructure layer, there's a platform layer, there's an analytics layer. And the whole premise of the middleware layer, the platform layer, is that it enables flexibility in the other two layers. It's precisely when you don't have something that's kind of intermediating the data and the use of the data, that's when you run into challenges with flexibility and with data being locked in a particular data store. But you're absolutely right. We had dinner with a bunch of our customers last night. They were talking about how they'd essentially evaluated like every version of sort of a big data platform, a big data infrastructure platform, right? And why? It was because they're a large organization and different teams were starting up stuff and then they had to like, you know, compete them out and stuff. And I was like, well, that must have been pretty hard for you guys. Well, we were using Informatica, so it didn't really matter where the data was. We were still doing everything as far as the data management goes from a consistent layer and you guys, we integrate with all of those different platforms. So you didn't get in the way. We didn't get in the way. 
We're facilitating increased flexibility because without a layer like that, a fabric or whatever you want to call it, a data platform that's facilitating this, the complexity is going to get very, very crazy, very soon if it hasn't already. The number of infrastructure platforms that are available, like you said, on-premise and in the cloud now keeps growing. The number of analytical tools that are available is also growing. And all of this is amazing innovation, by the way. This is all great stuff. But to your point about, if you're the Chief Data Officer of an organization, just going, I got to get this thing figured out somehow. I need some sanity. That's really the purpose of- They just don't want another tool for a tool's sake. They, we need to have it be purposeful. And that's why I think this machine learning aspect is very, very critical because I was thinking kind of about an analogy just like you were. You know, I was thinking like, in a way you can think of data management as sort of like cleaning stuff up and there's people who have like brooms and mops and all these different tools. Well, we're bringing a Roomba to the market, right? So, because you don't want to just create tools that like transfer the labor around, which is a little bit of what's been going on. You want to actually get the labor out of the equation so that the people are focused on business context, business strategy, and the data management is sort of cleaning itself up. You know, it's doing the work for you. That's really what Informatica's vision is. It's about being a kind of enterprise cloud data management vendor that is leveraging AI under the hood so that you can just sort of set it and forget it. A lot of the ingestion, the cleansing, you know, telling analysts what data they should be looking for. All of this stuff is just happening in an automated way and you're not in this like tool chaos. 
Yeah, and that can be, and Wheeler just builds up some tools that sit in the bag for a long time. I mean, my tool shed when I had one back, I think in a property back East. In Palo Alto, no one has tool sheds, by the way. That's right. No one does any gardening. The issue is, at the end of the day, I need to have a reliable partner. So I want you to take a minute and explain to the folks who aren't yet Informatica customers why they should be, and the Informatica customers why they should stay with Informatica. Absolutely, so certainly the ones, we have a very loyal customer base. So in fact, the guy who was presenting with us yesterday, he said he's been with Informatica since 1999, going through various versions of our products and adopting new innovations. So we have a very loyal customer base. And so I think that loyalty speaks for itself as well. As far as net new customers, I think that in a world of this increasing data complexity, it's exactly what you were saying. You need to find an approach that's going to scale. I keep hearing this word from the chief data officers. It's like, I kind of have got something kludgy going on today. I don't know how I scale it. How is this going to work in 2018 and 2019 and 2025? And it's just daunting for some of these guys, especially going back to your point about compliance. It's one thing if you just have data sitting around, the dark data, so to speak, that you're not using. But God forbid now you got legal and kind of regulatory concerns around it as well. So you have to get your arms around the data. And that's precisely where Informatica can help because we've actually thought through these problems. And we've talked about- What's the number one problem you solved? I mean, because at the end of the day, we're talking about problems that have massive importance, big time consequences, and people can actually quantify. That's right. 
So what specific problem, highest level, do you solve that's the most important, that has the most consequences? Everything from ingestion of raw data sets from wherever, like you said, in the cloud, on premise, all the way through all the processes you need to make it fully usable. And we view that as one problem. There's other vendors who think one aspect of that is a problem and then it's worth solving. But we really think, look, at the end of the day, you got raw stuff and you got to turn it into useful stuff. Everything in there has to happen. And so we might as well just give you everything and be very, very good at doing all those things. And so that's what we call Enterprise Cloud Data Management. It's everything from raw material to finished goods of insights. We want to be able to provide that in a consistent, integrated, and machine learning integrated way. Well, you guys have a loyal customer base, but, just to be fair, and you got to kind of acknowledge that there was a point in time, not anyone can throw Informatica away, the big customers, because big engagements. But there was a time in Informatica's history where you went private, there was some new management came in, but there was a moment where the boat was taking on water, right? And you could almost look at it and say, we're in the space, you guys retooled around that, success to the team, took it to another dimension. So that's the key thing is that a lot of the companies become big, it's hard to get rid of. But who's innovating? So the question is, well, just that's the statement, I think you guys have done a great job. Yeah, the boat might have taken on water, it's my opinion, but you can probably debate that. But I think, as you get mature and you're public, you guys went private, but here's the thing. You guys have added good product chops to Informatica. So I got to ask you a question. What cool things are you doing? 
Because remember, cool shiny new toys help put a little flash and glam on the nuts and bolts, data governance stuff that scales. What are you guys doing? I know you guys announced Claire and some AI stuff. What's the hot stuff that you're doing that's adding value? Yeah, absolutely. So first of all, and this kind of addresses your water comment as well. So we're probably one of the few vendors that spends almost about $200 million in R&D, and that hasn't changed through the acquisition. If anything, I think it's actually increased a little bit because now our investors are even more committed to innovation. Well, you're more nimble private. You got a lot more nimble. Absolutely. So there are a lot more ideas that are coming to the forefront. So there's never been any water just to be clear. But to answer your follow on question about some examples of this innovation. So I think yesterday I talked about some of our recent releases as well, but we're really just trying to keep pushing on this idea of, I know I keep saying this, but it's this whole machine learning approach here of how can we learn more about the data? So one of the features I'll give you an example is if we can actually go look at a file, and if we spot like a name and an address and some order information, that probably is a customer, right? And we know that because we've seen past datasets in there. So there's examples of this pattern matching where you don't even have to have data that's filled out. And this is increasingly the way the data looks. We're not dealing with relational tables anymore. It's JSON files, it's web logs, XML files. All that data that you had to have data scientists go through and parse and sift through, we just automatically recognize it now. If we can look for the data and understand it, we can match it. Put that in context in the order of magnitude benefits that from the old way versus the current way. What's the pain levels versus one versus the other? 
I mean, can you put context around that in terms of, I mean, pretty significant? It's huge because again it goes back to this sort of volume and variety of data that people are trying to get into systems and do it very rapidly. So I'll give you another really tangible customer use case. So this is a customer that presented at Informatica World a couple months ago. It's Jewelry TV, I can actually tell you the name. So they're one of these online kind of shopping sites and they've got a TV program that goes with the online site. So what they do is, obviously when you promote something on TV, your orders go up online, right? They wanted to flip it around and they said, look, let's look at the web logs of the traffic that's on the website and then go promote that on the TV program because then you get a closed loop and you start to have like this explosion of sales. So they used Informatica, they didn't have to do any of this hand coding. They just built this very quickly in the graphical user interface that we provide, and it leverages Spark Streaming under the hood. So they're using all these technologies under the hood. They just didn't have to do any of the manual coding. Got this thing out in a couple of days and it works and they've been able to measure it and they're actually driving increased sales by taking the data and just getting it out to the people that need to see the data very, very quickly. So that's an example of a use case where this isn't just, to your point about, is this just like a small incremental type of thing? No, there's a lot of money behind data if you can actually put it to good use. The consequences are grave and I think you see more and more. I mean, the hacks just amplify it over and over again. It's not a cost center when you think about it. It has to be somehow configured differently as a profit center, even though it might not drive top line revenue directly like an app or anything else. It's not a cost center. 
If anything, it's going to be treated as a profit center, because if you get hacked or the data's misused, you can be out of business. There is no profit. Look at the results of these hacks. The defensive argument is going to become very, very strong as these regulations come out, but let's be clear, and we work with a lot of the most advanced customers. There are people making money off of this. It can be a top line driver when you know how to use it. It should be, that's exactly the mindset. So the final question before we run out of time here is that there are some chief data officers that are enabled, some aren't, and that's just my observation. I don't want to pigeonhole anyone, but some are enabled to really drive change. Some are just figureheads that are just managing compliance risk and work for the CFO and say no to everything. So I mean, I'm generalizing, but that's essentially how I see it. What's the problem with that? Because this cost center issue has, we've seen this movie before in the security business. Security should not be part of IT. That should be its own deal. So we're kind of, this is kind of smoke what we're seeing coming out of the jungle here. Your thoughts on that? Yeah, yeah. You're absolutely right. And we see a variety of models and you can see the evolution of those models. And it's also very contextual to different industries. I mean, there are industries that are inherently more regulated. So that's why you're seeing the data people maybe more in those cost center areas that are focused on regulations and things like that. There's other industries that are a lot more consumer oriented. So for them, it makes more sense to have the data people be in a department that seems more revenue facing. So it's not entirely random. There are some reasons. That's not to say that's the right model moving forward, but someday, you never know. I mean, there's a reason why this role became a CXO in the first place. 
Maybe it is somebody who reports to the CEO and they really view the data department as a strategic function. And it might take a while to get there, but I don't think it's going to take a long time. Again, we're talking about 40% growth in the data. And I think these guys are realizing that now. And I think we're going to see very quickly people moving out of the whole tool shed model and moving to very systematic, repeatable practices, you know, sophisticated middleware platforms and- As we say, don't be a tool, be a platform. Murthy, thanks so much for coming on theCUBE. Really appreciate it. What's going on at Informatica real quick. Things good? Things are great. Good, awesome. Live from New York, this is theCUBE here at Big Data NYC, more live coverage continuing day three after the short break.