from Cambridge, Massachusetts, extracting the signal from the noise. It's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now your hosts, Dave Vellante and Paul Gillin.

Welcome back to MIT in Cambridge, Massachusetts, everybody. I'm Dave Vellante, and this is theCUBE. theCUBE goes out to the events. We extract the signal from the noise. We've been in partnership with MIT. Now this is our third year at the MIT IQ Chief Data Officer Conference. It's our pleasure now to have Murthy Mathiprakasam here. He's with Informatica, an expert in what's going on in that world of ETL, governance, data transformation. Murthy, welcome to theCUBE, it's good to see you.

Thank you for having me.

So we just had Michael Stonebraker on, who's basically throwing cold water on virtually everything that your industry has done. Multi-billion dollar industry, governance, ETL, it's all BS. How do you respond to that?

Well, actually I think a lot of what he said makes a lot of sense. I mean, the reality is there are growing data sets, and the data sets are more distributed throughout the enterprise. And so this is the reality of where data is going. It's not all about having one single place with super duper amounts of control. What I would say is that there's balance, and that's what we're seeing the most successful enterprises do. On one hand, you want to acknowledge that you want to have autonomy and freedom, self-service. These are all movements that I think are inevitable and that all of our data platforms are going to eventually evolve to fulfill. But the reality is you also don't want data chaos. You don't just want data all over the place. I mean, one of the key themes at this event here is security and governance. So if you're a financial services company or a retail company or any company that's got sensitive data, you don't just want data floating all over the place, completely ungoverned and insecure.
And so the key is balancing the agility of IT and the autonomy of business analysts with the fundamental enterprise needs of security and governance. And that's what modern data management platforms are starting to do.

I think it's right on. I mean, it's always been a balancing act. It's fun to, you know, just take one side, Republicans and Democrats, and just make that argument. But the reality is there are a lot of new projects being spun up.

That's right.

So-called big data space. And many of them, if not most of them, are being funded by lines of business who could care less about governance and security and compliance because it just gets in the way of selling. But then it exposes you down the road. So somebody's got to be responsible for the information asset and liability management. And that's kind of where you guys come in. So how are you helping customers manage that balance?

Absolutely. Well, so a very simple way to do this is things like data masking. So we offer data masking natively in Hadoop. So you can actually, before you hand the data off to those business analysts to go off and do whatever they need to go do, just mask the data so that they don't have access to the raw sensitive information. Simple things like that. And these are very, very agile products. So it's very simple to do. A central IT team can do that. They're not doing this long and complicated process like I think was portrayed before. You know, there are these simple processes that can just help ensure some protection. Basic, simple security governance principles. That actually enables greater autonomy because now IT can trust the business to have the right data without any of these, you know, risks of a lawsuit or any other kind of compliance failure.

So from a product standpoint, you've got the platform, it's very successful. It's mature. And then the market, all of a sudden Hadoop comes along and everything gets distributed.
You're shipping, you know, code, five megabytes of code to petabytes of data.

That's right.

How do you architecturally evolve that product so that it's not a pure bolt-on, so that its performance is maintained, so that its, you know, resiliency and recoverability are still there?

Absolutely. Well, so Informatica's platform already runs on Hadoop. So for over three years, we've actually had our entire data integration, data quality, data governance portfolio, and a version of that now runs directly in Hadoop. So we can now take advantage of all of those features that you're talking about as far as the scalability and performance that Hadoop offers. But the enterprises can still get all the best-of-breed functionality from Informatica at the same time. So I think this is also one of the trends we're going to see as new platforms come on board. Companies like Informatica can easily evolve their platforms so that we can help our enterprises evolve theirs as well.

What do you see customers doing? I mean, you know, when Hadoop sort of first came onto the scene, people sort of predicted this big sucking sound, in particular for your business, which is interesting, and for the enterprise data warehouse business. Data integration and EDW were like, oh, dead businesses now. What's happened is, when you talk to practitioners, those are the two most important aspects of their big data initiatives.

That's right.

So what do you see happening? It's not replacing one with the other. It's sort of a coexistence, which many have talked about. Mike Olson has said this all the time. It's not cannibalizing. It's sort of incremental. Maybe there's some price pressure, but what are you seeing in the customer base?

No, I think you're exactly right. It's about an evolution of the portfolio, and the data integration evolves along with it as well.
So because it is inevitable, there are going to be growing data sources and there's a growing need for data by different consumers like the data scientists and new types of automation systems. So it's inevitable that the one-size-fits-all approach doesn't make sense, but that just means that the CDOs out there are going to have a diverse portfolio of different platforms, and now they just need a different kind of fabric that ties all of this together, and that fabric is way more agile and enables more agility for the enterprise itself. So it's just a matter of evolution. Evolution of the platform layer, evolution of the fabric that kind of ties all of this together and ensures all of that data curation that we've been talking about here as well happens in a way that is very agile, very, very nimble and very quick.

How do you see initiatives like Spark changing what's happening with Hadoop, with the data sources, with the pace of data ingestion, and how does that affect what you guys are doing?

Well, Spark is just another data platform technology, and Informatica always, whenever we see innovations at the data layer, we just try to take advantage of them, and so Spark will be another technology that Informatica can leverage to perform all of the operations that customers look to us to perform. But as far as the enterprise is concerned, they won't have to do any Spark coding. We want to shield them from all of that, because this is ultimately what happens. You have all these new technologies that come out, and the average enterprise can't go out and retrain their entire staff to leverage that one technology, and who's to say that technology will last for a long time? Maybe it'll change in six months, a year or two years. So what Informatica tries to do is our development team learns the latest technologies and we embed that within our fabric.
So if you're an enterprise that's been using Informatica since 10 years ago, the same mappings and same processes you built simply port over, and under the hood we've done all the integration for you as far as these new technologies. So whether that's MapReduce, whether it's YARN, whether it's Spark, and I'm sure six months from now or a year from now it'll be something else.

That's a really good point. You talk to the people coming out of Google now, they say, oh yeah, MapReduce, that's ancient. Spark, been there, done that. Oh, you should see what's happening now. And now Spark is just hitting the market. But nonetheless, it does change the way in which data initiatives are going to be utilized. It expands the scope of those data initiatives. So I mean, from your standpoint that's good news, right? Expands your TAM. As I said, the pricing dynamics are interesting, right? A lot of people say that ROI in big data has been a reduction on investment. Some people have this idea of okay, it's cheaper, it's open source, I'm using white box technologies, but it seems like in this technology industry all that means is people end up buying more.

Exactly.

They actually do make it up in volume. So what are you seeing just in terms of the business dynamics?

Well, I think that it'll be interesting to see. I mean, it's hard to tell. Pricing and business, it's very hard to predict trends in this space, but there's definitely a move toward more subscription pricing. Move away from big lump sum kind of deals, move to smaller deals, try a project out maybe on a small scale and then expand. So this is definitely one of the patterns that we're seeing, where people will start with maybe a smaller deployment of a product. Once they get one project going, they can see the value of it, then they can expand that project and then keep going. And I think the nature of Hadoop and some of these new platforms actually sort of enables that as well.
So I think the business impact might be that instead of having these big lump sum deals, you'll just see things being a little more smoothed out, which frankly is a win for the vendor as well as for the customer.

Now what about this event? What are you guys doing here? What's it been like for you? What's the discussion going on?

I think it's a great event for us because a lot of the things that we do are focused on data management issues. It's about data integration, it's about data quality, it's about data governance, it's about security. And these are the issues that I think keep CDOs up at night. And so a CDO is less interested in the technology of how it's done and more in the business requirements of how do I actually deliver great data to all of the consumers inside of my organization? And that's what Informatica has always been about. We just never had a CDO as a target persona to go actually have that conversation with. So now I think with the emerging role of data in the enterprise and the CDO as the steward of that function, Informatica just has a new person for us to go talk to.

What are you seeing in terms of that role and its evolution? I mean, two years ago the number was I think single digits, maybe seven, eight, nine percent of organizations had a CDO. Obviously the concentration in financial services, healthcare and government was higher. Are you seeing that take hold now in the mainstream as a persona you're going after? Maybe talk about that a little bit.

Absolutely. I think it's definitely still early days. It's taking off, but I find that it's not consistent across organizations. So in some places it's still an extension of IT. In some areas it's more of an extension of the business. The exact intersection of where the business side of data management and the technical side of data management come together, I think it's something that every enterprise is still trying to figure out. And maybe there never will be one single model.
I think every industry looks at data slightly differently. The heavily governed industries have to put a lot more focus on security, governance and things like that. There are other organizations where agility plays a greater part, and so depending on the industry maybe the role will also be slightly different. But it's exciting times because I think ultimately it talks about the value of data, and that's what Informatica's always been about, and I think there's just more exciting things that can happen here.

So what should we be looking for? Let's see, Informatica World was in the spring. You guys obviously have a cadence. Let's talk roadmap a little bit. What should we be looking for from Informatica over the next, say, 12 to 18 months? What are you going to be working on?

There's a lot of innovation that kind of goes along this theme of what we call data intelligence. So one of the unique things that we do, because we can see all the data that people are processing through our fabric, is we can actually look at the data and make inferences about what kind of data it is. And then we can use that to build specific functionality on top of our products. So we have this product called Secure@Source, which is a fantastic example of this. Secure@Source looks at data streams and it identifies where there are potentially sensitive data sets. So if you've got credit card numbers somewhere or other kinds of personally identifiable information, especially in the world of big data where you're just ingesting so much data, you can't have armies of people just sitting there looking at it to figure out, oh, is this bad or good? So we have technology that looks at the data and will surface it in just a nice graphical board that says, hey, you've got some data sitting in this area of your environment. It looks kind of sensitive. You might want to go check it out.
So it's just like a heat map, and it's just a very, very simple way that we can help enterprises actually get a better understanding of their data. And I think this theme of leveraging greater understanding in intelligent ways is something you'll continue to see in future products as well.

Excellent. Well, thanks very much for coming to theCUBE. It's great that you could come on in and give us your perspectives on the business and Informatica, and we really appreciate your time.

Absolutely, thank you.

All right, keep right there. We'll be back with our next guest right after this. This is theCUBE. We're live from MIT IQ in Cambridge, Massachusetts. We'll be right back.