Live from San Francisco, it's theCUBE, covering Informatica World 2017, brought to you by Informatica. Okay, welcome back everyone. We are here live in San Francisco for theCUBE's exclusive coverage of Informatica World 2017. I'm John Furrier. This is SiliconANGLE's flagship program. We go out to the events, extract the signal from the noise. My next guest is Jitesh Ghai, who's the Vice President and General Manager of Data Quality and Governance for Informatica. Welcome to theCUBE. Thanks for joining us today. Happy to be here, John, pleasure. So two things right out of the gate. One, data quality and governance, two of the hottest topics in the industry, never mind within Informatica. You guys are announcing a lot of stuff. Customers here are pretty happy. You've got a solid customer base. That's right. Product's been blooming. You've got a big brand behind you now. This is important. There are laws now in place, coming online in 2018. I think it's the GDPR. That's right. And there's a variety of other things. But more importantly, we've got all of their data. That's right. What's your take, and what are you announcing here at the show? Well, so, you know, from a data governance and compliance and overall quality standpoint, data governance started off as a stick, a threat of regulatory pressure. But really the heart of what it is is effective access to and consumption of data, trusted data. And through that exercise of the threat of a stick, healthy practices have been implemented, and that's resulted in an appreciation for data governance as a carrot, as an opportunity to innovate. Innovate with your data to develop new business models. The challenge is, as this maturation in the practice of data governance has happened, there's been a realization that there's a lot of manual work. There's a lot of collaboration that's required across a cross-functional, matrixed organization of stakeholders.
And there's the concept of- There's some dogma too, let's just face it, within organizations. I got all this data, I did it this way before. Right. And now, whoa, the pressure's on to make data work, right? I mean, that's the big thing. That's exactly right. So you collaborate, you align, and you agree on what data matters and how you govern it. But then you ultimately have to stop just documenting your policies and actually make it real, implement it. And that's where the underlying data management stack comes into place. Now that could be making it real for financial regulations like BCBS 239 and CCAR, where data quality is essential. It could be making it real for security-related regulations where protection is essential, like GDPR, the data protection regulation in the EU. And that's where Informatica is launching a holistic enterprise data governance offering that enables you to not just document it. Or as one CDO said to me, at some point you've got to stop talking about it. You actually have to do it. It's about connecting the conceptual, the policies, with the underlying physical systems, which is where intelligent automation with the underlying data management portfolio, the industry-leading data management portfolio that we have, really delivers significant productivity benefits. It's really redefining the practice of data governance. Yeah, I mean, most people think of data as being one of those things that's been kind of like, whether it's healthcare, HIPAA, old models. It's always been an excuse to say, well, we don't do it that way, or hey, it's kind of a no-op kind of thing where no, we don't want to do any more than data. But you guys introduced CLAIRE, which is a play on clairvoyant, with AI in it. It's kind of a clever way to brand it. That's going to bring in machine learning, augmented intelligence and cool things. That, to me, only feels like you're speeding things up. That's exactly right. When in reality, governance is more of a slowdown.
So how do you blend the innovation strategy of making data freely available? Right. And yet managing the control layer of governance, because governance wants to go slow. Right. CLAIRE wants to go fast. You know. And, you know. Help me explain that. Well, you know, in short, sometimes you have to go slow to go fast, right? And that's the heart of what the automated intelligence that CLAIRE provides in the practice of data governance is for: to ensure that people are getting efficient access to trusted data and consuming it in the right context. And that's where you can define a set of policies, but ultimately you need those policies to connect to the right data assets within the enterprise. And to do that, you need to be able to scan an entire enterprise's data sets to understand where all the data is and understand what that data is. So talk about the silver bullet that everyone just wants to buy, the answer to the test, which is un-gettable by the way, I believe. We just had one of your customers on. And their differentiation from their competition is that they're using data as an asset, but they're not going all algorithmic. There's the human data relationship. So there's really no silver bullet in data. You can use algorithms like machine learning to speed things up and work on things that are repeatable tasks. Talk about that dynamic, because governance can be accelerated with machine learning, I would imagine, right? Absolutely, absolutely. Governance is a practice of ensuring an understanding across people, processes, and systems. And to do that, you need to collaborate and define what are the people, what are your processes, and what are the systems that are most critical to you? Once you've defined that, it's, well, how do we connect that to the underlying data assets that matter? And that's where machine learning really helps.
If you define customer ID as a critical data element, then through machine learning, through CLAIRE, we are able to surface everywhere in your organization where customer ID resides. It could be CMD ID, it could be customer underscore ID, it could be customer space ID, cust ID. Those are all the inferences we can make, the relationships we can make, and we surface all of that up so that people have a clear understanding of where all these data assets reside. Jitesh, let's take a step back. I want to get your thoughts on this because I really want you to take a minute to explain something to the folks watching. So there's a couple different use cases that at least I've observed and the Wikibon team has certainly observed. Some people have an older definition of governance. What's the current definition from your standpoint? What should people know about governance today that's different than just last year or even a few years ago? What's the new picture? What's the new narrative for governance and the impact on business? You know, it's a great question. I held a CDO summit in February. We had about 20 chief data officers in New York and I just held an informal survey. Who implements data governance programs for regulatory reasons? Everybody put their hand up. And then I followed that up with, who implements data governance programs to positively affect the top line? And everybody put their hand up. That's the big transition that's happened in the industry: a realization that data governance is not just about compliance. It's also about effective policies to better understand your data, work with your data and innovate with your data, develop new business models, support your business in developing those new business models so that you can positively affect the top line.
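The customer ID example earlier in this answer (finding every column that holds a critical data element under different spellings) can be sketched in a few lines. This is only an illustration: it swaps in plain string similarity from Python's standard library for the machine learning the speaker describes, it is not how CLAIRE works, and the function names, the sample column names, and the 0.6 threshold are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Split camelCase, unify underscores/hyphens/spaces, and lowercase."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    return re.sub(r"[\s_\-]+", " ", spaced).strip().lower()


def find_candidates(target: str, columns: list, threshold: float = 0.6) -> list:
    """Return (column, score) pairs whose normalized name resembles `target`.

    Hypothetical helper: similarity is a character-level ratio, a crude
    stand-in for a learned model. The threshold is an arbitrary assumption.
    """
    want = normalize(target)
    hits = []
    for col in columns:
        score = SequenceMatcher(None, want, normalize(col)).ratio()
        if score >= threshold:
            hits.append((col, round(score, 2)))
    # Best matches first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Made-up column names standing in for a scanned enterprise catalog.
    scanned = ["customer_ID", "customer ID", "cust ID", "CustomerId", "order_total"]
    for column, score in find_candidates("customer ID", scanned):
        print(column, score)
```

A real catalog would also look at the data values themselves, not just the names, which is the kind of inference the speaker attributes to machine learning.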
Another question we get up on theCUBE all the time, that we also observe, and we heard this here from other folks at Informatica and your customers, is that getting to know what you actually have is the first step. Right. Which sounds counterintuitive, but the reality is that a lot of folks realize it's an asset opportunity, so they raise their hand. Top line revenue, I mean, who's not going to raise their hand on that one, right? You'd get fired. But the reality is this train's coming down the tracks pretty fast. Data as an input into value creation. That's exactly right. So now the first step is, oh boy, I just signed up for that, raised my hand, now what the hell do I have? Right. How do you react to that? What's your perspective on that? And that's where you need to be able to do what Google did, index the internet to make it more consumable. Actually, a few search engines indexed the internet. Google came up with sophistication through its page ranking algorithm. Similarly, we are cataloging the enterprise, and through CLAIRE we're making it so that the right, relevant information is surfaced to the right practitioner. And that's the key. Accelerating the access method so you increase the surface area of data, and having the control catalog for the enterprise, which is like your Google search analogy. Harder than searching the internet. But even Google's not even doing a great job these days. My opinion, I should say that. But there's so many new data points coming in. So I get that. So now the follow-up question is, okay, it's really hard when you start having IoT come in, or gesture data, or any kind of data coming in. How do you guys deal with that? How does that rock your world? So as they say. And that's where effective consumption of data permeates across big data, cloud, as well as streaming data. In service to governance, we've implemented in-stream data quality rules to filter out the noise from the signal in sensor data coming in from aircraft subsystems, as an example.
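The in-stream filtering idea described here can be sketched as a small generator: rules say which events matter, and only readings that match a rule pass through. Everything in this sketch is an assumption made for illustration, including the `Reading` fields, the `out_of_band` rule, the metric names, and the thresholds; it does not represent any Informatica product API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Sequence


@dataclass(frozen=True)
class Reading:
    """One sensor event (hypothetical shape)."""
    subsystem: str
    metric: str
    value: float


# A rule decides whether a reading is signal (True) or noise (False).
Rule = Callable[[Reading], bool]


def out_of_band(metric: str, low: float, high: float) -> Rule:
    """Build a rule that flags `metric` readings outside the band [low, high]."""
    def rule(reading: Reading) -> bool:
        return reading.metric == metric and not (low <= reading.value <= high)
    return rule


def filter_stream(readings: Iterable[Reading], rules: Sequence[Rule]) -> Iterator[Reading]:
    """Yield only the readings that at least one rule marks as signal."""
    for reading in readings:
        if any(rule(reading) for rule in rules):
            yield reading


if __name__ == "__main__":
    # Policy definition (the governance step): which events matter.
    rules = [
        out_of_band("oil_temp_c", 40.0, 120.0),
        out_of_band("hydraulic_psi", 2800.0, 3200.0),
    ]
    # Made-up sample stream.
    stream = [
        Reading("engine_1", "oil_temp_c", 95.0),
        Reading("engine_1", "oil_temp_c", 131.5),
        Reading("gear", "hydraulic_psi", 3000.0),
        Reading("gear", "hydraulic_psi", 2650.0),
    ]
    for event in filter_stream(stream, rules):
        print(event)
```

Because `filter_stream` is a generator, it handles one event at a time and never needs the whole stream in memory, which is what makes it suitable for data arriving continuously.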
That's a means of, well first you need to understand what are the events that matter? And that's a policy definition exercise which is a governance exercise. And then there's the implementation of filtering events in real time so that you're only getting the signal and avoiding the noise. That's another IoT example. What's your big, take your Informatica hat off, put your industry citizen hat on and what's your view of the marketplace right now? What's the big wave that people are riding? Obviously data, you can say data, don't say data because we know that already. What's your people, what do you observe out there that the marketplace is different? That's changing very rapidly. Obviously we see Amazon stock going up, like a hockey stick, obviously cloud is there. What are you getting excited about these days? You know, what I'm excited about is bringing broad based access of data to the right users in the right context. And why that's exciting is because there's an appreciation that it's not the analytics that are important, it's the data that fuels those analytics that's important. Because if you're not delivering trusted accurate data, it's effectively a garbage in, garbage out analytics problem. Hence the argument, data or algorithms, which one's more important? Right, well. I mean data is more important than algorithms because algorithms need data. That's exactly right, and that's even more true when you get into non-deterministic algorithms, when you get into machine learning. Your machine learning algorithm is only as good as the data you train it with. I mean, look, I mean, machine learning is not a new thing. Unsupervised machine learning is getting better. Right. But that's really where the compute comes in. Now you can, and the more data you have, the more modeling you can do, I mean, these are new areas that are kind of coming online. So the question is to you is, what new exciting areas are energizing some of these old paradigms? We hear neural nets. 
I mean, Google just announced neural nets that teach neural nets to make machine learning easier for humans. Right. Okay, I mean, that's a little bit inside computer science baseball, but you're seeing machine learning now hitting mainstream. What's the driver for all this? The driver for all of this comes down to productivity and automation. It's productivity and automation with autonomous vehicles. It's productivity and automation that's now coming into smart homes. It's productivity and automation that is being introduced through data-driven transformation in the enterprise as well. Right, that's the driver. It's so funny. I got one of my undergraduate computer science degrees with databases and back in the 80s, I mean, it wasn't like you went out and said, hey, I have a database. Wow, hot that. And now it's like the hottest thing, being a data guy. Right. And what's also interesting is a lot of the computer science programs have been energized by, you know, this whole software defined with cloud native because now they have unlimited potentially compute power. Right. What's your view on the young generation coming in? As you look to hire and you look to interview people, what are some of the disciplines that are coming out of the universities and the master's programs that are different than it was even five years ago? What are the trends you're seeing in the young kids coming in? What are they gravitating towards? Well, you know, there's always an appreciation of, you know, a greater appreciation for, you know, the phrase I love is in God we trust all others must have data. There is an increasing growing culture around being data-driven. But from a background of young people, it's from a variety of backgrounds. Of course, computer science, but philosophy majors, arts majors in general, all in service to the larger cause of making information more accessible, democratizing data, making it more consumable. 
I think AI, in my, I agree by the way, I would just add, I think AI, although it's hyped, and I kind of don't really want to burst that bubble because it's really promoting software. I mean, AI is giving people a mental model of, oh my God, this is pretty amazing, things are happening. I mean, autonomous vehicles is what most people point to and say, hey, wow, that's pretty cool. A Tesla's much different than a classic car. I mean, you test drive a Tesla, you go, why am I buying a BMW, Audi, Mercedes? Right, exactly. It's a no-brainer. Right. Except for the, like, charging part you've got to get installed. But again, that's going to change pretty quickly. At this point it's becoming a table stakes exercise. If you're not innovating, if you're not applying intelligence and AI, you're not doing it right. All right, final question. What's your advice to your customers who are in the trenches? They raise their hand, they're committed to the mandate, they're going down the digital business transformation route. They recognize that data is at the center of the value proposition and have to rethink and reimagine their businesses. What advice do you give them with respect to how to think architecturally about data? Well, you know, it all starts with this: your data-driven transformations are only as good as the data that you're driving your transformations with. So ensure that that's trusted data. Ensure that that's data you agree upon as an organization, not as a functional group, right? The definition of a customer in support is different from the definition of a customer in sales versus marketing. It's incredibly important to have a shared understanding and alignment on what you are defining and what you're reporting against, because that's how you're running your business. So the old schema concept, the old database world, know your types. Right. But then you've got the unstructured data coming in as well. That's a tsunami, IoT coming in. Sure, sure. That's going to be undefined, right?
And the goal and the power of AI is to infer and extract metadata and meaning from this whole landscape of semi-structured and unstructured data. So you have the opinion, I'm sure you're biased, being at Informatica. Let's say it. I'm sure you're in favor of collect everything and then connect the dots as you see fit. Well, or is that? It's a nuance. You can't collect everything, but you can collect the metadata of everything. Metadata is important. Data that describes the data is what makes this achievable and doable, practically implementable. Jitesh Ghai here, sharing the metadata. We're getting all the metadata from the industry, sharing it with you here on theCUBE. I'm John Furrier here, live at Informatica World 2017, exclusive CUBE coverage. It's our third year. Go to SiliconANGLE.com, check us out there, and also wikibon.com for our great research. YouTube.com slash SiliconANGLE for all the videos, more live coverage here at Informatica World in San Francisco after this short break. Stay with us.