From Boston, Massachusetts, it's theCUBE, covering Actifio 2019 Data Driven. Brought to you by Actifio.

Welcome back to Boston, everybody. You're watching theCUBE, the leader in on-the-ground tech coverage. My name is Dave Vellante; Stu Miniman is here, and John Furrier is also in the house. We are covering the Actifio Data Driven '19 event, the second year for this conference. It's all about data, it's all about being data driven. Charlie Kwan is here. He's the director of data and AI offering management at IBM. Charlie, thanks for coming on theCUBE.

Happy to be here, thank you.

So Actifio has had a long history with IBM. In fact, when the company got started, it had a time-to-market play. It took a virtualization product, which allowed it to be first, and then got heavily into data virtualization. They've since evolved that, but you guys are doing a lot of partnerships together. We're going to get into that. But talk about your role within IBM, and what is this data and AI offering management thing?

Yeah, absolutely. So Data and AI is our business unit within the overall IBM corporation. Our focus and our mission is really about helping our customers drive better business outcomes through data, leveraging data in the pursuit of analytics and artificial intelligence, or augmented intelligence. The portion of the business that I'm part of is unified governance and integration. And if you think about data and AI as a whole, you can think about it in the context of the ladder to AI. Oftentimes when we talk about data and AI, we talk about the foundational principles and capabilities that are required to help our customers progress on their journey to AI. And it really is about the information architecture that we help them build. That information architecture is essentially a foundational prerequisite on that journey to AI and analytics.
And the layers of that ladder to AI are: collect the data, making sure it's easily accessible to the individuals that need it; organize the data, which is where the unified governance and integration portfolio comes into play, building trusted, business-ready, high-quality data with governance around it, making sure it's available to be used later; the analyze layer, in terms of leveraging the data for analytics and AI; and then infuse, leveraging those AI models across the organization. So within that context of data and AI, we partnered with Actifio at the end of 2018.

So before we get into that, sorry to interrupt you, but I want to double-click on what you just said. Rob Thomas is famous for saying there is no AI without IA, meaning no artificial intelligence without information architecture. So, sounds good. You talked about governance; that's obviously part of it. But what does that mean, no AI without IA?

So it's really about the fundamental prerequisites, having the underlying infrastructure around the data assets that you have. The fundamental tenet is that data is one of the most tremendous assets any enterprise may have. A lot of time, effort, and man-hours have been invested in collecting the data and making sure it's available, but at the same time it hasn't been freed up to be used for downstream purposes, whether operational use cases or analytic use cases. The information architecture is really about how you frame your data strategy so that you have that data available to use and to drive business outcomes later. Those business outcomes may be the result of insights driven out of the data, but they could also be part of the data pipeline that feeds things like application development or test data management. And that's one of the areas we're working with Actifio on.
So the information architecture is a framework that you guys essentially publish and communicate to your clients. It doesn't require that you have IBM products plugged in, but of course you can certainly plug in IBM products. If you're smart enough to develop an information architecture, presumably you show where your products fit and you're going to sell more stuff. But it's not a prerequisite; I can use other tooling if I want to do that.

I think the framework is a good prerequisite. The products themselves, of course, are not, right? But the framework is a good foundational construct around how you can think about it so that you can progress along that journey.

All right, so you started talking about Actifio and your relationship there. You've created the InfoSphere Virtual Data Pipeline.

Correct.

Why did you develop that product? Then we'll get into it.

Sure. It's all part of our overall unified governance and integration portfolio. Like I said, that's the organize layer of the ladder to AI that I was referring to, and it's all about making sure you have clear visibility and know what data assets you have. We always talk about it in terms of know, trust, and use. Know the data assets you have; make sure you understand the data quality and the classification around that data. Trust the data: understand the lineage, understand how it's been touched, how it's been transformed, and build a catalog around that data. And then use: make sure it's usable to downstream applications or downstream individuals. The Virtual Data Pipeline offering really helps us in that last category, around using and making use of the data assets that you have, putting them directly into the hands of the users of that data, whether they be data scientists and data engineers or application developers and testers.
So the Virtual Data Pipeline, with capabilities based on the Actifio Sky virtual appliance, really helps build that snapshot data and provides the self-service user interface to get it into the hands of application developers and testers, or data engineers and data scientists.

And why is that important? Is it because they're actually using the same, or substantially similar, data sets across their workstreams? Maybe you could explain that.

Yeah, it's important because the speed at which applications are being built and insights are being driven requires a lot more agility and the ability to self-serve the data that you need. The traditional challenge we see, if you think about preparing to build an application or an AI model, building it, deploying it, and managing it, is that the majority of the time, 80% of it, is spent upfront: preparing the data, talking to IT, trying to figure out what data you need, asking for it, waiting two weeks to two months to get access to that data. Then getting it and realizing, oh, I got the wrong data, I need to supplement it, or I need to do another iteration of the model and go back to get more data. That's the area that application developers and data scientists don't necessarily want to be spending their time on. So we're trying to shrink that timeframe. And how do we shrink it? By providing the individuals that are actually going to be using the data, whether line-of-business users, data scientists, or application developers, with their own self-service access to it, right? To be able to get that snapshot, that point-in-time access to production data, to then infuse it into their development process, their testing process, or their analytic development process as well.

Where does traditional tooling fit in this sort of new world?
Because, yeah, I remember when Hadoop came out, it was like, oh, the enterprise data warehouse is dead. And then you'd ask customers, what's one of the most important things in your big data pipeline? And they'd say, oh yeah, we need our EDW. So I could now collect more data at lower cost, keep it longer, and all that stuff, but the traditional EDW was still critical. And what you just described, building a cube, you guys own Cognos, obviously, one of the biggest acquisitions that IBM ever made, but it's a critical component. You talk about data quality, integration, those things. It's all a puzzle that fits together into this larger mosaic. Can you help us understand that a little bit?

Sure. Well, one of the fundamental things to understand is that you have to know what you have, right? And the data catalog is a critical component of that data strategy: understanding where your enterprise assets sit. They could be structured information, or they may be unstructured information sitting in file repositories or emails, for example. But understanding what you have, understanding how it's been touched and how it's been used, understanding the requirements and limitations around that data, understanding who the owners of that data are, building that catalog view of your overall enterprise assets is the fundamental starting point from a governance standpoint. And then from there, you can allow access to individuals who are interested in understanding and leveraging the data assets that you may have in one pool. The challenge is that data exists everywhere across the enterprise, right? Silos may have arisen in one particular department that then gets merged with another department, and then you have two organizations that may not even know what the other has.
So the challenge is to break down those silos and get clear visibility into what assets you have, so that individuals can then leverage that data for whatever uses they may have, whether it be development or testing or analytics.

So if I could generalize the problem: too much data, not enough value. And I'll talk about value in terms of things that you guys do that I'm inferring: risk reduction, speed to insights, and then ultimately lowering cost or increasing revenue. That's kind of what it's all about.

Yeah, we talk about business outcomes in terms of increased revenue, decreased costs, or reduced risk, right? In terms of governance, those are the three things that you want to unlock for your customers. You don't usually think about governance as creating new revenue streams, and you generally don't think about it in terms of reducing costs either, but you do often think about it in terms of reducing your risk profile and compliance. But the ability to actually know your data, build that trust, and then use that data really does open up opportunities to build new applications, new systems of engagement, new systems of record, new applications around analytics and AI, that unlock different ways to market to customers, sell to customers, and engage our own employees as well.

Yeah, so the initial entry into the budget, if you will, is around that risk reduction, right? People understand: I've got all this data and I need to make sure I'm managing it according to the edicts of my organization. But, and I'm playing skeptic here, are you really seeing value beyond that risk reduction? I mean, the nirvana in the compliance and governance world is not just compliance and governance, avoiding fines and getting slapped on the wrist, or even something worse, but actually driving other value through data quality initiatives, integration, et cetera. Are you actually seeing that?
Yes, we are, actually. Particularly last year, with the whole onslaught of GDPR in the European Union and the implications of GDPR here in the US and other parts of the world, it really was a pervasive topic. And a lot of what we were talking about was specifically that: compliance, making sure you stay on the right side of the regulation. But at the same time, investing in that data or information architecture, investing in the governance program, actually allowed our customers to understand the different components that touch the individual. Right, because it's all about individual rights and individual privacy. So understanding what they're buying, understanding what information we're collecting on them, understanding what permissions and consent we have to leverage their information really allowed our customers to leverage that information for a different purpose, outside of the whole compliance mindset. Because compliance is a difficult nut to crack: there are hard requirements around it, but at the same time there are best-effort requirements around it as well. So the driver for us is not necessarily just compliance, but what more you can do with that governed data you already have, because you have to meet those compliance requirements anyway, to be able to flip the script and talk about business value, business impact, revenue, and that type of thing.

So you're only about, what, six months in?

Correct, into the partnership, yes.

All right, so it's early days, but how's it going, and what can we expect going forward?

It's going great. We have a terrific partnership with Actifio. The IBM Virtual Data Pipeline offering, built on Actifio, is part of our broader portfolio within unified governance, and it fits nicely to build out some of the test data management capabilities that we've already had.
The Optim portfolio is part of our capability set, and it's really been focused around test data management, building synthetic data, and orchestrating test data management as well. The Virtual Data Pipeline offering is a nice complement to that, building out a pretty robust portfolio now.

All right, Charlie, well, hey, thanks very much for coming on theCUBE. How's the event?

It's been terrific. It's amazing to be surrounded by so many people that are excited about data. You don't get that everywhere.

Hey, we're always excited about data in theCUBE. Charlie, thanks so much for coming on theCUBE.

All right, keep it right there. We'll be back with our next guest. Dave Vellante, John Furrier, Stu Miniman in the house. You're watching theCUBE at Actifio Data Driven 2019. We'll be right back.