From around the globe, it's theCUBE, with digital coverage of Enterprise Data Automation, an event series brought to you by Io Tahoe. Okay, we're back. Focusing on enterprise data automation, we're going to talk about the journey to the cloud. Remember, the hashtag is data automated. And we're here with Lester Waters, who's the CTO of Io Tahoe. Lester, good to see you from across the pond on video. Wish we were face to face, but it's great to have you on theCUBE. Likewise, Dave. Thank you for having me. Hey, you're very welcome. Hey, give us a little background. As CTO, you've got deep, deep expertise in a lot of different areas, but what do we need to know? Well, Dave, I started my career basically at Microsoft, where I started the information security cryptography group there, the very first one that the company had. And that led to a career in information security. And of course, as you go along with information security, data is the key element to be protected. So I always had my hands in data, and that naturally progressed into a role with Io Tahoe as their CTO. I'll have to invite you back. We'll talk crypto all day. We'd love to do that. Yeah, awesome, right? But we're here talking about the cloud. And we talk about the journey to the cloud and accelerating it. Everybody's obviously really interested in cloud, and even more interested now with the pandemic. But what's that all about? Well, moving to the cloud is quite an undertaking for most organizations. First of all, if you're a large enterprise, you probably have thousands of applications. You have hundreds and hundreds of database instances, and trying to shed some light on that just to plan your move to the cloud is a real challenge. And some organizations try to tackle that manually. Really, what Io Tahoe is bringing is a way to tackle that in an automated fashion to help you with your journey to the cloud.
Well, I mean, look, migration is sometimes just an evil word to a lot of organizations, but at the same time, building up technical debt, veneer after veneer, year after year, is something that many companies are saying has got to stop. So what's the prescription for that automation journey and simplifying that migration to the cloud? Well, I think the very first thing that it's all about is data hygiene. You don't want to pick up your bad habits and take them to the cloud. You've got an opportunity here. So I see the journey to the cloud as an opportunity to really sort of clean house and reorganize things, sort of like moving out. You might move all your boxes, but you're probably going to cherry-pick what you're going to take with you. And then you're going to organize it as you end up at your destination. So from that, there are seven key principles that I like to operate by when I advise on a cloud migration. Okay, so where do you start? Well, I think the first thing is understanding what you've got. So, discovering and cataloging your data and your applications. If I don't know what I have, I can't move it. I can't improve it. I can't build upon it. And I have to understand those dependencies. So building that data catalog is the very first step. What have I got? Now, is that a metadata exercise? Sometimes there's more metadata than there is data. Is metadata part of that first step? Indeed, metadata is the first step. The metadata really describes the data you have. So the metadata is going to tell me I have 2,000 tables, and maybe those tables have an average of 25 columns each. And so that kind of gives me a sketch, if you will, of what I need to move: how big the boxes are that I need to pack for my move to the cloud. Okay.
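To make the metadata sketch Lester describes concrete, here is a minimal illustration that counts tables and columns from a database's own metadata. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module and a toy in-memory database, and the helper names sketch_catalog and summarize are hypothetical. It is not Io Tahoe's discovery process.

```python
import sqlite3


def sketch_catalog(conn):
    """Return {table_name: [column names]} read from SQLite's own metadata."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    catalog = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [row[1] for row in rows]
    return catalog


def summarize(catalog):
    """Return (table count, column count, average columns per table)."""
    n_tables = len(catalog)
    n_columns = sum(len(cols) for cols in catalog.values())
    average = n_columns / n_tables if n_tables else 0.0
    return n_tables, n_columns, average


# Toy database standing in for a real estate of thousands of tables (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE orders "
    "(id INTEGER, customer_id INTEGER, total REAL, placed_at TEXT, status TEXT)")

catalog = sketch_catalog(conn)
n_tables, n_columns, avg_columns = summarize(catalog)
print(f"{n_tables} tables, {n_columns} columns, "
      f"{avg_columns:.1f} columns per table on average")
```

On a real estate the same idea would be pointed at each platform's own catalog views rather than SQLite's, which is where doing it by hand stops scaling.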
And you're saying you can automate that data classification, categorization, and discovery using math and machine intelligence. Is that correct? Yeah, that's correct. So basically we go and discover all of the schema, if you will. That's the metadata description of your tables and columns in your database, and the data types. We ingest that and we build some insights around it. And we do that across a variety of platforms, because in everybody's organization you've got an Oracle database here and a Microsoft SQL database there. You might have something else there that you need to bring into sight. And part of this journey is going to be about breaking down your data silos and understanding what you've got. Okay, so we've done the audit, we know what we've got. What's next? Where do we go next? So the next thing is remediating that data. Where do I have duplicate data? Oftentimes in an organization, data will get duplicated. Somebody will take a snapshot of the data and then end up building a new application, which suddenly becomes dependent on that data. So it's not uncommon for an organization to have 20 master instances of a customer. And you can see where that will go: trying to keep all that stuff in sync becomes a nightmare all by itself. So you want to understand where all your redundant data is. When you go to the cloud, maybe you have an opportunity here to consolidate that data. Yeah, to borrow an Einstein bromide, right? Keep as much data as you can, but no more. So, okay, at the second step you kind of want to reduce costs. Then what? You figure out what to get rid of, or actually get rid of it. What's next? Yes, that would be the next step. So figure out what you need and what you don't need.
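One simple way to picture the hunt for redundant data described above is to compare the values held in columns from different tables and flag pairs that overlap heavily. The sketch below is an illustration only: it uses plain Jaccard similarity on in-memory samples, the table and column names are invented, and the 0.8 threshold is an arbitrary assumption. It does not represent how Io Tahoe finds duplicates.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of distinct values that two columns have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_redundant_columns(columns, threshold=0.8):
    """columns maps (table, column) -> list of sampled values.

    Returns (key1, key2, score) for cross-table pairs whose values overlap
    at or above the threshold. These are candidates for a human to review,
    not proof of duplication.
    """
    hits = []
    for (k1, v1), (k2, v2) in combinations(sorted(columns.items()), 2):
        if k1[0] == k2[0]:
            continue  # only compare columns that live in different tables
        score = jaccard(v1, v2)
        if score >= threshold:
            hits.append((k1, k2, score))
    return hits


# Hypothetical samples: a CRM copy and a billing copy of customer emails.
columns = {
    ("crm.customers", "email"): ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    ("billing.accounts", "contact_email"): [
        "a@x.com", "b@x.com", "c@x.com", "d@x.com", "e@x.com"],
    ("billing.accounts", "plan"): ["free", "pro"],
}

candidates = find_redundant_columns(columns)
for left, right, score in candidates:
    print(left, "overlaps", right, f"at {score:.0%}")
```

Comparing every column against every other grows quadratically, so a real tool would need sampling or fingerprinting to keep this tractable across hundreds of databases.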
Oftentimes I've found that there are obsolete columns of data in your databases that you just don't need, or maybe you've got tables that have been superseded by other tables in your database. So you've got to understand what's being used and what's not. And then from that, you can decide: I'm going to leave this stuff behind, or I'm going to archive this stuff because I might need it for data retention, or I'm just going to delete it because I don't need it at all. Now, most organizations that have been around a while, the so-called incumbents, have got data all over the place. There are data marts, data warehouses, all kinds of different systems, and the data lives in silos. So how do you deal with that problem? Is that part of the journey? That's a great point, Dave, because you're right, the data silos happen because this business unit is chartered with this task and another business unit has that task. And that's how you get those instantiations of the same data occurring in multiple places. So as part of your cloud migration journey, you really want to plan where there's an opportunity to consolidate your data, because that means there'll be less to manage, less data to secure, and a smaller footprint, which means reduced costs. So, I mean, people always talk about a single version of the truth. Data quality is a huge issue. I've talked to data practitioners who have indicated that the quality metrics are in the single digits and they're trying to get to 90% plus. So maybe you can address data quality. Where does that fit in on the journey? That's a very important point. First of all, you don't want to bring your legacy issues with you, as in the point I made earlier.
If you've got data quality issues, this is a good time to find, identify, and remediate them. But that can be a laborious task. We've had customers that have tried to do this by hand, and it's very, very time consuming, because imagine you've got 200 tables, 50,000 columns. Imagine the manual labor involved in doing that. You could probably accomplish it, but it'll take a lot of work. So the opportunity to use tools here and automate that process will really help you find those outliers, that bad data, and correct it. Yeah, and you talk about that automation. I mean, it's the same thing with data cataloging, one of the earlier steps. Organizations would do this manually, or they'd try to do it manually, and that's a big reason for the failure. It's like cleaning out your attic. You just don't want to do it. So, okay, then what's next? I think we're plowing through your steps here. What's next on the journey? The next one is, in a nutshell, preserve your data format. Don't boil the ocean here, to use a cliche. You want to do a certain degree of lift and shift, because you've got application dependencies on that data and the data format: the tables in which they sit, the columns and the way they're named. So to some degree you are going to be doing a lift and shift, but it's an intelligent lift and shift, using all the insights you've gathered by cataloging the data, looking for data quality issues, looking for duplicate columns, and planning consolidation. You don't want to also rewrite your application. So in that respect, I think it's important to do a bit of lift and shift and preserve those data formats sort of as they sit. Okay, so let me follow up on that.
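As an illustration of what automating the search for outliers can look like, the sketch below profiles a column by reducing each value to its shape (digits become 9, letters become A) and flags values that do not match the dominant shape. This is a generic profiling heuristic chosen for illustration, not a description of Io Tahoe's product. The sample postcodes and the 0.6 threshold are assumptions.

```python
from collections import Counter


def shape(value):
    """Reduce a value to its format: digits -> 9, letters -> A, rest kept."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def flag_outliers(values, min_share=0.6):
    """Return (dominant shape, values that do not match it).

    If no single shape covers at least min_share of the column, nothing is
    flagged, because there is no clear format to compare against.
    """
    if not values:
        return None, []
    shapes = [shape(v) for v in values]
    dominant, count = Counter(shapes).most_common(1)[0]
    if count / len(values) < min_share:
        return dominant, []
    return dominant, [v for v, s in zip(values, shapes) if s != dominant]


# Hypothetical column of US ZIP codes with two bad entries.
postcodes = ["90210", "10001", "60614", "9021", "N/A", "30301"]
dominant_shape, suspects = flag_outliers(postcodes)
print("dominant format:", dominant_shape)
print("values to review:", suspects)
```

A check like this only surfaces candidates. Deciding whether a flagged value is truly bad data, and how to correct it, still needs someone who knows the business meaning of the column.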
It sounds really important to me, because if you're doing a conversion and you're rewriting applications, that generally means you're going to have to freeze the existing application, and then you're refueling the plane in mid-air. And a lot of times, especially with mission-critical systems, you're never going to bring those together. And that's a recipe for disaster, isn't it? Great analogy. Unless you're with the Air Force, you won't bring them together. No, that's correct. You want to have bite-sized steps, and that's why it's important to plan your journey and map these steps out, using automation where you can, to make that journey to the cloud much easier and more straightforward. All right, I like this. So we're taking kind of a systems view, an end-to-end view of the data pipeline, if you will. What's next? I think I've counted six. What's lucky seven? Lucky seven: involve your business users. Really, when you think about it, your data is in silos. Part of this migration to the cloud is an opportunity to break down these silos that naturally occur as part of the business units. You've got to break the cultural barriers that sometimes exist between business units. So, for example, I always advise that there's an opportunity here to consolidate your sensitive data, your PII, your personally identifiable information. And if three different business units each have their own source of truth for that, there's an opportunity to consolidate that into one as you migrate. That might mean a little bit of tweaking to some of the apps that are dependent on it. But in the long run, that's what you really want to do. You want to have a single source of truth. You want to ring-fence that sensitive data, and you want all your business users talking together so that you're not reinventing the wheel.
Well, the reason I think that's so important too is that, I would say, you're creating a data-driven culture. I know that's sort of a buzzword, but it's true, and what that means to me is that your users, your lines of business, feel like they actually own the data, rather than pointing fingers at the data group, the IT group, the data quality people, and the data engineers and saying, ah, I don't believe it. If the lines of business own the data, they're going to lean in, they're going to maybe bring their own data science resources to the table, and it's going to be a much more collaborative effort as opposed to a non-productive sort of argument. Yeah, and that's where we want to get to. DataOps is key, and maybe that's a term that's still evolving, but really you want the data to drive the business, because that's where your insights are, that's where your value is. You want to break down the silos not only between the business units, as I mentioned, but also, as you pointed out, between the roles of the people that are working with it. A self-service data culture is the right way to go, with the right security controls in place, of course, putting on my security hat. So if I'm a developer and I'm building a new application, I'd love to be able to go to the data catalog and go, oh, there's already a database that has what the customers have clicked on when shopping. I could use that. I don't have to rebuild that, I'll just use that for my application. Those are the kinds of problems you want to be able to solve, and that's where your cost reductions come in across the board. Yeah, I want to talk a little bit about the business context here. So, okay, we always talk about data as the new source of competitive advantage.
I think there's not a lot of debate about that, but it's hard. A lot of companies are struggling to get value out of their data because it's so difficult: all the things we've talked about, the silos, the data quality, et cetera. So you mentioned the term DataOps. DataOps is all about streamlining that data pipeline, infusing automation and machine intelligence into that pipeline, and then ultimately taking a systems view and compressing that time to insight so that you can drive monetization, whether it's cutting cost, maybe new revenue, or driving productivity. It's that end-to-end cycle time reduction that successful practitioners talk about as having the biggest business impact. Are you seeing that? Absolutely, but it is a journey, and it's a huge cultural change for some companies. I've worked in many companies that are ticket-based and IT-driven, where just to make even the most marginal change or get insight, you raise a ticket, wait a week, and then out the other end will pop, maybe, the change that you needed. And it'll take a while for us to get to a culture that truly has a self-service, data-driven nature, where I'm the business owner and I want to bring in a data scientist because, for example, the business might be losing to a competitor and wants to find insights: why is customer churn, for example, happening every Tuesday? What is it about Tuesday? This is where your data scientist comes in. The last thing you want is to raise a ticket and wait for the snapshot of the data. You want to enable that data scientist to come in, securely connect to the data, do the analysis, and come back and give you those insights, which will give you that competitive advantage. Well, I love your point about churn. Everybody talks about the Andreessen quote that software's eating the world, and all companies are software companies and SaaS companies, and churn is the killer of SaaS companies. So that's a very, very important point you're making.
My last question for you before we summarize is about the tech behind all of this. What makes Io Tahoe unique in its ability to help automate that data pipeline? Well, we've done a lot of research. We have, I think, maybe 11 pending patent applications now. I think one has been approved to be issued. But really it's about sitting down, doing the right kind of analysis, and figuring out how we can optimize this journey. Some of this stuff isn't rocket science. You can read a schema into an open source solution, but you can't necessarily find the hidden insights. So if I want to find my foreign key dependencies, which aren't always declared in the database, or I want to identify columns by their content because the columns might be labeled attribute one, attribute two, attribute three, or I want to find out how my data flows between the various tables in my database, that's the point at which you need to bring in automation. You need to bring in data science solutions. And there's even a degree of machine learning because, for example, we might deduce that data is flowing from this table to this table and present that to the user with an 87% confidence, for example. And the user or the administrator can go, no, it really goes the other way. It was an invalid conclusion, and that's the machine learning cycle. So the next time we see that pattern again in that environment, we'll be able to make a better recommendation, because some things aren't black and white. They need that human intervention loop. I just want to summarize with the Lester Waters playbook for moving to the cloud, and I'll go through the steps. I took some notes, hopefully I got them right. Step one, you want to do that data discovery audit. You want it to be fact-based. Two is you want to remediate that data redundancy. And then three, identify what you can get rid of. Oftentimes we don't get rid of stuff in IT. Or maybe archive it to cheaper media.
Four is consolidate those data silos, which is critical, sort of breaking down those data barriers. And then five is attack the quality issues before you do the migration. Six, which I thought was really intriguing, was preserve that data format. You don't want to rewrite applications and do that conversion. It's okay to do a little bit of lifting and shifting. After the fact. Yeah, and then finally, and probably the most important, is you've got to have that relationship with the lines of business, your users. Get them involved, begin that cultural shift. So I think that's a great recipe, Lester, for safe cloud migration. I really appreciate your time, but I'll give you the final word if you'll bring us home. All right, well, I think the journey to the cloud is a tough one. You will save money. I have heard people say, you go to the cloud and it's too expensive, it's too this, too that. But really, there is an opportunity for savings. I'll tell you, when I run data services as a SaaS service in the cloud, sorry, a PaaS service in the cloud, it's wonderful, because I can scale up and scale down almost by virtually turning a knob. And so I have complete control and visibility of my costs there. For me, that's very important. It also gives me the opportunity to really ring-fence my sensitive data because, let's face it, most organizations are kind of like a cheese grater when you talk about security, because there are so many ways in and out. So I find that by consolidating and bringing together the crown jewels, if you will, as a security practitioner, it's much easier to control. But it's very important: you can't get there without some automation, without automating this discovery and analysis process. Well, great advice, Lester, thanks so much. I mean, it's clear that CapEx investments in data centers are generally not a good investment for most companies. Really appreciate it. Lester Waters, CTO of Io Tahoe.
Let's watch this short video and we'll come right back. You're watching theCUBE. Thank you.
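Earlier in the conversation, Lester describes deducing undeclared foreign key relationships, presenting them with a confidence figure, and learning when an administrator says a suggestion is wrong. The sketch below illustrates that loop in its simplest form, and every part of it is an assumption made for illustration: confidence is just the share of child values found in a candidate key column, the feedback step merely remembers rejected pairs, and the class name ForeignKeySuggester and the sample data are hypothetical. Io Tahoe's actual techniques are not described in the interview.

```python
def containment(child_values, parent_values):
    """Share of non-null child values that also appear in the parent column."""
    child = [v for v in child_values if v is not None]
    if not child:
        return 0.0
    parent = set(parent_values)
    return sum(1 for v in child if v in parent) / len(child)


class ForeignKeySuggester:
    """Suggests child -> parent links and remembers what a human rejected."""

    def __init__(self, min_confidence=0.8):
        self.min_confidence = min_confidence
        self.rejected = set()  # (child, parent) pairs a reviewer ruled out

    def suggest(self, columns):
        """columns maps 'table.column' -> list of values."""
        suggestions = []
        for child, child_vals in columns.items():
            for parent, parent_vals in columns.items():
                if child == parent or (child, parent) in self.rejected:
                    continue
                # A parent must look like a key: no repeated values.
                if len(set(parent_vals)) != len(parent_vals):
                    continue
                confidence = containment(child_vals, parent_vals)
                if confidence >= self.min_confidence:
                    suggestions.append((child, parent, confidence))
        return sorted(suggestions, key=lambda s: -s[2])

    def reject(self, child, parent):
        """Record human feedback so this pair is not suggested again."""
        self.rejected.add((child, parent))


# Hypothetical data: one order points at a customer id that does not exist.
columns = {
    "orders.customer_id": [1, 2, 2, 3, 3, 3, 4, 9],
    "customers.id": [1, 2, 3, 4, 5],
}

suggester = ForeignKeySuggester()
first_pass = suggester.suggest(columns)
for child, parent, confidence in first_pass:
    print(f"{child} -> {parent} (confidence {confidence:.3f})")

# The reviewer says the suggestion is wrong, and the next pass respects that.
suggester.reject("orders.customer_id", "customers.id")
second_pass = suggester.suggest(columns)
```

Remembering a rejection is the crudest possible feedback loop. The machine learning cycle Lester mentions would presumably generalize from such corrections to similar patterns, which this sketch does not attempt.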