And here we go. Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of DATAVERSITY. We'd like to thank you for joining the latest installment of the monthly DATAVERSITY webinar series, Advanced Analytics with William McKnight. Today, William will be discussing the data needed to evolve an enterprise artificial intelligence strategy. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #ADVAnalytics. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom middle of your screen for that feature. And if you'd like to continue the conversation after the webinar, you can follow William and each other at community.dataversity.net. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me introduce to you our speaker for the series, William McKnight. William is the president of the McKnight Consulting Group. He takes corporate information and turns it into a bottom-line-producing asset. He's worked with major companies worldwide, 15 of the Global 2000, and many others. The McKnight Consulting Group focuses on delivering business value and solving business problems utilizing proven, streamlined approaches in information management. His teams have won several best practice competitions for their implementations. He's been helping companies adopt big data solutions. And with that, I will give the floor to William to get today's webinar started. Hello and welcome.
Hello and welcome, everyone. Thank you, Shannon, for that.
And boy, has the world changed since the last webinar, hasn't it? I hope you all are staying safe and sound, not going too crazy during these times, and making the most of it. You know, these webinars are part of that, and I'm glad to be a small part of that. I know I've been indulging in much more of that type of activity since my travel is limited these days. But I really do hope you're well, and I thank you for joining here today. My topic is the data needed to evolve an enterprise artificial intelligence strategy. So I'm talking about a major trend out there, which is artificial intelligence, and how we as analytic and data professionals support that strategy in a big way. We're a big part of it, or we should be. And so let me try to make that case for us as we get into the topic today. I will also add, given the current state of the world, that it seems to me like business intelligence professionals and analytic professionals are really as busy as ever. So you're not alone if you're feeling like you're still busy. Busy is good, I'd say. And there's more to come. We are fairly essential, and I'm glad that that is recognized at the least. Companies are evolving from reestablishing their core business, which is what many are doing right now, making sure that it's sound, to getting back to normal, which needs some external factors to happen, and then to thriving. And needless to say, thriving will involve artificial intelligence. Some things have been pushed out, but I think it's all still true.
So it's five o'clock somewhere. Let's start with that. Start with whiskey. Actually, it's not five o'clock somewhere. I actually did my research and found that at this point in time, it is not 5 PM anywhere in the world. But as the saying goes, it's five o'clock. I am trying to impress upon you some of the more, you know, pedestrian types of influences of artificial intelligence in our lives.
I don't know that much about whiskey, but apparently there are hundreds of variables that go into the making of a barrel of whiskey. And now we have artificial intelligence that can suggest which recipes should be made next. It can generate more than 70 million different recipes, and it will highlight those that it predicts will be most popular. Of course, it's been fed the data of what is selling and it's been fed the data of what goes into those whiskeys, and it can make new whiskeys for these distilleries. So it's moving into that area. What else is it moving into? Well, how about music? Let's play a few clips of this. This is called Daddy's Car. And it's nothing I'm going to be playing, I can tell you that. But it was made by artificial intelligence. And that's enough of that. Okay, it was made by artificial intelligence, a system called Flow Machines that analyzed a database of songs and created similar compositions to those that apparently have some wide appeal. And there are many songs that are done this way. This is actually not the most recent vintage of a song that has been done with artificial intelligence.
What else? Well, you see that painting in the upper right-hand corner. That is called Edmond de Belamy. I don't think it's a real person, but it looks like a real person from back in the day. Of course, it was a 2018 painting created using a type of artificial intelligence algorithm called a generative adversarial network. And it was sold at Christie's auction house in New York for $432,500. There you go. So there's money in artificial intelligence in painting.
And now let's move to another side of artificial intelligence: deep fakes. Let's hear this deep fake. Scoot it along a little bit. No, I won't keep it here. Okay. Does that sound pretty definitive? That is, of course, a deep fake that I'm showing you here. And the other thing on here is this robot. She's called Sophia. That's her name.
She's actually a citizen of Saudi Arabia, and she's popular for saying, I have feelings too. And we could probably wax profound on that for a few minutes, but I'll leave it at that. And then this artificial intelligence issue from MIT Technology Review, which I thought was really good. It shows just here on the cover all the points on the face that go into facial recognition. And it's just getting to be more and more of them.
Well, the point of this is that the mass of humanity is about 10 years behind the possibilities. Let me say that again. The mass of humanity is about 10 years behind the possibilities. And I'm afraid we're entering a window here of about the next 10 years, probably starting about now, where they're just not going to believe the things that are happening around them, the possibilities around artificial intelligence. They don't believe cars can drive themselves, they don't believe the depth of manipulation, and they'll believe what they see even when it's fake. So there's a profound opportunity here, and let's all be watchful for this. I shouldn't even call it an opportunity. It's rather devious. But let's be watching out for this. Artificial intelligence has that side to it as well that we have to be mindful of.
But if you're thinking, well, how does it apply to the enterprise? That's really what we're here to talk about. Let's get into that. But one more thing before we get into that. If you're thinking, well, there is something, or two or three things, that we can still do better than robots: yes, of course there are, there are many, many, but reading is not one of them. This algorithm developed by Chinese retail giant Alibaba outscored humans on the Stanford Question Answering Dataset. Now, I've looked into this data set and could talk about it for a while, but suffice it to say there are so many different dimensions of comprehension, and artificial intelligence is doing an increasingly better job than humans on all of those dimensions.
So just reading, absorbing information, things like that, processing information: artificial intelligence is really stepping up, and stepping up in our enterprises as well. Here are a few examples, and I'm going to have a few as we go along here, so I want to try to hit it home for you. I don't know that I will touch on one that makes sense in your enterprise, but hopefully you can extrapolate from what I'm talking about here today.
So, financial fraud. AI in the enterprise is improving the financial fraud detection possibilities and reducing costly false positives, just getting it better, just dialing in what is fraud. It's obviously had so much history data now that it's accumulating, that it's analyzing, that it's understanding. After the fact, we're informing these algorithms, saying that was fraud, that wasn't fraud. It's incorporating all that information, getting down to such a degree of nuance with stuff like this that humans will just never be able to do it. So, for a lot of these things I'm talking about, sure, you're doing it now, right? If you're a financial institution, you're doing financial fraud detection. You're probably doing it with AI, but if you're not, you could be doing it better with AI. As a matter of fact, I suggest that in the future you almost must do it with AI to stay at parity, as is true with a lot of other things.
Chatbots: unfortunately, there's a flood of them, and we know they have some room for improvement, but sometime around this year people will manage 85% of business relationships without human interaction. So, we're seeing that already. Car navigation, which obviously drives, you know, self-driving cars and so forth. Reducing the cost of handling misplaced items. Automating anything and everything that can be automated. Predicting flight delays, and so on. This represents profound change in the enterprise.
It requires a commensurate strategic focus and urgency, it should disrupt your current thinking process, and it can produce high-impact enterprise outcomes.
Now, there are a few more. You might be doing some city management. Well, the city of Melbourne in Australia, among others, has put artificial intelligence to use in helping it plan its city layout. City layout has to do with streets, with materials, parking locations, obviously tree locations, building locations and materials and distance from the sidewalk, and all sorts of business campuses, school campuses and so on. Tree planting is interesting. I was looking into this. Of course, it's nice for cooling the area, but it can also make some people feel unsafe by giving hiding places. So, a balance has to be struck with all of that. Surface materials for roads and pavements have a lot to do with being able to withstand the pressure of the cars coming through for a long period of time, so you don't have to get back to it, but they can also draw in a lot of heat and spread that heat, which is a big problem. This has led to a lot of so-called green roofs and green walls, and we're getting better about that, and they're effective. So, the density of that, the color of that, is all informed by artificial intelligence.
Video management. Feature coding is really kind of the word for the ability to do feature extraction and the compression process on that feature extraction. So, it can really understand what's going on as it's looking at a video in real time. This is a real key to self-driving. Those cameras on the front and the sides and the back of a self-driving car have to be able to process that information, coordinate that information and so on. This has to do with anything that has a movement pattern to it: the supply chain, bringing it home here today, the movement of caregivers in nursing homes, the movement of doctors within healthcare facilities.
We want to make that more efficient if they're running around getting supplies and so on. So, if we're watching this, we can make suggestions if we have the artificial intelligence turned on it.
And finally, I'll bring up satellite imagery. Again, it's imagery in real time, this one coming from above, coming from satellites, which very few of us actually own, so it's third-party data. Tracking, for example, delivery trucks to a retail facility. One of the goals of satellite imagery analysis is to optimize the buildup of delivery trucks, as well as anything that has buildup to it. So, the buildup of war material, for example, in unstable regions. And also, around COVID-19, the buildup of clusters of people that may be putting them at risk, and trying to optimize travel routes for more distancing, which is going to be necessary, and so on. If you're interested in more about this, look into the Cars Overhead With Context (COWC) data set. It's very useful for training something such as a deep neural network to learn to detect and/or count cars from above. So, if you're into that, really cool stuff.
We've been able to help a lot of our clients step into this world. And these are some of the business use case examples we've been a part of. Again, you look at these, and now I think we're bringing it home to many of you, you look at these and you go, well, you know, we're doing that already. Sure, you're doing supply flow, you're doing customer flow, you're trying to optimize that, for example, in retail and manufacturing. My point here is that these are done best with AI. One thing maybe you've heard me say, and I like to say it a lot now: AI is better than BI. BI involves the human analyzing information and, hopefully, doing the right thing. But AI is taking that to yet another level. So, some of these examples I'm going to bring up a little bit later: marketing, of course, cybersecurity, smart cities. That's a big one.
Retail and manufacturing. Oil and gas: finding, you know, where to drill, where to utilize assets and so on. Life sciences: studying the human genome, hundreds of megabytes per person, for improving health, which obviously we're all about right now.
What's new is deep learning. We've had some semblance of artificial intelligence around since the 1950s. Yes, we have. And it's moved along into machine learning and now deep learning. I know I say 2010s here, and it's really kicking in now. And beyond this, there's another dimension to this. There's the data that it is able to process. It's not just supervised learning, which works on data that's labeled, because you have a lot of data that you've been able to put a label on either before the fact or after. A simple example would be: here's a bunch of pictures of dogs and cats. These are the dogs and these are the cats. Now, here's a new picture. Which one is it? AI is great with stuff like that. But what's also new is unlabeled data sets, or unsupervised learning, where the challenge is to discover implicit relationships in an unlabeled data set. So, without those labels, go for it. And it starts to look at different characteristics and segment what it's seeing. Really cool stuff.
AI affects the entire organization strategically. I'm going to come back to this. These are the different dimensions that I think each organization has to be addressing now for AI strategically. Where are we going with this? I'm going to give you some ideas: technical effects, operational effects, the talent that you acquire. You may or may not be able to acquire talent deep in AI right now, but how about acquiring people with the ability to go there? And the data, which is, of course, important to us as data professionals and analytic professionals. So where do you look for these opportunities? Look at the products that you make and the services you offer.
Can there be any AI embedded in what you're making and the services that you provide? How about the supply chain for those products and services? Can that be optimized? I talked a little bit ago about, you know, satellite imagery, watching your supply chain physically from above. What about things like that? Business operations: your hiring process, your procurement process, after-sales service, et cetera. The intelligence in designing your product and service. How about the intelligence in the marketing and the approval for what it is that you do? All of these are ripe with opportunities for AI.
For a lot of you, it's a matter of, how do we get started? Where do we go first? This is from my maturity model, my analytic maturity model, which I will present next month at the Advanced Analytics webinar series. So come back next month. I'll be talking all about this. This is just level three out of five. My point is not to go through everything on here, but to show you that right now, at that midpoint in the maturity model, I have "all in on AI" under data strategy. Now, this is a slide. This thing gets broken down in many different ways, but this is what fits on a slide. And what do I mean by that? I mean a lot of what I'm talking about here today: getting the data straight, getting your process straight, getting all the skills that you need in-house. I'm going to be going over a lot of that here today. Now, lest you fret as you look around my slide: most companies are going to be below this level. That's just the way it is with maturity models. It's the few that will reap the majority of the rewards. And those few are maybe here, maybe at three right now on the journey, maybe even at four or five. Okay, good on that. The door may still be open, or I should say the door is still open, for others to move in here.
Maybe you're a two, but you've got to be pushing hard right now to get to this. I'd give you maybe a year, maybe nine months, to get to what I'm showing you right here in order to be a survivor in five-plus years for many industries. So that's the broader picture, that's the pace. We've got to do it.
So we've got to get our data under management. Now, you'll hear me use these words a lot: under management. This is what I mean. It's in a leverageable platform. It's in an appropriate platform, appropriate for the data and the usage of that data. You're not force-fitting data into the wrong platform for it. You will pay that price if you do that. It's used effectively by multiple business groups. You're not building things as throwaways, as only for this and that's it and nobody else will ever see this data. That's not a very efficient way to go about things in an enterprise. Somebody needs to step in on that and provide some leadership. High non-functional requirements, that's your NFRs: availability, performance, scalability, stability, durability and security. Yes, you're high on all of those things. That's data under management. You're capturing at a granular level, because you're going to need it. I'm coming back to that. Data is at a defined data quality standard. Not arbitrary, not bits and pieces, maybe here and there, hit or miss, but at a standard. You know where it is. It's not perfect, never perfect, but at least it's at a data quality standard, you know where it is, and it's high. It's usable, and people can trust it, and they know what the quality is. This is what I mean by data under management. So as I go forward into some of the data you're going to need to collect, it's going to need to be put under management. This is what I mean.
So that's the key slide. Data to collect for artificial intelligence: it's wide-ranging, spanning all your current data.
I don't like to say things like, well, get all your data under management, but I'm starting to hear myself say it, because there's use for all of this data now in artificial intelligence. If you don't see it, the thing to do is maybe not to force-fit it into, you know, some sort of architecture, but to figure out: why don't we need this data? What is wrong with our maturity that we can't see a use for this data? We're not at that level of granularity yet in our business. Why not? Why can't we get there?
So it's e-commerce data, ERP and CRM data, which largely we're doing a pretty good job of. But then there's this IoT data. Sometimes that's more transient, but there's a lot of data there, and some of it's going to be collected at the edge, some of it at a central place. Publicly available data, like governmental data. And then third-party, kind of contractually based data. Context is key here with all of this data. You have to understand the data that you have, where it comes from, and your abilities with that data. So, for example, if you're a data scientist and you're trying to understand the meaning of an article based on a headline, you are excluding yourself from Twitter data, because Twitter data does not have headlines. It's just 140 characters or whatever. You also have to be able to screen out data that doesn't make sense. You can't just grab it and use it. There is, you know, news that is fudged out there for different reasons. You've got to be careful about that.
And we also have other goals with the data than just to collect it and use it in AI and go headlong towards the goal by, you know, excluding everything else, like ethics. Ethics is becoming increasingly important. There is a movement, a trend, towards being able to explain our AI and making sure that that process is not fraught with ethical problems.
Like, for example, you may recall Microsoft had a project called Tay, which went from an innocent chatbot to a crazed racist in a day by exposing itself to a lot of social data out there. It must not be the social data I'm looking at, but it found it. And that was not the intention.
So here are some more data examples. You want to get your call center recordings and your chat logs for content and data relationships, as well as the ability to answer some questions that are going to come from that data set. Sensor data. Customer data, where you're going to get similarities in buyers and the ability to predict responses to offers. Email data, which helps you to surface buyer segments. Product catalogs and other data, a great source of attributes and attribute values. And then public data, YouTube data, whatever: user website behaviors, sentiment analysis and so on. All of this type of data, and increasingly the data you're going to need is actually coming from outside of the four walls of your organization. I have clients that have more third-party data than they do internally generated data.
Let's get into some specific examples. Now, I've picked off some of the things I've already mentioned in this presentation to try to fit it onto a nice slide. Some of the main ones are here. We've got fraud detection, call center chatbots, self-driving, some of the things I've talked about. And on the right here, we have enterprise data domains. Hopefully that's a concept that you relate to within your enterprise. Maybe you have some data stewardship assigned, data governance assigned, by these data domains, also known as subject areas: customer, employee, partner, patient and so on. This is by no means exhaustive. Every client is unique in terms of what their real data domains are. And I'm constantly surprised at the depth that we can actually get to with data domains at a client once you get to know them.
But anyway, wouldn't it be great, wouldn't it be something, if you stepped into these projects, and these are AI projects. Let me step back. You might call them AI projects, or you might call them projects that happen to use AI, okay, whatever. That has something to do with where they might be placed within the organization. But I've tried to say already that these types of projects within your enterprise are going to be best done with AI, not just BI, but with AI. And robust data is going to be at least half of the success of that project, probably more, maybe a lot more.
I'm going to go through some examples. I'm going to take each one of these and go through them really quickly here. You can see some of the subject areas that are interesting for that function. It's not every one of them, and I believe that I have gone very light on the connections. The more the merrier in terms of these data domains, if they are under management. Remember that these data domains have to be under management. If they are under management, so much of the work is already done to do, for example, in this case, fraud detection, which needs customer data, of course, store data, maybe, and, I suggest most of the time, your contract data to see if the contracts are being adhered to. Financial data. Product data. And you could probably look at this and say, well, what about that? What about that? Yeah. The more the merrier, like I say, but as a foundation, if you had those five subject areas or domains under management, you'd be so far along in your ability to do fraud detection. And again, I've gone light on these connections. I will suggest at this point that one great way to master these data domains is with master data management. That's not the point of my talk today, but I do believe that that is a discipline that is going to be great for artificial intelligence success.
However, what I've experienced in most enterprises is that those are two completely different worlds. You've got your grizzled MDM veterans from the data governance world and the data management world. You've got data scientists and the programmer types in the other world who really like to run off and do things on their own. And those organizations that can bridge that gap and bring those worlds together are going to go fast and far. So I encourage you to look at that for your own organization and consider that that might be a source of why you're a little bit stuck in the mud: you've got some different organizations with different perspectives on things. So we need leadership over the top of that. Everybody can be themselves, but we've got to come together.
The other thing that I will point out on this slide, as an example, is that all these data domains that are associated to this thing we want to do, which is fraud detection, let's say, start to form your data roadmap. Your data roadmap is what data you're going to get under management. Well, eventually you'll get it all. I think that should be your goal. But in what priority? Obviously you can't do it all at once. As we go through the different things that you might want to do as an organization, that speaks volumes in terms of which data domains to get under management initially. And again, I'll suggest that for a lot of these, the thing to do to get it under management is master data management, but data warehousing and data lakes, that's all good as well.
Now, I'll pick it up a little bit as I go through some other examples. The call center chatbot: to do that well, it needs a customer profile. It needs account information. It needs contract information that it can pull up and talk about, back and forth.
Your product information, so it can make suggestions, and other assets that it might be able to offer, like, oh, here's a webinar, here's a link, here's a list that you might be interested in. All these things to talk to the customer.
Self-driving or transportation. Yeah, I don't have a lot here, but look at that geography dimension. Think about that for self-driving. Wow. That's not your grandfather's geography dimension. That has road signs and all sorts of different things that it might be seeing out there on the road, which are way too numerous to mention here today, but it has to know what to avoid. Mainly, it has to read road signs and be able to interpret them. And so that's a whole new set of geography dimensions, and there are third parties that are collecting this information for anybody that may be in that business. If you are, good for you, that sounds like a lot of fun. But these geography dimensions that we're seeing now are not just, well, we pulled it down from the US Postal Service website. It's going to be much more than that.
Okay, predicting flight delays. You've got to know your policies. You've got to know weather. That's third-party data there again; I'm not sure you're in the weather business. Asset information, media information. Assets there, I'm kind of saying, that is your airplanes. You might also argue, well, you know, the pilot's style or pattern may be interesting here as well. And whatever media the flight delay may need to go into. Again, it could be more, and probably is more, but the more data domains relevant to the thing you're trying to do that you have under management, the faster you're going to be able to actually do that thing with artificial intelligence. So you've got the data people building these enterprise data domains for all this consumption, and then you're left to do the algorithms.
So that's a whole lot better than what I'm seeing a lot out there today, which is that your data is not under management, data is not in a leverageable platform, et cetera, et cetera. And so that's what you've got to do, and that gets to be the hump of work in these artificial intelligence projects, which should not be the case.
Marketing. Now we're into another area. Again, take a look at all of these. Do what I'm showing you here for your upcoming initiatives. Obviously do it at a greater level of detail, and figure out what's important across the board. What I usually find, and you probably will too, is that customer, product, things like that are going to be very interesting across the board, and you really had better master them. Number one, if you do nothing else, master your customer and your product, and then you'll enjoy it so much you'll get into the rest. But anyway, marketing: you need customer information, you need product information, equipment, media.
Smart cities. Let's add some different things. You need your facilities, your policies, assets, equipment, media, geography, citizen. Now I'm using that one. What else? Retail: customer, employee, partner, supplier, product, bill of materials, store, account, contracts. All this for your supply chain flow. Oil and gas exploration, if you're into that. Well, I'm not going to read them all. There's a lot that goes into that. This is again that super geography dimension that I talked about.
Now, the heading on my last eight slides was the same. That's sort of subliminal, but I wanted you to be impressed that data is integral to artificial intelligence success. So I didn't want to leave this slide without pointing that out. Okay. So we've talked about the data. We've talked about some initiatives that are artificial intelligence based, where to get the data, why we need it, and the level that we need to bring it to, the so-called under management level.
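The roadmap idea from these slides, deciding which data domains to get under management first by seeing how many initiatives lean on each one, can be sketched in a few lines of Python. This is only an illustrative sketch: the initiative-to-domain mapping below is read off the talk as best the transcript allows and is deliberately light, the names `INITIATIVE_DOMAINS` and `build_roadmap` are hypothetical, and a simple count is an assumed stand-in for whatever prioritization an organization would actually use.

```python
from collections import Counter

# Assumption: initiative -> data domains, as mentioned in the talk.
# The speaker notes these connections are "light", so treat this as incomplete.
INITIATIVE_DOMAINS = {
    "fraud_detection": ["customer", "store", "contract", "financial", "product"],
    "call_center_chatbot": ["customer", "account", "contract", "product", "asset"],
    "self_driving": ["geography"],
    "flight_delay_prediction": ["policy", "weather", "asset", "media"],
    "marketing": ["customer", "product", "equipment", "media"],
    "smart_cities": ["facility", "policy", "asset", "equipment", "media",
                     "geography", "citizen"],
    "retail_supply_chain": ["customer", "employee", "partner", "supplier", "product",
                            "bill_of_materials", "store", "account", "contract"],
}


def build_roadmap(initiative_domains):
    """Rank data domains by how many initiatives depend on them."""
    counts = Counter(
        domain
        for domains in initiative_domains.values()
        for domain in set(domains)  # count each domain once per initiative
    )
    # Most widely needed domains first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


roadmap = build_roadmap(INITIATIVE_DOMAINS)
for domain, initiative_count in roadmap:
    print(f"{domain:<18} needed by {initiative_count} initiative(s)")
```

With this particular mapping, customer and product come out on top, which lines up with the advice to master customer and product first. A real roadmap would presumably also weigh business value and how hard each domain is to bring under management.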
So where do you put it all? Where do you put all this data for machine learning, artificial intelligence, et cetera? I get asked this question a lot, because many enterprises are starting to really make this journey and push on that gas pedal. And they're looking around at their current data infrastructure and wondering if it's going to be sufficient. Well, my best answer right now, and I don't know if I like it completely, is: put it in a great architecture. And that means you have the data in the right platform to succeed, for the data characteristics as well as how that data is going to be used.
Today, what does that really mean? Cloud storage, or a data lake. Those terms are used kind of interchangeably. But I'd say that a lot of the data that's going to be useful in artificial intelligence is going to be in your data lake. So the data lake and AI go hand in hand. Now, this does not mean that it originated there. It might have originated in master data management and flowed into the data lake. As a matter of fact, that's what I do in my reference architecture. But one way or another, all this data, well, I shouldn't say all, a lot of it, needs to get into the data lake, which is completely appropriate for the majority of unstructured data, for high volumes of data and so on.
A database management system. Yeah, they're not going away, by the way. That's the foundation for your data warehouse and most of your operational data, which is still going to play a part in this artificial intelligence world. HDFS still plays a role here. Many of you already have data in there. You're keeping it there, and that's fine. It's optimized for sequential reads and writes, so it has its good points for sure. Many have said, I don't need some of that structure that HDFS provides at that little bit of extra cost. Certainly, it's more geared to on premises, and I don't want that. So I'm moving into cloud storage.
And that's fine as well. Unstructured data stores: now, here I'm talking about things like Amazon CloudSearch, Elasticsearch, Solr, Splunk. These sorts of things are appropriate for completely unstructured data. Yes, you can put your unstructured data in the data lake, but if you want a little more functionality and a little more management around that, you might look into that product set, because it is completely appropriate in some enterprises for that type of data. And finally, text-based serializations of RDF, like Turtle, N-Triples, RDF/XML, or the latest W3C recommendation, which is JSON-LD. These are very handy since both humans and machines alike can read them. Yes, we need to find a way for those sources to be in our architecture. So one of the biggest issues that comes up with me isn't so much that there isn't enough data, but that the data is locked away and hard to access. It wasn't fit for purpose for the platform that it was put into. And according to a survey released in January, 49% of IT decision makers say they can't deploy the AI they want because their data isn't ready. That's half. Can't move on AI because the data isn't ready. Data is integral to AI success. So again, where are we going to put that data? I'm still on that question, still addressing that with a bit of a reference architecture here for you. Some of the things that are new-ish here are that you have a difference between low-latency data and batch data. Batch data, we're pretty good at that. We ETL that or we ELT that, as the case may be, into, I'm suggesting, cloud storage as your staging area. I use S3 as an example. Of course, Google and Azure have their own versions of that, which are great as well. Okay. And one thing that we like to do to that S3 data is to put it in Parquet format, especially if we're going to be accessing that data a lot, because that seems to lend itself to analytics much better than native cloud storage.
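One way to see why a columnar format such as Parquet tends to suit analytics is to compare a row layout with a column layout for the same records. The sketch below is a simplified illustration in plain Python, not Parquet itself: real Parquet files add compression, encodings, and metadata, and would be written with a dedicated library. The sample records and function names are assumptions made for the example.

```python
# Row-oriented layout: one record per event, as it might land in staging.
# The sample orders are made up for illustration.
rows = [
    {"order_id": 1, "store": "north", "amount": 20.0},
    {"order_id": 2, "store": "south", "amount": 35.5},
    {"order_id": 3, "store": "north", "amount": 12.25},
]


def to_columns(records):
    """Pivot row records into one list per column (a columnar layout)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_from_rows(records, field):
    """Aggregate by walking every record, even though one field is needed."""
    return sum(record[field] for record in records)


def total_from_columns(columns, field):
    """Aggregate by reading only the single column the query needs."""
    return sum(columns[field])


columns = to_columns(rows)

if __name__ == "__main__":
    # Both layouts give the same answer; the columnar one touches less data.
    print(total_from_rows(rows, "amount"), total_from_columns(columns, "amount"))
```

An analytic query usually touches a few columns across many rows, so a layout that keeps each column together lets the engine skip the rest. That is the general idea behind the preference stated above, shown here under simplified assumptions.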
But anyway, back up to the upper left, we have the batch data, yes, but we have the low-latency data as well, and that's increasingly becoming important. I use Kafka here, but it could be any sort of streaming product. It could be RabbitMQ or one we like a lot, which is Pulsar, for which, for example, the commercial version is Streamlio. So anyway, something to parse topics and put those processed topics into stream processing, either from the tool or with Spark, and get it into cloud storage. So cloud storage is the first port of call here for the analytic environment for this data that I'm speaking about. Now, I did talk a lot about master data management. That would be one of the things that you're ETL-ing or ELT-ing in. I don't know if that makes sense. MDM will be one of the batch sources of that data, and that will be brought into the cloud storage. Now you have your data warehouse, and most of us have one or more of these data warehouses. That is great. So much is happening there. They're not going away. But they are being, may I say, married with the cloud storage. So I don't know if we want to start calling it a data warelake or a data lakehouse. Lakehouse might flow a little better, but I'm trying to figure out what I want to call this. But it's becoming one continuous set of data here that you can access through the data warehouse. So when you need access to data from the data warehouse, it has that reach-through capability into the data lake. And furthermore, in the data lake, a lot of the data science, I shouldn't say all, but a lot of the AI that we're talking about here today is going to happen there. And that data is going to get pushed into the data warehouse. The subset of data in cloud storage gets moved by ETL, ELT, stream processing, or Spark into the data warehouse. And of course, we have all sorts of things going on at the data warehouse.
But if you're doing everything in the BI layer, in the upper right-hand corner of this slide, if you're doing everything there to fix your problem, you are not fixing it in the right place. The right place to fix it is right here in the middle of what we're looking at: the cloud storage and the data warehouse, making those great assets, making sure they have data under management in them, ready to go and be married with all sorts of transactional data that is required. This is an AI pattern for an organization. Hire and grow data science. You've got to have a little bit of that to get started with, at least. You don't want to hit a wall in this process. Uncouple AI from organizational constraints while conforming the organization. And that's big. I've seen other studies, and certainly my experience bears this out, that many organizations are stumbling when it comes to the people side of AI and the convincing side of this. And unfortunately, or fortunately, however you want to look at it, it's very much required today. So we need to find ways to move the organization along with the changes. There is ideation and compiling your data, which is two words, but I just went over that, and that's a lot of work. Internal data and external data. Hopefully I'm impressing upon you the importance of external data. Labeling the data, if you can, building the model, prototyping it, iterating and productionalizing it, putting all those NFRs in place, maybe doing MLOps, and scaling your operation so it gets to the entire enterprise. So start small, think big, and scale out your AI. Now I wanted to point out here that data is very important in this, obviously. But there's more to it. You can't just build the data and be done with your AI. Of course, you've got to apply these algorithms, and there are so many. I've just put in some of my favorites here.
These are some of the basics that you'll want to understand in order to understand AI, and the things that your users, if you're a data professional, your data scientists and whatnot, are going to be interested in. So let's do it. Naive Bayes classification, which is a simple probability given past data. Decision trees, which create decision points around when to split and do different operations in the business. And regression, which I think we're all familiar with, where we're looking at the past to predict the future. The corporate requirements are more than data. Hopefully you know now that data is very important in the process, it's integral to AI success, but there are requirements beyond data. There are math requirements. Bring that back into our organization. It's important. GPUs, Python, TensorFlow, R, MATLAB, Java, and Scala, or whatever your tools of choice are in that area, but they're probably new, at least in the past two years, to your organization. Be aware of that. There are going to be some new uses of data as you begin to build out the data that is needed to evolve an enterprise artificial intelligence strategy. And that brings me to the end of the formal part of my presentation, but I will take your Q&A. I hope you have some Q&A. I failed to mention it, but if you do have some questions, go ahead and post them, and Shannon will let me know about that. William, thank you so much for another fantastic presentation. We do have some questions coming in. And if you have questions for William, feel free to submit them in the bottom right-hand corner of the screen in the Q&A section. And just to answer the most commonly asked questions, just a reminder, I will send a follow-up email for this webinar by end of day Monday with links to the slides and links to the recording and anything else requested throughout. So diving in here, William, where and how does ethics play into AI? Well, that's a big question.
I touched on it a little bit in the presentation, but it comes into play if you completely gear your AI operation towards achieving a business outcome at all costs and forget to put in some guardrails around that to make sure that, for example, you don't have bias in the data that you don't want, or that is illegal, or that is unethical in your view. And there are increasingly third parties that are coming up with ethical standards for the AI industry as it moves forward. There are things like CCPA, GDPR, and so on that are restricting the use of information and, furthermore, starting to force us to surface how we came up with that information. It is yet to be determined in the courts whether we can say, well, my Naive Bayes algorithm came up with it, and whether that's going to be sufficient or not. That will be something that's very much a bellwether for AI ethics as we move forward. So I think you just have to be sure that you put those guardrails up as you do these artificial intelligence things and make sure that all considerations are in play in the algorithms and that they do not overly bias the data based upon anything that you might consider an ethical boundary. What is the best distinction between AI and machine learning, and what's the importance of NoSQL in these fields? Well, artificial intelligence is comprised of more than just machine learning, although we tend to hyper-focus on machine learning as the largest part of artificial intelligence. But there are also things like natural language processing, which is huge as well, and it works together with machine learning, but it's considered another segment of artificial intelligence. So artificial intelligence is really the big label on things, and it gets broken down into machine learning, among other things. And the second part of the question, Shannon, what was the second part of the question? And what's the importance of NoSQL in these fields? NoSQL, okay.
Yeah, I didn't even bring up NoSQL, did I? NoSQL still has a legitimate place in a lot of architectures. If you're going to have one reference architecture, as I did today, I'm not going to put it on there, because what I have become quite fond of is the NoSQL capabilities within relational databases and their ability to store and manipulate JSON data, XML data, Avro data, et cetera. Those capabilities are really what I was looking for in NoSQL. Now, there are certain use cases that NoSQL is going to be very geared towards, like an online product catalog. I'd want to do that with Cassandra or something like that, for example. So obviously it has its use cases. But you don't have to have NoSQL. We would want to look at the specifics of your implementation to see if NoSQL still would make sense for you. And then again, just like with Hadoop, a lot of companies out there already have a lot of NoSQL going on. And so, you know, is it worth it to rip and replace that now? It's probably of recent vintage. Probably not, if it's working, because there's so much else to do. So you can work that into a reference architecture as well, for sure. But if somebody's giving me a blank sheet of paper, I want them to prove that NoSQL is necessary. And today, with databases and their ability to store all kinds of complex data types, I'm trying to keep it a little bit simple. And do you have any feeling on Data Vault 2.0 in this discussion? Well, I think it's come into more prominence and importance because we are starting to capture all data, and Data Vault, in my view, never discriminated very much about the data that you're bringing in and getting under management, not in my words, but getting it modeled and in the right data store and so on. And I think the world has kind of moved in that direction. So it's a fair way of expressing your dimensions in a model.
However, you're not going to do things like that for most of the data that you're putting in cloud storage. Now, we can talk incessantly about what data you're going to model for cloud storage. It's going to be a segment of that data. One thing that I've been doing is, if the data is coming in structured, we want to keep whatever structure that data has and put that in the cloud storage. But if it's coming in unstructured, we're not doing the old-school data modeling for it. However, I have made a strong point here, I hope, that the data warehouse is still important. And the Data Vault will be a very viable means of expressing the model within the data warehouse, especially as we think about all data being required. So William, in terms of your company, what's your sweet spot? Is it the planning and prep of the data, or is it the prep plus using machine learning and other AI methods to provide a holistic solution to the customer? It's the planning and the prep of the data. We're data experts. We're experts at building those warehouses, those data lakes, and getting the data under management, master data management, and so on. We certainly know quite a bit about BI and dashboarding and all that sort of thing, and we do a fair amount of that as well. And we definitely do a fair amount of artificial intelligence, but what we like to think of ourselves as is supporting data science within an organization. The flow of need will go well beyond just prepping the data, but that's a big part of it. So we're in the sweet spot, I think, of artificial intelligence by getting that data ready. Can you give an example of a machine learning ops process? What's the last word you said there, Shannon? Process. MLOps process. MLOps process. Yep.
Well, that's a whole presentation there, and I tried to sum it up in the presentation toward the end, where I talked about how you're going to need to get data science and get the data together. One thing I'll add here is that we have to divide and conquer this in order to be efficient about it, and data professionals still need to do what they do. Only they need to do it with more alacrity, and definitely in cloud storage and the cloud, with streaming tools and so on. So that needs to be evolved. Come back next month; I'm going to talk about that evolution process for analytics within the organization. But also, the way things have played out, your data scientists, if they're true data scientists, are going to be really smart about machine learning algorithms: which to apply, when, and where. Of course, there are methods now for machine learning to be fully operational and choose its own algorithms, or choose many and have contests, and so on and so forth. But we're not going to lose the need for data scientists to really specialize in that sort of thing. So I think there are some foundational elements. I like to focus on the foundational elements and make things at the end point a lot easier. So say you're going to do something like, oh, I don't know, retail supply chain management, and you don't have your data act together. When you want to step into that world, that's where you'll spend most of your time. And probably you'll be spending the time with people who are not data professionals doing that, because many organizations don't get that, and they don't do the matching properly. So when you don't have data professionals doing the data for something as important as that, that can really be a drag, not only on the project, but on the long-term viability of the things that you build. So, you know, I'm going to say focus on your foundation.
Your data is an important part of that. And then you would add the algorithms on top of that. Love it. So what would be your take on streaming well-structured data in near real time directly into the data warehouse, where it can be enriched with master and reference data before being pulled into the data lake for further analysis? So that's a little bit of six of one, half a dozen of the other. Many of you have already gone down that path, and now you're adding a data lake. And so that's what you do. We're not looking to undo things for the sake of undoing things. But maybe you have some semblance of a data lake going, and maybe you're not too far along in your MDM, or you have it but, like so many organizations out there, you haven't made that connection between MDM and the data warehouse yet. I've been talking about this for years, getting master data management data into these other places where it can succeed, like the data warehouse. If you've already done that, good on you. And if you don't have a data lake, well, that's something that you need to improve on and add. And you may or may not want to undo, redo, or re-engineer what you have going on right now to get that data into the warehouse. Myself, I'm going to do my integration now in cloud storage, and I'm going to do my ETL in cloud storage, if at all possible. But everyone's different. Everyone comes to this at a different point of maturity, with a different architecture. You know, no one size fits all. So the end result is what's important here, and as data professionals, we have to think about this. The end result has to be a data warehouse that has MDM data in it and a data lake that has MDM data in it as well. There are different ways to get there. But if you haven't moved on that, I would move the MDM data through the cloud storage data lake and from there into the data warehouse.
But if you're doing it the other way, that's okay too. Does ECM have a role in the prep of the data? Hmm. I'm thinking about that. I don't know what that would be. I'm sure there's an edge case out there for it, but certainly not in something that I want to promote to the masses as something they need to do. But everybody should be attentive to their own needs and, again, to where they've come to in their own enterprises. So if you're great at ECM, I wouldn't necessarily be looking to change that. But if you're trying to step into artificial intelligence, you've got 10 things or 100 things that you need to do, and that wouldn't be something I would necessarily add for data cultivation. I think we have time to slip in a couple more questions here. Are there any tools available to assist program and corporate areas to identify where their needs are? They don't understand how AI can assist them. They may not know how to identify their needs. Yeah. I like the question, because you're thinking. I've tried to give some examples here. I don't know about a tool for this. I'm a consultant. That's my disclaimer. I think about these things as consulting engagements or, you know, human look-at kinds of things. I think that'll change in the future. But for now, I'd say get some AI data experts aboard to have a look, maybe do a day of whiteboarding or something like that to get you headed in a direction. It's an absolute must. I mean, if you're a big enterprise, you're just not going to be able to survive for the long haul here if you don't adopt AI. So I like the way you're thinking, but I think right now it's more of a consulting look-see than anything else. But, you know, get your antenna up in the organization. If I did nothing more than get your antenna up today, I've done my job here today.
And I think with all the smarts that you all have out there in your enterprises, you can start those processes. One thing I have said to clients who don't know is: look to automate. Look to automate something that you're doing now that can be automated, and let's do it with artificial intelligence. And another thing I'll add here to this question is, any place you're thinking BI on your roadmap, think AI. I didn't say do AI necessarily, but think AI before you default back to BI, because a lot of what I've done over the past two decades in business intelligence, I look back on now and say, well, that could be done with artificial intelligence so much better. But you have to accept that the idea of BI is not to produce a report. The idea of BI is to produce, maybe, an interim report that goes on and does something that drives the bottom line of the business. And so that's what we're trying to do here. We're trying to drive the bottom line of the business. Work backwards from there, and you'll be in a much better position to try to apply AI in your environment. Well, William, thank you so much, but I'm afraid that is all the time we have for today. And thanks to all of our attendees for being so engaged in everything we do. Love all the questions that have come in. And just a reminder, I will send a follow-up email for this webinar by end of day Monday with links to the slides and links to the recording of the presentation. And thanks again, everybody. I hope you all stay safe out there. William, thank you so much. Thank you. Stay safe.