Hello, and welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager for DATAVERSITY. We want to thank you for joining the latest in the monthly webinar series, Data Architecture Strategies with Donna Burbank. Today, Donna will discuss Artificial Intelligence: Real-World Applications for Your Organization. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. And we very much encourage you to chat with us and with each other throughout the webinar. To do so, just click the chat icon in the top right-hand corner of your screen to activate that feature. For questions, we will be collecting them via the Q&A section. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DAStrategies. As always, we will send a follow-up email within two business days, containing links to the recording of the session and additional information requested throughout the webinar. Now, let me introduce to you the speaker of the series, Donna Burbank. She is a recognized industry expert in information management with over 20 years' experience helping organizations enrich their business opportunities through data and information. She currently is the managing director of Global Data Strategy Limited, where she assists organizations around the globe in driving value from their data. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa, and speaks regularly at industry conferences. And with that, let me turn it over to Donna to get today's webinar started. Hello and welcome.
Thank you, Shannon. Always a pleasure. Glad to see a good group today. Also, just for those of you who are Twitter folks, there is a hashtag today, hashtag DAStrategies, if you want to continue the conversation online. I am on Twitter at Donna Burbank, if you wish to engage and chat with me.
So as you may know, and it's always nice to see some familiar faces or familiar names on these webinars, this is part of a series that's been ongoing all year. And so you'll see that there's a range of topics across data architecture. If you've heard me speak, you've probably heard me say this: one of the reasons I'm still in data architecture today, after close to 30 years now, is just because it is evolving so quickly and there's so much exciting stuff going on. So for those of you who've been involved in some of our webcasts over the years, we've mixed it up a bit, as you'll see, from traditional things like metadata to new things like AI, artificial intelligence, today. So hopefully you'll find a good mix of things that relate to your business. And as Shannon mentioned, she'll send a follow-up at the end as well. All of these webinars are on demand. I understand a lot of you have busy days, and I hardly ever am able to catch one live. So I always go back, and I'm a big fan of on demand. So I hope to see you on some of the past ones, and this one will be on demand as well. Again, Shannon will be sending that out, to proactively answer that question, because we generally do get that. So without further ado, some basic definitions, because we're data people and we like definitions, but like any good data people, we don't agree on definitions, right? So there's a lot of words that fly around the industry, things like data science, machine learning, AI. What's the difference between them? What's in a name? Are they all the same? So I would say probably a safe thing is that machine learning is a subset of AI, which is sort of broader, and there's deep neural learning and a lot of different pieces of it.
I would put data science in sort of a different category, where that's more of, we're doing some analysis: 75% of our customers are based in New England, they're less likely to purchase during a snowstorm, you know, more of that kind of exploratory analysis. With machine learning, we're actually learning and building predictive models on training data, predicting what's in an image, you know, image recognition, and things we're familiar with if you've ever been on the internet, like chatbots and recommendation engines. And again, everybody has different definitions, and we could spend a whole webinar arguing just on that. But one way to think of it, for a lot of folks, is that artificial intelligence is the stuff that's actually taking action, like a chatbot, robotics, things like that. So that's just to get some of those basics out of the way before we get in, so we know what we're talking about, because we are fans of metadata in this world. So another thing I'm just curious about, and Shannon and I were chatting before we came on: this topic, although I find it fascinating, actually got one of the lowest registrations of the year, which surprised me, but maybe it shouldn't have, because we did a survey, and I've referenced this in some of the other webinars because I find this stuff really interesting. This was a survey done last year, and we'll be doing another one this year, on trends in data architecture and what the hot trends are. So this is typically your DATAVERSITY crowd, you know, folks like us, but only a little under 19% of respondents were actually using AI or machine learning, we kind of put them in the same group, as a key driver. So you'll see that's way behind things like BI and reporting and governance and some of the stuff that's near and dear to our hearts.
So I am curious on this topic. We have a little survey spun up, if Shannon could spin up the survey, and we're going to ask you: are you using AI in your organization? So you should see a little polling thing on the right, and I'm curious about your answer. You click on the button and hit submit, and there will be a pause as I talk. So again, it's: yes, I'm using it now; or no, I'm not using it now, but I'm planning it for future use; or no, I am not using it or planning to use it; or I'm just kind of curious. So give it a moment as the survey does its thing, and I'll be curious what this audience says, given that the topic here was AI. While we give it a moment: as you know, I also run a consulting company, so I kind of see the gamut, and I know that a lot of my customers actually are using it, and I don't see these things as mutually exclusive. So I have folks that are doing data warehousing and metadata and governance and AI, and we'll talk a lot in this presentation about how they do fit nicely together. So I am curious, here is the poll. So a couple of you were shy, and you're not allowed to be shy on these things. So a few are using it, a small percent, 17%, which kind of matched the survey. Many were planning to use it, so that's good, kind of a higher percent, about 32%, and about 9% were not using it at all. So that's probably fair, given that we're a webinar on this. So for those of you who are planning, this may be helpful for you, and for those of you who are not, maybe it will make you interested, we'll see. So given that this topic is probably new for a lot of folks, I want to again start with some basic definitions, and for those of you who are experts in AI, I am oversimplifying, and I know it. But I know when I learn something, I love to just have someone really dumb it down for me, and hopefully this does that for you. Sorry, my slides just went a bit funny. So these are kind of some of the basic steps for AI and machine learning.
So the big thing, for those of you who are data folks, is that, again, this is not mutually exclusive: data is key to all of this. So a lot of it is just gathering the data and getting the right data set. And we'll talk a lot about that, in that a big part of AI is having the right data and the right data sets, and it can cause a lot of problems if we don't. Next, preparing the data. And I think a big fallacy that I've heard from too many people, actually, is, well, we don't need to do things like data cleansing and data quality with AI because the algorithm just takes care of that. And oh no, even more so: when you have a lot of algorithms making decisions based on the data, we'd better make sure that data is right, and that we have the right data set and the right scope of data set. Anyone who took statistics in college knows that you have to have a certain volume in a data set to make sure it is reasonable, right? So also, again, choose the model. So this isn't a data model, like a lot of us know and love, with boxes and lines and relationships. This is your algorithmic model, and they're not all the same. So again, a lot of you probably know this better than I do, but there's linear regression, naive Bayes, random forest. I thought that'd be a great name for a band or something, I don't know. But anyway, there's a lot of different models, and I may have said this before: I love my job because I get to work with a lot of different companies, and they're so different culturally, from a retail company that just wants to move fast, let's build a chatbot and get it right, to a university, which is one of my clients. And for the first time, I actually had the whole group of business stakeholders arguing over what algorithm was used in the model, because they're academics, right? So I found that refreshing. I think they went a little too far to the extreme of worrying about gathering and preparing and choosing the right model, but I kind of wish the world were more like that.
So again, it was a refreshing argument over which is the best model for the data. Not that other folks don't do that, but this was a little more on the extreme on the other side. The other part is training the model. And as these algorithms get better, they can be trained faster, and they can be trained with lots more data. This isn't necessarily a new concept, and we can talk about this. I'm old, and I did this way back in university. We did a little mini model, you train the model, you go through it, but there are just so much better tools and so much better data out there now, and such volumes of data we can work with. Then you take your parameters and you tune, you adjust, and you kind of get it right, and then you run that model, and it will keep learning over time. That's sort of the benefit of it. But you'll see the big foundation at the bottom is that idea of quality data. If you're basing your decisions on data, it should be good data, right? So that should be obvious, and we should all take care of that. And good, as anyone who's an expert in data quality knows, and I'm sure that's a lot of you, means fit for purpose. So what counts as quality for one situation may not be quality for another. So for a lot of my clients who are using AI or doing more advanced analytics, part of the metadata is how that data was collected and what the intended use for that data was, as well as your traditional metadata of what the data means and what the data types are and things like that. A lot of it is context. So that becomes even more important with AI and machine learning. So that's kind of a slightly high-level techie version of that, but it should be fairly intuitive, because the idea is that machines learn in a lot of ways like people learn, right? So who on the call has kids, right? I don't, but I've met them. So we all do that. You've got a little baby there, cute little baby. Look at the dog, Johnny, look at the dog.
There's a dog. And for people who don't have kids and hear other parents doing that, it can get rather annoying, but it's just what we humans do. I see a kid and I start doing the same thing. Look at the dog, Johnny, the dog, the dog. And eventually the little boy goes, dog. Oh great, yes, Johnny, that's a dog. And that's so normal to us, but that really is what a baby does. We sort of learn by that repetition. So if you think back to the model, and this shows what a heartless geek I really am, in a way that's a very similar thing. You sort of sent the data to the baby, the baby trained and repeated its algorithm, and eventually it tuned over time and actually said the word dog. He didn't say dog right away. I'm sure you've had a kid who sort of said, duh, and you're like, he's brilliant, that meant dog. But over time, well, I can now pronounce the word dog because I've been doing this for many, many years. So that was really strange, I know. But anyway, this really works in a very similar way. The other thing is that there are patterns, right? So none of these dogs is the same. One is a cartoon dog, one is a beagle, one is a, oh dear, is that a Boston terrier? I have no idea. Something like that. What I find interesting, and this is getting way off track, but if you've listened to me, you know that happens: I'm a big fan of Temple Grandin, and I've read some of her books. She's an autistic woman who's very brilliant and high functioning and has written a lot of books on how the autistic brain works. And I am not an expert, I'll put a disclaimer on that. But one of the things she said she had difficulty with is that she can only picture a literal dog. She can picture a dog she has seen. When you say dog, she pictures an actual dog, and she has trouble, she said, with this generalization of a dog. I think I told that story just so I can tell this story.
So I'm not sure how many of you know, there was a recent conference sponsored by DATAVERSITY in San Diego, the Data Quality and Information Governance Conference. And thank you to any of you who were there. I was telling Shannon before the call: I had a workshop on data strategy, and I asked how many people attended these webinars. I think it was like 75%, so a lot of folks actually weren't sick of hearing me on the webinars and came to see me in person. But the point of this is that on the flight home, I was sitting on the plane, and I look at this woman next to me, and it was Temple Grandin, of all people. So I'm not really into movie stars or musicians, but a writer on autism and brain function, I found fascinating. So we actually had a really cool conversation that just made my day. So it has nothing to do with machine learning, but I wanted to tell the story. And it does relate to data, because I was coming back from a data conference. Actually, one of the things she did say is that she was working with a lot of autistic youth to try to get them into programming and data science, because it is very logical, and she said that's something she does very well. So we did talk about data at the end of the day, because I can't not talk about data. So I digress, but this is sort of the basic algorithm of how you would teach a child. You just keep telling it, and eventually the child gets it. So, something real: I did promise real-world and practical and not just rambling. So here's one that I like, because if you know me, I'm a metadata fan. I've been doing metadata forever, and it's sort of fun to see how metadata has evolved. So this is a cartoon, with all the sexy things in machine learning, and they say, if you don't stop misbehaving, I'm going to send you back to data cleaning. But that's sort of how we feel as humans.
So a lot of us in previous presentations have given statistics that the average data scientist spends anywhere from 50 to 90%, depending on the survey, of their time just cleaning the data. And you saw that that first step in data science is making sure the data is right and preparing it. And I'm a geek, I love data, but that's not the sexy stuff. You'd rather be getting to the answers and the models and the algorithms. So, I mean, I'm old enough to remember the days, and this still has a place, when I'm going to map my data lineage and map my fields by hand: social security number is this field. And sometimes you still have to do that, and there is a place for that, and we can talk at length about when the right place is. But a lot of that stuff is just a pattern, and it can be recognized. So some of the new metadata tools can do that, they can look at and parse your data, and security tools can do that too, and say, wow, there's a pattern. I see number, number, number, dash, number, number, dash, number, number, number, number. That probably looks like a social security number, or an SSN. So, again, a human should not have to parse all that data. Yes, we could learn that, but that's something that a machine can learn much, much better. And again, with my twisted brain, I sort of see that as, there's a little baby computer, and you're telling it, look, Johnny, that's an SSN. Look at it, it's a social security number. Look at this, Johnny. And eventually he finally gets it. Oh, that's a social security number. So again, it's just like you've chained the baby. Chained, that's horrible. Just like you've trained the baby. It's the same thing: you're training the computer, you just keep showing it the patterns, and eventually this algorithm, this little baby, gets it. Isn't that adorable? A baby computer, I've got a baby computer. So I found it funny, but you're training the little baby computer's brain just like you would train a human's brain.
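As a rough illustration of the "number, number, number, dash" pattern described above, here is a minimal Python sketch that profiles a column of values and suggests a metadata tag when most of them look like a social security number. It is a hand-written rule standing in for what a tool would learn, not how any particular metadata or security product works; the names `ssn_match_rate` and `suggest_tag`, the 0.8 threshold, and the sample values are all made up for the example.

```python
import re

# Assumed pattern: three digits, dash, two digits, dash, four digits.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def ssn_match_rate(values):
    """Return the fraction of non-empty values that look like an SSN."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for v in cleaned if SSN_PATTERN.fullmatch(v))
    return hits / len(cleaned)


def suggest_tag(values, threshold=0.8):
    """Suggest a column tag; the threshold is an arbitrary illustrative cutoff."""
    return "possible_ssn" if ssn_match_rate(values) >= threshold else "unknown"


# Made-up sample column, including one dirty value.
sample_column = ["123-45-6789", "987-65-4321", "n/a", "555-12-3456", "222-33-4444"]
print(ssn_match_rate(sample_column))
print(suggest_tag(sample_column))
```

A match rate rather than an all-or-nothing check is used so that a few dirty values do not hide the pattern, and a person would still confirm the suggested tag before it goes into the catalog.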
So that is something computers are very good at, especially those patterns that can be picked up. Yes, we can do that as humans. A computer can do that much faster and better, and that's why some of the early, quote, AI-type things were the chess algorithms that could beat people. And again, chess is just made up of a lot of very simple algorithms, and you put them together and you become a chess master. I am not a chess person, and I could be oversimplifying again, but that is something a computer is very good at. Look at all the different combinations and permutations, and eventually I come to the right result. That just takes the average human a lot longer. Some of the human things we're a little bit better at are social situations and conversations. So again, if anyone's had a kid, this should be a familiar situation. You know, you give a little baby a present. What do you say, Marco? Marco, you say thank you. Thank you, say thank you, Marco. Marco, say thank you. And if you have a kid, they never will, till they're about 18. It's like, mine, this is my toy, right? So similarly, over time, hopefully we get better. I mean, we're all adults, and how often does this happen to you, right? You've had a long day at work, your kid won't say thank you to its grandparents, and you're stressed and you haven't gotten enough sleep and you're sick, and someone says, how are you? Oh, fine. Because that's sort of the socially conditioned response. And again, back to Temple Grandin, because, did I tell you I'd met her? No, sorry. As I said, I've read a lot of her work, and she says she had to learn that behavior. So she didn't know that the typical thing, when someone says how are you, is that you just say fine.
She might have just answered the question, and she said that was a big challenge for her. And in a way I'm speaking for her and other autistic people, which I am not, so I can't really relate, but a lot of what comes naturally to a, quote, typically functioning human brain, the typical human reaction you're supposed to give, seems odd to them, and they have to learn it as a computer would. So I found that rather interesting. I also found, again, in my conversation with Temple Grandin, that I was just gushing over her, and I said, I've read a lot of your books. She said, which book? I found that interesting because, back to her not generalizing, I don't remember the title, I just remember it was good and some of the things she said. So anyway, this is a little harder for a machine, because again, these are the social cues we pick up and learn, but all of us are really doing that. You just learned that the first time someone asked how you were and you said, I've got a bunion and my shoe is rubbing against it. No one really wants to hear that. You say, fine, thank you, how are you? So computers need to learn that as well. So when we think of a common application of AI, chatbots: again, I'll start with the example on the right, and I hope it's okay to share AT&T, because that's what they do. So I'm an AT&T customer, and I went online to their chatbot, and it said, how can I help you here today? And I just said, I'm looking to upgrade my phone, which I'm not, but I wanted an example. And very well, they came back and said, well, this is where you go, this is how you upgrade. And this can be super helpful for these common things where you don't need to call a person, and the person would get bored stiff having to say that all the time, this is how you upgrade your phone. So great, you probably don't want to have to talk to a person just to upgrade your phone, right?
But then I try to think outside the box, and I say something more like, what's your opinion on metadata management and its impact on the industry, and it has a little more trouble with that. Could you please rephrase that question? Actually, their answer was kind of cute. Could you rephrase that? We're not used to that sort of thing. Because you're basically training your model with the common situations that come up. And of course they wouldn't ever get that question; I was doing that on purpose, right? There's nothing wrong, that I know of, with AT&T's chatbot. It did what it was meant to do. I was giving it something it wasn't meant to do, right? So that sort of shows the limitations, well, not limitations so much as the fact that it's not going to have a normal conversation with you. It's trained to do certain scenarios, and it can improve on those scenarios over time. Say we have a bunch of people asking to, I don't know, get international service. Well, maybe we don't have any answers on international service. Maybe we should learn that. Or, I don't know, I tend to use Siri or any of those when I'm in a crisis, and I think one time I wanted to find something and I didn't use the right word. It was like, find, search, get. You know, I'm trying to drive the car and get directions, and eventually it'll learn that people don't just say find, they also say get or something like that. So these things can get better over time. The one on the left is sort of a fun example and shows my full nerdery, and this is an old one. So I remember when I was just out of university, in my first programming job, we just had a lot of fun, weird little things. So someone had come across this chatbot, and it's still around, and it honestly hasn't improved that much, I don't think. And not to offend any therapists on the call, but a stereotype of a therapist is sort of, you know, I feel like I've had a bad childhood. Tell me about your childhood. Well, I feel like my mother wasn't kind to me.
Tell me about your mother. They're really just sort of repeating things back. So this is sort of that stereotype, where I just put things in. Hello, let's talk. I'm working too hard. Did you come to me because you're working too hard? Yes, it makes me feel, you know. And at one point I said no, and it said, you're being a little negative, right? So it has sort of canned, therapist-y kinds of responses, and then of course I did my nerdy thing. What's the best artificial intelligence algorithm for open search? And it just didn't answer. What does that suggest to you, right? So it kind of broke down there. Anyway, it does admit that Eliza, which is the name of it, is still learning. Please let us know and give us your feedback. How do you think this might be improved? I don't know, if you're bored, it can be sort of fun. Anyway, I had one, though, something like, I wanted to kill myself, and it wasn't very nice at all. It just said, tell me more. And I thought that was horrible, but it can get very strange. If you're angry at your boss one day, put it into Eliza and it will say, why are you angry at your boss? But anyway, that's sort of an example, in a funny, simplistic way, that it has a limited range and it can get better over time. But just like a child sort of learns over time the right answers to certain questions in certain contexts, that's what a chatbot sort of does. So let's break that down in terms of some of the basics. Again, just like that similar one we had at the top, you want to gather the data. So you can train algorithms in different ways. Sometimes you'll feed in support logs from previous phone calls. You can do voice to text and kind of use that as a training set. As it gets better, it uses chatbot conversations to get better. You want to prepare the data. Just like before, you would make sure the responses fit realistic use cases.
If you're in support, you probably want to talk about support calls and not therapy calls; they're very different. It would be really strange. I want to upgrade my phone. Tell me about your feelings. You know, that wouldn't fit, right? So you want to use the language and the scenarios that fit. Choose the right model. Again, there are certain models that work better for chatbots than others. And then again, that training, right? So again, if we think of our brain as a computer and we go back to this earlier conversation, how are you, I'm fine, right? That's what you're just supposed to say in our society. Society sort of programs that into you, right? So you have a class of different greetings. It could be, how are you? Good morning, hi there. And that's again like when I was screaming in the car at my iPhone to try to get it to find or get or search or whatever it was, and it wasn't in the list, right? So whatever I said wasn't the right thing to say, and then you learn and it learns and we all get along better. And so when someone says how are you, you can kind of break it down. This "how," this "are," that gets a greeting classification. I should just say, I'm fine, thank you, right? So that's one way to do it; there are others. But it just shows, you know, this natural language processing and breaking it down. And it's just what we sort of do in a limited way in our brains, but it's more what a computer would do to make that happen. The other one that's sort of fun or scary or creepy, however you want to look at it, is this idea of image recognition, right? So one thing humans are very good at is recognizing patterns, and historically computers weren't, because it was just very hard. And so this shows my full nerdery also. I was telling a friend that I was giving this presentation. I said, you know, I'm using that muffin and Chihuahua graphic. He's like, the what? And I don't know, I just assumed everyone's seen this, but you know, I'm kind of in data, internet, on-the-web land.
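The greeting classification described earlier in this segment can be sketched in a few lines of Python: break the utterance into words, compare it with example phrases for each class, and return a canned reply, or ask the person to rephrase when nothing fits. This is a deliberately simple word-overlap lookup, not a trained natural language model and not how any specific chatbot is built; the intent names, example phrases, replies, and the 0.6 cutoff are all assumptions made up for the illustration.

```python
import re

# Hypothetical intent classes, each with example phrases and a canned reply.
INTENTS = {
    "greeting": {
        "examples": ["how are you", "good morning", "hi there"],
        "reply": "I'm fine, thank you. How are you?",
    },
    "upgrade": {
        "examples": ["upgrade my phone", "new phone"],
        "reply": "Here is how to upgrade your phone.",
    },
}
FALLBACK = "Could you please rephrase that?"


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(utterance, min_score=0.6):
    """Pick the intent whose example phrase is best covered by the utterance."""
    words = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for name, intent in INTENTS.items():
        for example in intent["examples"]:
            example_words = tokenize(example)
            score = len(words & example_words) / len(example_words)
            if score > best_score:
                best_intent, best_score = name, score
    return best_intent if best_score >= min_score else None


def respond(utterance):
    """Return the canned reply for the matched intent, or the fallback."""
    intent = classify(utterance)
    return INTENTS[intent]["reply"] if intent else FALLBACK


print(respond("How are you?"))
print(respond("I'm looking to upgrade my phone"))
print(respond("What's your opinion on metadata management?"))
```

Adding a new phrase to an intent's example list is the lookup-table version of the bot learning that people say "get" as well as "find."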
I'd be curious in Q and A if I'm the only one. But I digress. I've had too much coffee this morning, if it's not clear. And I have this fun little game in my head. Sometimes I hear a phrase and I step back and I'm like, has that phrase ever been uttered before? I had one friend, and I won't tell you the context, but one of his phrases was, I was walking down to see the camel and I told her I don't want to be a fireman. And that was actually part of a conversation. And I said, Jake, we're going to stop there. That is the most bizarre combination. Like, has anyone ever uttered those words in combination before? Which makes me think like a computer, and I'm fun at parties. But you step back and say, of course we've seen the muffin and Chihuahua graphic. You know, you kind of wonder, 10 years ago, was that ever uttered? Anyway, there's the idea of how a computer can see, understand, or predict, when we think of predictive analytics, right? You're predicting sort of what this is. So it's sort of funny, because even in my brain, I look at this and sometimes I get confused. Like this one actually looked a whole lot like a Chihuahua to me. Right? If you have a lot of time and you search, there was one where they were trying to train the algorithm, and it was a Chihuahua eating a muffin, right? So it had both of them in there. I have no life. So anyway, if you look at what makes one a Chihuahua and one a muffin, right? We just sort of know, but there are certain patterns. And a computer can get better at recognizing those patterns and categorizing. And that is something a computer has actually gotten very, very good at, in terms of facial recognition or image recognition. So again, you have to train the model. You have to build the right algorithm. There's a lot out there, and there are so many APIs and things available. This has actually become pretty commonplace.
I mean, I'm sure the first person who built it thought it must have been really fun, and now it's almost just standard, right? So there are a lot of good training data sets. So I actually was nerdy enough to go look it up. And yes, there are labeled data sets around Chihuahuas, if you want to go in. This one is ImageNet; there are others. Also, there are APIs available that you can use, open source or for pay. This is Google's, I'm sorry, Amazon's. I have to give them that attribution. But again, Amazon has one, Google has one, IBM Watson, there are several. But you'll see here, they've kind of done that image tagging. So you give them a picture, and it can say, with this probability that's a person, that's a rock, that's a mountain bike. And it's amazing. So I did nerd out for a while on some of these image data sets and the accuracy of how specific they get. Like even the muffin, whether it was a baked good, muffin baked good, cookie baked good, whatever. And so when you think of it, mountain bike versus street bike versus whatever, it's amazing. Again, this technology has been around for a while, and it's getting more and more sophisticated, which is sort of fun. So, some real-world applications other than joking around with Chihuahuas and muffins. There are some realistic, real-world use cases. And again, we'll talk a lot at the end about whether this is cool or whether this is creepy, whatever, right? So for any of you who use the cloud or some of these picture storage services, you'll see that, hey, all of a sudden they'll group all of my vacation pictures, or all of the pictures with my brother. That's where it starts to get creepy. Or all the pictures where I'm smiling, or whatever. Again, with all of this very detailed analysis, they can see what's a mountain bike. They could say, here's all the pictures with you mountain biking in them, right? So that can be handy, because it can auto-organize your image library.
So again, we get back to the idea that metadata can be automated. I remember the days, and it's still true, when we had to go in and tag every photo. And that still does exist. So be careful. I guess there's also a lot of discussion, if you've ever gone through and just tried to do a search on Google, around the idea of user tagging. In fact, even when I want to post a picture, and I'm a photographer, I want to put as many good meta labels on it as I can. Some people get very creative on there, and sometimes you search for something and these odd things come up. Humans can tag it however they want. So often it's a mix of both, that kind of auto-organizing AI as well as human tagging. This one I found kind of interesting, and it makes so much sense. We can think of facial recognition, but say I'm a machinist in the field and I need a part, and I'm under the engine, and I scan it, and it can tell me the part number and order it, right? I mean, that's to me an amazing real-world application. And it's something a computer can probably do better than a human. I know I've gone to the hardware store with a thing like this in my hand, and I just kind of say, I want another one of these, or I go look through the bins and I'm trying to kind of figure it out with my eyeballs. It's amazing how many screws of different shapes and sizes there are once you start to do a project. So this would just make it easier. That would be great for a Home Depot kind of thing. This is the part you need, go ahead. So also, when we get into facial recognition, there are a lot of uses for that, creepy or not. I don't know how many of you work in an office where you can get in not with a badge, but with facial recognition. About five years ago, at one customer site we went to, I was the only one with a badge. Everyone else just looked at the door and it let them in. So again, this has been around, and it has a lot of different uses.
But again, that's something a computer can get very, very good at. Another one that we're used to is this idea of recommendation engines. So The North Face has a good example you may have seen. This was an example I've used in other presentations; I was giving this presentation in Barcelona. I'm a big runner. So just as an example, maybe I wanted to buy a jacket from The North Face. And it'll ask you, when and where will you be using this jacket? Well, I'll be jogging in Barcelona. And talk about not just AI, but data lakes and data collection: they were able to mix that with weather data, right? And say Barcelona is going to be this temperature. And by the way, based on your needs, what you want to do, and what the weather is going to be, I can look through my product master data. So this is a good example. It doesn't mean that because we have AI you don't need things like master data. Because you have AI, things like master data are even more important. How do I look against my list of data if it's just randomized, right? So it could say, based on this list of information, you're going to be at a certain temperature, doing a certain activity. These are some jackets that may be good for you. And I'm having trouble moving my own slides. And it actually came out pretty well. So it gave me some choices based on my selection. And again, it used several things. So one was, again, product master data. It also looked at customer purchasing patterns, customers who bought this also bought this, kind of usage ranking and things like that. So again, because these things have become so normalized and out there, this was actually made open source. It's called DSSTNE, pronounced "Destiny," or Deep Scalable Sparse Tensor Network Engine; rolls right off the tongue. But again, a lot of this stuff is becoming much more common and easy to embed in your application. So I've seen a lot of little YouTube videos on how to build your first chatbot and put that in, right?
Or fun little applications people have built. There was one lady I saw on YouTube who had built an algorithm, and it was a shy algorithm. So when she looked at the screen, it would sort, and when she looked away, it would stop sorting. You know, I don't know, she was just playing around with an open API, right? But these things are out there. So if you're interested, you can spend your weekend or your working hours or whatever to start looking into this. It's much more accessible. You don't have to write all of your own algorithms to start. You certainly can, and that's the competitive advantage obviously, but there's a lot of stuff out there that's embedded in a lot of these platforms. So that was sort of the good side, right? So this one was from Amazon, where I was looking at, you know, "customers who bought this also bought this." But then what happens, and I'm sure you've all had this happen. You've done that, and I looked for a jacket. So in this case, these examples came from when I was doing an outdoor sporting goods conference in Europe; amazing, I love that sort of stuff. So I was just trying to search for some examples. So I searched for an axe. I'm going to go camping and I'm going to have an axe, right? So again, this shows the value of the data set. When I searched for a jacket on Amazon, it came out with a lot of very good examples, right? Maybe a men's jacket versus a women's jacket, or running tights, which aren't exactly a jacket, but that's fine. And then I searched for an axe, just because, as you know, I'm strange. And it came up with "customers who bought this also bought" this coffee filter, which made no sense. Then when I thought about it, actually, this is a coffee filter for a typical camping pot. So maybe some guy up in Canada bought an axe to go camping and he also bought coffee filters at the same time.
So it wasn't actually that their algorithm was wrong, but they probably did not have a very big data set, whereas millions of people across the globe are buying jackets, so "customers who bought this also bought this" makes sense. This one didn't make so much sense. So I've seen that again; I have odd interests, and sometimes I'll order a book on Amazon, and often if I'm ordering a data modeling book or something I'll get other books on data modeling, but then I'll order a different kind of book and the recommendation makes no sense. It's just what the other three people who happened to have bought this book also bought. So again, with better data sets, better quality, it gets better over time. So the other piece we're going to talk about on this is, just because you can, should you? And again, I find on social media sometimes people forget this; don't forget the business value. This one cracked me up. So I think she was British; she writes like a Brit. "Dear Amazon, I bought a toilet seat because I needed one. Necessity, not desire. I do not collect them. I am not a toilet seat addict. No matter how temptingly you email me, I'm not going to think, oh go on then, just one more toilet seat, I'll treat myself." But how often have we had that? You go, I've bought the axe, I purchased it, and for the next six weeks you have this axe coming up, or toilet seat or whatever you just purchased, and is that an effective mechanism, right? So this example is really not AI, but it shows you that it isn't always helpful, so think that through. And this is not limited just to technology. When people call my house at dinnertime asking me if I want new windows, I'm just amazed that ever works, right? And here's a similar one where some of the algorithms actually didn't work. And again, this lady had a good sense of humor. "Boeing took a look at my profile and thought, now there's a woman in the market for a military sub," and promoted this tweet. And I'm sure you've all had that.
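To make the axe-and-coffee-filter point concrete, here is a minimal Python sketch of "customers who bought this also bought this" as simple co-purchase counting. The order baskets are invented, and the `min_support` cutoff is an assumption added for illustration; it is not a description of how Amazon's recommender actually works.

```python
from collections import Counter

# Invented order baskets: jackets are bought often, the axe only once.
ORDERS = [
    {"jacket", "running tights"},
    {"jacket", "running tights"},
    {"jacket", "hat"},
    {"jacket", "running tights", "hat"},
    {"axe", "coffee filter"},
]


def also_bought(orders, item, min_support=1):
    """Return (other_item, count) pairs for items bought together with `item`.

    Pairs seen in fewer than `min_support` baskets are dropped.
    """
    counts = Counter()
    for basket in orders:
        if item in basket:
            counts.update(basket - {item})
    return [(other, n) for other, n in counts.most_common() if n >= min_support]


print(also_bought(ORDERS, "jacket"))             # backed by several baskets
print(also_bought(ORDERS, "axe"))                # backed by a single basket
print(also_bought(ORDERS, "axe", min_support=2)) # nothing survives the cutoff
```

The counting logic is identical for both products. The only difference is how many baskets stand behind the suggestion, which is why a recommendation built on one camper's shopping trip looks so odd.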
I keep getting promoted ads for funeral insurance, and I'm really wondering what in my profile makes them think funeral insurance. It always seems like when I'm having a bad day I'll get "Out of shape? Ugly? Wanna feel better?" or something, and you're like, what? So again, make sure the data set's right and the algorithm's right. Some of the technology is easy to integrate, but it's how you customize it, how you use it, how you align that with your business value. I don't know, again, I run with a funny crowd, but if you do search Twitter or any of these you get some really funny examples, and I'm sure you've had them yourself, of AI gone badly. Who's asked Siri something and gotten a very strange response, right? And I've found it has gotten better over time, because there are more data sets and more people using it. Just like a child; we kind of laugh. Children say things in a funny way, or if you've learned a second language; when I was learning Italian, and I still do, I'd say really funny things, and I generally don't say it funny the second time. The laughter from my peers cures that, but we all do it. You have to take training. I'm training the algorithm. That's what I'll say next time I make a mistake. I'm training my algorithm, I'm not finished yet. So there's a whole governance and metadata around the data you're using as well as the algorithms themselves, right? So here's a cartoon: "This is your machine learning?" "Yep, you pour the data in, you do a bunch of linear algebra stuff, and then you get stuff on the other side." "What if they're wrong?" "Well, you just kind of mix it around, keep trying until it looks right." And to be fair, to a certain extent that is how they work, but it isn't just willy-nilly, right? Well, are we using the right algorithm? Are we using the right data set? How are we governing it?
How in an organization do we know whether this is just a sandbox and we're playing, or whether this is going to be operationalized as an algorithm that's going to make decisions on customers or, in the case of a police department, on people's lives, right? So again, the more these algorithms are used to make decisions, the more and more we should govern them. Which brings in this topic of ethics, or kind of think before you code, right? Especially when we get into AI and all the things you can do, I kind of wish more folks thought of the mantra: just because we can doesn't mean we should, right? And I think as an industry and as a world we're still kind of working that one out. So privacy, consideration of the consumer's rights: is it right to do face recognition on everyone that comes into our store and all of that? We kind of mentioned this, but errors: if I'm going to make a decision based on data, is that right? I'm a police department predicting crime; am I using the right algorithm? Self-driving cars: how do I get the algorithm right so I don't hit the person walking by? How do I know that's a person? I mean, that combines a lot of things in AI: image recognition, decision-making, that kind of thing. This one comes up a lot: job loss. Are we automating people's jobs, or could they learn AI and become programmers? But anyway, that is an issue, right? It's becoming a different world as more programs and AI start to automate things. Bias, we'll talk about that a bit in the next slide. So what training sets are you using, right? So I've trained my child: this is a dog, this is a dog, this is a dog. Then different kinds of dogs come in and they don't recognize them, because they never saw those dogs, right? So make sure you're not promoting bias. There's a lot of discussion there in the news, and if you're interested in that sort of thing, it's a very valid question. Security: can your data sets or your algorithms be hacked, as with anything nowadays?
But again, if you're using this to make decisions, you'd better be in control of it. You know, if you want to read up on AI, there are sort of different phases of AI. So we're training the machine. Then the more the machines can train themselves; if you've seen 2001: A Space Odyssey, with HAL, you know, I guess that's everybody's fear: will the machines take over? I think we're quite a way from there. Just talk to Siri for a few days and I think you'll lose that fear, but I don't know, there have been a few things from Google that I've seen, where they've been doing some amazing testing of voice recognition and voice conversations that seem all too real, right? So the technology is just exponentially improving. And then I labeled the last one the creep factor, and I think we should all think of this, right? It might not be officially illegal. Maybe it doesn't officially break a privacy rule, but if I were the customer, would I want this? Does it feel right? We were in a workshop a few weeks ago at one of the universities I'm working with and that sort of became our mantra. Is this the creep factor? Like, let's do the creep radar on this. We were thinking of all the ways you could use data. I thought it was a good conversation, that self-policing of: I know that would be cool and I know we could play around with that, but let's think, is this really what we want to do? So it's good to think of that. We think of bias. Computers can learn bias. If I'm trying to have it identify a doctor and I only have old white guys, and then we have a doctor on the right that doesn't look like that, will it not recognize that as a doctor? Again, there's a lot of interesting stuff on AI on the web and out in the industry, and they've had stories of AI trying to learn conversations from the internet, and if you spend any time on some of these forums and things, words can get nasty, right?
And they had these horrible chatbots just swearing at people and calling them epithets, and it was a bad reflection on our society, but be careful what you're learning from, you know, how you learn. We all learn from our environment. You learn a certain accent, a certain way of speaking. So be careful of that. This is true with any data set, but especially if you're making decisions on people's lives. Think of the data set you're using: is it inclusive enough to really be the right data set that you should be using? So data governance is more and more critical for AI, and I like to break that down into several layers. So what is that data foundation? We all know we should manage our data quality. I don't know, I've worked with a bunch of healthcare companies, and even something as simple as, do we have the right gender code or the right race code? Or the race code, is it "African-American" in one and "Black" in another and "Other" in another one, and just trying to get that right. So again, if you're making decisions on some of these parameters, is the data of high quality? I'm trying to predict medical issues for men versus women on heart attacks and I don't have the right gender. I mean, there's no good analysis there, right? So you have to have the right quality data. You have to have the right volume of data. Again, as we think about why AI has taken off in recent years, it's partly because of the volume and high quality of data sets; the images we can collect and the storage mechanisms, in the cloud or elsewhere, that we can store them on are a big part of the rise. As well as the semantic layer, right? I called that thing a dog to the little boy. How do we know a dog's a dog? How do I have these labels? And again, is it meta tags that are human-created, or are the labels machine-created? What's our glossary?
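The gender-code point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the variant spellings, and the canonical codes "M" and "F" are hypothetical, and a real glossary would come from the organization's own governance process.

```python
# Hypothetical glossary: variant spellings mapped to one canonical code each.
CANONICAL_GENDER = {"m": "M", "male": "M", "f": "F", "female": "F"}


def standardize(records, field, mapping):
    """Split records into cleaned ones and ones whose code is not in the glossary."""
    clean, rejected = [], []
    for record in records:
        raw = str(record.get(field, "")).strip().lower()
        if raw in mapping:
            # Copy the record with the canonical code instead of mutating the input.
            clean.append({**record, field: mapping[raw]})
        else:
            rejected.append(record)
    return clean, rejected


# Invented patient records with inconsistent or missing codes.
patients = [
    {"id": 1, "gender": "Male"},
    {"id": 2, "gender": "F "},
    {"id": 3, "gender": "?"},
    {"id": 4},
]
clean, rejected = standardize(patients, "gender", CANONICAL_GENDER)
print(clean)
print(rejected)
```

Rejected records are kept rather than silently guessed at, so someone can decide what to do with them before the data feeds a model.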
I had one customer that was doing AI, and we actually built good old-fashioned data models, conceptual data models, to feed their AI engine, right? So what is the customer versus the client? You know, some of the very basic rules were what they used to kind of seed their learning algorithm, which is kind of fun, a good mix of old and new, right? And then how do you manage the models themselves? What techniques are you using? That argument I mentioned at the university, where they were arguing over the modeling technique: that's what you should be doing. You know, is this the right algorithm to get the right results? What were the business rules and understanding? What were the assumptions and biases you put in that model? How did you prep the data, et cetera, et cetera? One of my customers is using something called the CRISP-DM methodology to govern their analytical modeling. There are others; I'm not saying pro or con on that one, but just as an example, think of it: it's not "governance is governance is governance," right? And then there's a higher level of governance of, should we even be doing this, you know, kind of the why and the how and the what and all of that. So that sort of leads me to the next slide, on when to use this. So there's a lot of cool stuff. I mean, we could spend hours playing with it, it's really fun, but what are some guidelines? So the big one, and if you've heard me speak before, I always go back to this: is this useful in supporting my business initiatives? Only do it if yes. Which slides into the second one. Are you doing this to play with some cool technology? Yes, we all want to do that, but that shouldn't be your main driver. And I've had too many companies where I've come in and it was someone that just wanted to play with something, and that's how it got decided, and that's not good business value. Is it ethical? Exclamation point, only if yes. Just because you can doesn't mean you should.
And I don't think our laws have caught up with what data can do. So use your own mental compass, moral compass, especially if you're in retail, et cetera; these are your customers, right? Do unto others as you'd wish them to do unto you. And then, do I have the right data set to support it? Only if yes, right? If I'm trying to do a suggestion engine and I only have five customers, it's probably not the best thing. Maybe it's not the best data set to support that. So there are plenty of other guidelines, but even just starting with these, I think, might go a long way. So here are some examples that are fictitious, but some are related to things I've run into. So maybe here was your business driver, here on the left. Our customer satisfaction rating is low. What can we do? How can we use data to support that? And then someone comes up and says, I know, let's implement a facial recognition program that detects whether a customer is smiling when they're online. Okay, some of you might think that's a good idea, some of you might think it's a bad idea. So let's go through the decision tree, right? Is it helpful to support my main business initiatives? I don't know. I don't often smile when I order online. It could be that only 25% of our customers order online, so that's not even the right data set. I think it's strange. I might be wrong, but to me that doesn't seem like the best way to assess customer satisfaction, right? There are probably other ways to do that. Do I get to play with some cool technology? Yes, but don't waste my money and my customers' to do it, right? Is it ethical? I don't know. Is there a law that says you can't do that? I mean, in some parts of the world there are, but it does seem sort of creepy. Would I want a company to be doing that to me? I know if it were me and they wanted to do that, I would probably not order online. I mean, I'm bad.
I actually went to drop off my dry cleaning the other day, and they asked for my phone number, and I said, why do you need my phone number? It's a new dry cleaner. Well, it's a primary key. They didn't say primary key; they said, we need to identify you. And I said, well, that's a terrible primary key. We had this whole argument, and I actually ended up walking away and not getting my dry cleaning done. I said, I should not have to give my personal information just to get my dry cleaning. And she said, well, then just make up a phone number. So I did, and she said, could you repeat that? And I said, I don't know, I just made it up. So I walked out, but I did notice, because I actually liked that dry cleaner: I went back a month later and they had stopped asking for a phone number. They don't need my phone number to dry-clean. Anyway, but we should think of that, right? Is it ethical? And then, do I have the right data set? And that's sort of what I got to before: is smiling when you're ordering the right indicator? I have ordered several things I've liked, and I don't tend to smile while I'm doing it. So maybe that's the wrong indicator, and only 25% of our customers are online, so that's not really good, et cetera, et cetera, et cetera. So I would vote that this is not a good indicator. I don't know, different folks in the chat might feel differently. But I was in a meeting just the other day with a customer, and it was a mix of all ages, all genders, all whatever. So you can't say young people thought a certain way. I was surprised by the different reactions. And we were trying to say how we could support the customer journey. And some folks went right to what I thought was the creepiest thing in my life. We'll film everyone coming in. We'll do facial. And then some folks were thinking, but we can't even get their invoices right. Maybe that would help customers, right?
So I'm often surprised that some folks go right to the extreme, when there are a lot of ways you can use data to monetize, and maybe it's some of the old-fashioned stuff. Could I get their name right on the bill or something like that? That might be the better thing to do. So before you think I'm an old fuddy-duddy that doesn't want to do anything fun and new, let me just alleviate that notion. So here's an example from a university; I've been doing a lot of work with universities lately and find it fascinating. And here's an example, and it's true: a significant percentage of students who are accepted to college in the spring just don't show up in the fall. They call that melt. So this example, if anyone's an NPR geek, they actually did a podcast on NPR about this. And I picked up on it because one of my customers was doing the very same thing. So Georgia State was in the NPR piece, so I can mention this; I tend not to share companies' information unless they want me to, but that was publicly stated. So again, that is a problem. And why the heck is that? They took the SATs, they got their grades, they even submitted financial aid, they've gotten everything, but between spring and fall, something went off. And one of the things I'm doing with one of my customers is actually mapping the student journey. And we kind of did a process map of all the things that happen. And one of the things that was an eye-opening moment for me was how many things happen from graduation day of high school to even your first day on campus. And I had a nephew going through this. So did you get your immunizations? Did you put in the financial aid? Oh, your name's wrong in the financial aid, you have to do this. And you call on the phone, and I can see where kids were frustrated, especially, they were saying, first-generation students, where they didn't have a support group. In high school, you had a support group. And in the summer, there was no one to even ask.
So they implemented a chatbot, and it handled basically some of your basic questions. My financial aid form won't go through, what do I do? And they interviewed a student in this NPR thing, actually, just the other day. And the student said, I like that because then I didn't feel dumb asking someone; I could ask the chatbot, no one's going to judge me, right? So this is a case where I thought it was great, outside the box. I probably would not have thought of that, right? Even though I am brilliant. All right, so does this support my main business? Is this necessary? Yes, this was one of the main pain points, not just financially but ethically. These are the kids that are your prime candidates, and they're not getting there for what reasons? Because they can't figure out a financial aid form that's wrong? We should help them with that. Do I get to play with some cool technology? Yes, and another neat thing is, so does your target customer, right? Students live on their cell phones; they're used to the world of chatbots. Very different situation from the store where most of your customers are buying in brick and mortar. I mean, just a thing to think of. In that case, the solution matched the audience. Is it ethical? I think in this case, fine; the students choose to interact, they went to the bot. There's nothing there that they wouldn't have gotten online or by calling; this is just a lot easier. And then do they have the right data set to support it? They did; I mean, all this information existed. You could have called the admissions office and gotten this information, but it just made it a lot easier. So this is a great example of asking those questions where it really did help, and they had a proof of benefit and it worked out really well. So a good mix of the right data, the right issue, the right technology and the right consumer of that technology. So to kind of summarize this random journey, AI and machine learning are here to stay. There are some exciting opportunities.
I do kind of follow the industry in this and I think it's just amazing, some of the new things that come up that we haven't thought of. But again, AI, artificial intelligence, sounds so futuristic, yet we're using it every day, and almost every product now is starting to embed it in some way: your car, your bank, everything. But again, if the data quality isn't right, it's not going to be so good. Make sure we're governing that data and have the right ethics in place. And choose the right scenario. I think we all, well, speaking for myself, I love to play with this stuff. You just want to pick the right tool for the right job at the right time. I do this for a living; if you need help, let us know. This is the white paper that I mentioned. It's downloadable both from DATAVERSITY's as well as our website, which has a lot of good facts on everything from AI to metadata, et cetera, et cetera. If you enjoyed this or one of the past ones, next month we're talking about data as a profit driver, which is kind of a nice little extension to this one. Again, monetization of data is huge. A lot of my customers are trying to figure out what makes sense there. And that'll be a panel, which is a little different. So hope you can join us. And at this point, Shannon, I guess we can open up to questions. Absolutely, Donna, thank you so much for this great presentation. If you have questions, submit them in the bottom right-hand corner in the Q&A section. And just to answer the most commonly asked questions, a reminder: I will send a follow-up email by end of day Monday to all registrants with links to the slides and links to the recording of this session. And Donna, I've got to tell you, I am right there with you. I am totally geeking out through this whole presentation. And I want you to know that I asked Alexa the other day when she was going to be cognizant. She didn't know.
And if there's anybody on from the hospitality industry, I was so annoyed with traveling recently when my hotel room was not automated and not voice-controlled through AI. The fact that I had to look for the remote thingy, that is just so old-fashioned. I can't tell my- Well, see, I will differ with you, because I will not stay in a hotel room with AI, because that means they're recording you. They're turning that on or off. So that's where the ethics and security come in. I heard somewhere that Marriott was going to make that standard. And I said, creepola. See, my creep factor is different than your creep factor. Oh, I was missing it, man. See, that gets back to this know-your-audience, right? Right. I'm a part of the minority. Other folks love it. We have a question about the previous recordings. So, Janet, they are on dataversity.net under on-demand recordings. And I will also send a link that just lists all of Donna's recordings. And what are some of the data security concerns that need to be considered as companies begin to use AI? Excellent question. I mean, the obvious one is, are we protecting and understanding personal information? That's one. Are we sharing anything? I mean, I'm working with one company right now and it's an internet of things, kind of like a Fitbit kind of thing. And talk about personal information. They have not only the identity of me as a customer, but my heart rate, my health. And think of anything there that could be nefariously used, right? A good exercise on that, and unfortunately in this world we have to think of it, is: how could this go wrong? So protecting your data, your customers' data, photos: could they get into the wrong hands, be used the wrong way? So that's some of it. Protecting your algorithms; again, that utility that I mentioned, what they had that was amazing was their data in general, but also their algorithms.
So they were doing a lot of, it was a water company, a lot of data-driven decision making through their algorithms. And just think of that getting into the wrong hands; this is a public utility, the water of thousands or millions of people across this area. Imagine someone had hacked that or gotten into the code. So, a part of it is an ethical concern: am I sharing information that shouldn't be shared? Can someone hack it? Should I be putting this on the cloud? Are there regulations against doing some of this? Where am I storing the data? And so some of it is similar to your typical data management, but I think when we're getting into algorithms, and especially personalization with consumers, just like the one that Shannon and I were arguing about; there have been cases of children's toys that record your child and then sort of publish that out to folks, and do you want that? Maybe it makes sense to send it back to development, but can that escape, so that somebody else can suddenly hear what your child was saying in the privacy of their home, right? So, a lot of that just to think about. I guess the main thing is, what could go wrong? You know, think of it as, it could be scary. It is scary, because you're right. I mean, there's so much that can be recorded and is recorded, but I still love my convenience, I've got to tell you. Well, and it's funny, and I think part of it, well, you and I are a similar age, so it's not generational, but I'm working in one company with all millennials, and they laugh so much at my rants about security, because I am a data person and I will not give out my phone number. They always ask; I'm trying to buy a pair of shoes. What's your home number and your zip code? Like, you don't need to know that. And they say, you're paranoid. I'm like, have you not been listening to the news? Like, no, I don't want to. I'm like Zuckerberg; I have tape over my laptop camera right now. No, you're not fixing it. That as well, yeah.
You don't want to be as paranoid as me, but you know, maybe you don't want to be as open as Shannon, right? There's probably a happy medium. I do tape over my webcams, I do do that. All right, yeah, you're learning. You just don't want to see me in my sweats right now. I mean, I'm just saying. One of my customers, and I definitely won't share the name, was a telephone company, and I was reading the spec of everything they can track. And my creep alert went off. Wait, I mean, in that company, and I worked with them, they were actually anonymizing all of it and they were actually very ethical. But the possibility was there. They knew who I talked to, when I talked to them, where I was, everywhere I've been. And that's a good point when you think about security. I don't go anywhere interesting. I go from office to office, but you know, someone could track my patterns. There's one more, just one more quick thing. Oh, go ahead. Yeah, go ahead. Talking about data sets and thinking outside the box, there was one I saw the other day and it was on these smart homes. And they were actually saying that could be used for domestic violence. When you think of it, someone can lock your doors or shut off the heat from a cell phone. And I personally wouldn't have thought of that. And they're saying, do you have a diverse group of people in your development organization, who maybe have been through that? And again, it sounds so horrible, but with every technology that's cool, you kind of have to think, how could that be used for bad? You know, someone's recording me. That's cool, but when could it turn not cool? You know? I don't know. Just things to think about. It doesn't mean you don't ever go outside and do anything, but it's something to think about. Sure, absolutely. We do have another question that came in. What skills or tools are needed for machine learning? Excellent.
So yeah, and I would say there are a lot of good resources out on YouTube and things like that for machine learning 101. So some of it is basic statistics and analysis skills. So data science-y type skills, knowing how to write an algorithm. Again, for some of it, the algorithms are built in, so it's basic coding skills, and things like Python can be used. So it sort of depends on what level you want. I was pleasantly surprised that with a lot of these you can build a chatbot in an hour at a very simple level with some basic coding skills, because there are some APIs. And then you can also be the one writing the algorithm and developing new techniques. And I think that's more your statistical analytics and algorithms and that stuff. You probably learned that way back in university, if you did. As well as basic data management skills, right? How do I know it's the right dataset? How do I cleanse the data? How do I govern? So it's sort of like that elusive data scientist to get right. I think you have to have a good mix of skills, but it's a little more heavy on the programming than maybe this typical audience, and a little less on the pure data management. You're really building algorithms on a code level. So we've got one final question here. It's just kind of a statement that maybe you can make a recommendation on. Building a lab to investigate: recommendations for system software or OS? Oh, I don't know. I think there's a lot. I mean, the security concern is one. I mean, a lot of this stuff you can be building just in your Python script or whatever coding environment. But depending on your dataset, I mean, with things like AWS, you can kind of spin up little sandboxes. And still, to me, and I'm more on the data than the coding side; I've been a coder, but a lot of it is, how do I get that right dataset?
And a lot of folks, or my customers, are kind of building some of these scalable systems, but I'm always amazed that, if your security is right, with some of these cloud technologies you can spin up some cloud sandboxes really easily. So a lot of it can be done on the cloud and spun up easily. So AWS or something like that has some nice options. And actually, one of my slides talks about some of those platforms: Amazon has one, Google has one, and IBM has some. So they have platforms that are built in with these APIs. And was it this one? Yeah. And with some of those, Amazon for example, it also comes with that data layer. So that's something to think about as well. Sure. So we've got just under a minute here, but just briefly, can you describe the major differences between a statistician and a data scientist? Oh, question of the century. And we could have a whole conversation on that, which is also slightly different from machine learning, and it's a topic everyone loves to talk about. So there are some similarities in a lot of the core skills. In fact, I took some data science courses online, actually to keep up on my skills, and so much of it went back to my statistics background from college, right? But I think the elusive thing that makes the best data scientist, and that I've heard a lot of customers and folks complain about, is that human aspect, which to me is in a way what makes a good data architect too. It's someone that understands the statistics and the coding and the data sets and how to manage that, but also the business case. And so a perfect data scientist, I think, can do both. They can understand the business, understand what needs to be described, understand the "so what" that comes from the data, and then make decisions. So it's not just building the algorithm or finding out that 75% of our customers are based in New England. 
No, that's a stat. That's not so interesting. But here are some purchasing patterns that may make sense to our business, that may be something to think about, or here's a decision I can make. To me, and in a lot of the discussions I've had with folks, that's what makes the scientist: the discovery, the analysis, and the "so what." Well, Donna, thank you so much for another fantastic presentation, and thanks to our attendees for being so engaged in everything we do. We hope to see you next month and I hope everyone has a great day. Thanks everybody. Thank you.