Good morning, good afternoon, good evening, and welcome to Red Hat live streaming. I'm your host with the most of the channel, and today we're talking data services, particularly data science. I'm joined today by Sophie Watson and Audrey Resnick. How are y'all today?

Doing pretty good. Greetings from Arizona.

Is it still warm there, I guess, is the question?

It's over a hundred today. Remember, it's a dry heat.

No, I know it's a dry heat. I've been in the dry heat. It's when the dry heat sucks all the water out of you that it's a problem.

You just have to remember to keep hydrating.

There you go. Sophie, we're talking about day-to-day data science, which is almost a tongue twister.

I like how you did that there.

So how are you, Sophie?

Very well, thank you. How are you?

I'm fantastic. The fire is going in the background because it's only going to be a high of 70 today. It's officially fall here.

Very jealous. It's going to hit a hundred here as well, so y'all can keep that.

Thanks, I'll stay up here north of Canada. So, we've had a few sessions where we've chatted about data science and all things data science, and we focused on data engineering on last month's call, so if you missed that, go check it out. That was a pretty good discussion. But I feel like people don't know what data scientists actually do. So, Chris: what do data scientists do?
You all do... data. Actually, no — to an extent I do know, because I've worked with data scientists in the past in my DevOps roles, getting them to learn how to use containers to run their models and run everything at their own pace. So I know that you look at a ton of data, and you create algorithms, or use algorithms, to parse it and find things within the data that can help organizations down the road, maybe detect future things. It is pretty opaque to me, I will say, because a lot of math is involved, and if there's such a thing as a phobia of math, I have it. It took me ten years to take an algebra class, so it's pretty bad. Anyway, how many fingers am I holding up? No — five, I can count. But once we get past the basics of the little calculator in your OS — cosines are lost on me. Algorithms are not lost on me; I get sorting algorithms and things like that. But the whole thing with sines, doing that kind of math — no, I have no desire to learn that level of math detail.

I think that's interesting. You've hit on a few points there, and you talked about what data scientists do —

Sorry if I derailed you.

No, not at all — I think it plays into where I want to take this. You gave this really nice overview: we explore data, we try to find patterns in it. We might use an algorithm — which most people would just call a model — or we might write an algorithm, a model, or a process to explore that data. And then you also said that we might make predictions with it, that we might try to predict the future. I like that a lot, because I think there's this notion that all data scientists do is train models, and I don't think that's the case. There is a degree of that in the work, right? There's some churn.
Right — there's a grind to the work, and then there's the actual discovery phase, the aha moment.

Right. I think that describes it well, too. The one thing I think most people don't realize is how much time we spend with our customers trying to understand what problem they're trying to solve. That leads us to ask more questions of the data. I always look at the data and tell the customers: the data is going to tell a story, but I need to ask you questions. How do you use the data? What are you hoping to use this data for? So besides the mathematics, there's a bit of what we'd call domain expertise.

Yes, and that almost gives you a Venn diagram: domain knowledge, mathematics, and programming, with the data scientist sitting in the center where they overlap. But I don't think that's the whole story, because sometimes data scientists don't end up creating a model at all. They may look at all the data and what the user is trying to accomplish and say, well, you can answer your questions here just by using Tableau. Sometimes there are tools you already have that you could throw your data at.
Yeah, that's correct. And there are a lot of data scientists whose role isn't ever to train a model and make predictions; it's simply to produce reports and process that information, which you then pass back to stakeholders so they can make decisions.

I traditionally see that as a business analyst kind of thing. Do you see the two roles merging, or am I just old in my thinking?

I think it's trendy to call everybody a data scientist. So yes, a lot of people we might have called business analysts five or ten years ago could feasibly be called data scientists now. And it really varies from company to company — when Audrey and I talk to our customers, you get such a range of what it means. Often an account will call and say, hey, can you talk to our data scientists? So we'll sit down with them, and the first thing we ask is: what do you actually do? What do you think your job is? How do you spend your time?

Is it more like an interview than a work session?

It is — we try to figure out what they do and what their pain points are, and then how we can step in, with containers like you said, and ease some of the friction in their day-to-day going forward.

And sometimes they know they have a problem, and they know the answer can be found in the data — but when you start looking at the data, it becomes: did you know that you could do this with your data?
For example, I worked with a mining company that was looking at their data for truck optimization, and they were saying, well, we have to spend 13 million dollars to buy these new graders to grade the roads. And I was looking at the data going: well, did you look at your data to see how much time your current machines are actually being used? And they said no — we just know they're all over the place, but we don't know if things are getting done. So asking questions and probing is very important, because sometimes you'll find these little hidden gems your customer didn't know about.

Right — like a couple of million dollars. There's so much potential, because you come in with a fresh set of eyes. It's almost like a consultancy within a larger organization. Like: I know you think you need to grade that road, but do you realize that road isn't used often? Something like that.

Or did you realize somebody's been taking a two-hour break? They've just been sitting in that spot for two hours. I don't know.

Not me. Absolutely not. No, never you. My watch never tells me to move, I promise.

Chris, one of my friends just texted to let me know that you're on fire, and maybe you should do something about that.

It's digital, so don't be afraid, folks. It's normal.

Okay, okay.
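That grader question — "did you look at how much your current machines are actually being used?" — often comes down to a few lines of aggregation over an activity log. Here is a minimal sketch; the log format, machine names, and ten-hour shift length are all invented for illustration:

```python
from collections import defaultdict

SHIFT_HOURS = 10.0  # assumed shift length

def utilization(log):
    """Fraction of the shift each machine spent working."""
    busy = defaultdict(float)
    for machine, start, end in log:
        busy[machine] += end - start
    return {m: hours / SHIFT_HOURS for m, hours in busy.items()}

# Invented activity log: (machine_id, start_hour, end_hour) per work interval.
log = [
    ("grader-1", 0.0, 2.5),   # 2.5 hours of work
    ("grader-1", 6.0, 7.0),   # one more hour
    ("grader-2", 0.0, 9.5),   # busy almost the whole shift
]

print(utilization(log))  # grader-1 is idle most of the day
```

A summary like this is often the first "hidden gem": before buying new machines, you can see whether the existing ones are anywhere near capacity.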
Thanks for clarifying.

So, Audrey — and you too, Chris — also talked about how math, or maths as us Brits call it, comes into this. I feel like there are two classes of people: those who think data scientists just train models, and those who think data scientists sit down and do absolute next-level mathematics — pen and paper, Greek letters, all day every day.

Yeah, Greek letters all the time.

So I think it's important, when you choose a particular model or algorithm, or process your data in a particular way, to understand the underlying mathematical and statistical assumptions you're making about your data, or about the relationships between variables in your data. But I don't know about you, Audrey — I haven't sat down and written Greek letters on a sheet of paper for a very long time.

I'd have to say that was probably two years ago for me. I do have some ex-colleagues from the company I worked for before who did sit down and do that, but generally I'd say it's very rare. You just don't sit down all day and churn out mathematical formulas or algorithms.

If you're not sitting there churning out math, then what are you doing?

I would say, when you look at a data science problem, 90% of the time is spent looking at the data and trying to understand it very well. That means you could be doing a number of small analyses — looking at some charts or graphs of the data — and then going back and asking the customer questions. I remember one of my mentors from five years ago saying that if you get a problem and some data, and within a week you're already going ahead and creating a solution —
— you probably didn't spend enough time on the data. And I find that very true. Once you start immersing yourself in the data, looking at the relationships between the different variables and thinking about what the customer wants, you usually end up digging around for more data and going back to the customer saying: I found this relationship, the data is telling me this — is there any other data you use that you haven't told me about? And a lot of the time the customer goes: oh yeah, I totally forgot about this other database that we use to get XYZ. And then you're off again, exploring the data one more time.

I've got to ask: is it frustrating, to an extent, when people don't realize all the data they have? They give you data and say, hey, find something — do you feel like you're looking for a needle in a haystack, or hunting for ghosts sometimes?

That's probably what makes data scientists data scientists — we're kind of crazy, we dig stuff like that.

For me, that's going ghost hunting later.

There you go. Exactly.

So, we say day-to-day data science, right? What does a data scientist do day to day to help the organization? You said immerse yourself in the data, and that it takes longer than a week to look at it. At what point does an organization come to the data scientist and say, hey, we could really use your help right now? What do you do to make that connection between your day-to-day work and organizational goals?

First and foremost, it starts with a lot of discussions: understanding what success means for the stakeholders, what they want to find out, what information they've got. How will you know if you've done a good job?
When is enough enough? So I think it starts with a lot of discussions. Audrey?

Yeah, I would agree. You don't want to keep going back to a problem saying, if only we had more data, if only we looked at it this way. You have to be cognizant of the amount of time you're spending on a problem as well. If you're spending too much time on it, the cost of you working on it may outweigh the benefit of just getting a short-term answer. And sometimes just getting a small answer — or, if you did create a model, just watching what the model generates for a few weeks — can actually give you more valuable insights. So for all of you who like looking at data, talking to different people, and trying to glean gems out of what your model may produce or what the data might tell you: come on in, be a data scientist.

So what are some problems you've seen personally? I don't know if you can talk about all of them — feel free to omit company names — but what kinds of evolutions have you started?

For me, the most fun I've ever had with a large data set — one that kept generating more opportunities — was an open-pit mine. That sounds terrible, but it was for oil sands. First we were trying to determine the optimization of these large trucks carrying the raw material around on the roads. The first thing they were looking at was: can we optimize the paths the trucks take on the roads?
Then it became apparent that sometimes, when things were slow, it was really the road. So then: how can I determine if a road needs repair, or needs a grader? And then it went to: how can I track my graders to see what kind of optimization is being done? Those are very general, high-level looks at what you can do, but that project lasted a year and a half for me, just because we kept finding more valuable insights as we went along. The first time we talked with this customer, they said, yeah, we want to do this, and we were scratching our heads going: well, do you happen to have any GPS data? And they said, oh yeah, we've been collecting that for the last five years, but we haven't done anything with it. Each of the trucks on the mine had a tracker, and each time one passed a certain marker or checkpoint, you'd get the speed of the truck, the date —

Fuel?

Yeah — so with the time and the speed you could get distance, fuel consumption, everything. And they'd just been sitting on this for years, because nobody really knew what to do with the data. For a data scientist, that's the treasure chest of data science: you can go into a problem and there's just so much to look at, so many insights to gain. And I'd have to say the most important part of being a data scientist — Sophie, agree with me or give your input — is really sitting down with the people doing the day-to-day tasks. I'll add one more item and then hand it over to Sophie. I remember talking to one mine engineer and asking, well, how do you know where all the trucks are in the mine? And he said: oh, well, on the back of my crossword puzzle page —
— I've written down where everybody starts. So I kind of know in the morning that this person is in this ten-square-meter area, and I have to jump in my pickup truck and track them down. And I'm like, good god, there's got to be an easier way we can do this. And we could, eventually — because we knew the GPS coordinates, we could find out where a person had been within the last half hour. These are the sorts of things you can find.

Wow. Sophie, I think of things like the UPS drivers minimizing their left-hand turns as a data point for fuel economy. What kinds of discoveries have you made in your data mine? And then I have a question from the audience for both of you.

Oh gosh. Okay, I'm just going to talk for the next 40 minutes so we never get to the audience question — sounds scary. I should throw out a disclaimer: although my title is data scientist, I spend a lot of my time with data scientists, and we're often helping and advising them in their roles and daily work. I've had some real fun thinking about recommendation engines in the past — for Amazon, say.

Right — how many different things do we think we need recommendation engines for?

So, yeah — take Amazon. What's the most frustrating thing about purchasing something from an e-retail site? They recommend that I buy something similar after I've already bought the thing.

Exactly — a new ironing board, and they recommend six of them over the next week. Versus something like a TV streaming service: if you watch a whole horror show, it knows you've watched it. Doesn't it make sense that it recommends something similar?
Something similar with the actors, the genre — there are so many data points to touch there.

Yeah, exactly. So even though at the end of the day we're just recommending — we can think of all these recommendation engines as simply suggesting products to users — you've got to think about the domain. In the movie case, we want similar things to continue to be recommended to users; whereas in the case of "I just bought an ironing board," I don't want another ironing board, because I just bought one. I really like things like that. In terms of recommendation engines and algorithms and models, there's a known set of algorithms people use for this. When I approach a problem like this, I'm not writing a new recommendation algorithm from scratch; I'm not getting the paper out and the Greek letters and making something up — there are known algorithms we'll take for this. But you can't just throw your data in and expect a good answer. You've got to think about what you're going to return to the user, what's important and what isn't. And it goes back to that notion of: how do I know if I've done a good job? The recommendation stuff was so much fun because at every turn there was a new facet to think about. Like you said with films: do you recommend things with the same actors? Or do you recommend things set in the same area? When we moved to Oklahoma and lockdown started, we decided we were going to watch every film that had ever been set in Oklahoma. I can let you know that we got about three films in, and didn't finish the third.
But there's still hope. Anyway — there are so many facets, so many ways you could recommend things and chain things together, and then it all falls down immediately as soon as you transfer that to a different domain, even though it's the same algorithm doing the recommending. The other thing I got really hooked on was Spotify.

Oh yeah — I have a lot of beef with Spotify's recommendations. Beef, because sometimes the songs it recommends are not anything I would listen to.

Are you sure some of your kids aren't actually —

No, my account is safe from the kids.

Okay. So the thing I find fascinating about Spotify is the way it puts together these daily playlists, which are a mix of things I've listened to before and liked — but I've never told Spotify I liked them. So how does Spotify know that I like them? Any suggestions?

Well, if you've got a song that you like, you listen to it a lot, right?

Right — the more you listen to it, the more it learns. It's not like when we buy something on Amazon, or watch a film and rate it five stars. We don't really do that with music. Instead you just listen to it a lot, perhaps in repeated patterns, and it incorporates that information. So it's not only asking which songs I've listened to before; it's asking which songs I've listened to many times. I always like trying to mess up the algorithm. See what happens if you put on a song you don't like overnight — leave it on repeat in the other room. Is that going to get into your algorithm the next day? Is it going to be on your daily playlist in a week's time? Who knows — figure it out. It's trying to reverse engineer: how are they making these decisions?
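That play-count intuition — if you like a song, you listen to it a lot — is the core of what the recommender-systems literature calls implicit feedback. One common trick is to turn counts into confidence weights rather than treating them as ratings. Here is a toy sketch of that weighting idea; the `alpha` value and the listening history are invented, and this is an illustration of the general technique, not Spotify's actual system:

```python
def confidence(play_count, alpha=0.5):
    """More plays -> higher confidence the user actually likes the song."""
    return 1.0 + alpha * play_count

# Invented listening history: play counts stand in for explicit ratings.
plays = {"song_a": 40, "song_b": 2, "song_c": 0}
weights = {song: confidence(n) for song, n in plays.items()}

# song_a, played constantly, dominates; song_c, never played, keeps the
# baseline weight of 1.0 -- zero plays is weak evidence, not a "dislike".
print(weights)
```

A recommender can then fit its model against these confidence weights, which is exactly why repeated listens matter more than a single play — and why leaving a song on repeat overnight really might leak into your recommendations.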
The other thing it does with those daily playlists, which is fascinating, is that it makes them coherent. It's like somebody sat down and crafted you a personal mixtape. There are four different daily playlists, and they each cover similar genres and similar moods — it's not always that playlist one is pop and playlist two is rock. It blows my mind; I think it's so clever how they're using their recommendation engine. And they're not just recommending things you've never listened to before — they do mix those in, and they've got some algorithm somewhere determining how often to mix new songs into your playlists — but they're also recommending things you've already listened to and putting those in your playlists as well. So again, it's different from the movie recommendations and the online shopping recommendations, where it's not going to recommend things you've bought before — unless it's something like printer paper, in which case there's another algorithm running that knows you need to buy printer paper every three months, because that's how frequently you bought it before. So basically, I think recommendation engines are fantastic, and I'm going to stop my rant there.

So, a question: someone's curious whether there's a good, known path to switch from data scientist to ML engineer with MLOps expertise. They say: I really enjoy training models and creating APIs, and I'd also like to take advantage of OpenShift to do that. So it's a long question that basically asks: is there a path for me to work more on APIs with my machine learning capabilities?

Say hi to Max back there.

Hey, Max.

Yeah, poor guy. I think that path is actually pretty seamless. It's very similar to what I've done.
You can start in data science, but if you get interested in any of the MLOps side, it's very easy to slide into that role. Right now I'm working with a number of MLOps engineers, looking at how we deploy some of our models and what the best future approaches to deployment could be. I'd say the data science background is actually very important, because there aren't too many of us who span those multiple roles, and you can give very good insight into how MLOps is done. Remember, most MLOps engineers may not know a lot about data science, so they may not take into consideration that you might want to use a couple of different techniques for looking at your data — whether it's Grafana, or whether you're using Apache streams to pull in your data. They're only looking at how you're doing your model delivery, and sometimes you have to look at how the whole picture fits together. And if you're looking to get more into MLOps — at least within Red Hat — there are a number of really good courses you can take: an intro to OpenShift, an intro to Kubernetes, and it goes on from there down the MLOps stream. So I think that switch is very easy, it's a very good switch, and it makes you more valuable.

Yeah, I just dropped a link in chat for you — sorry, I don't know how to say that username — a link to the catalog of courses.

Chris, can I share my screen?

Absolutely.

All right, let's see if we can get technology to play. Can you see my screen?

Yes.
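One concrete piece of the model-delivery picture described above is bookkeeping: recording, for every deployed model, which parameters it ran with and a fingerprint of the data it was trained on, so that "what was live on date X?" is answerable later. Here is a minimal sketch of that idea — the model name, parameters, training rows, and record format are all hypothetical:

```python
import datetime
import hashlib
import json

def deployment_record(model_name, params, training_rows, deployed_on):
    """Bookkeeping for one deployment: enough to answer, later,
    'what was live on this date, trained on what, with what settings?'"""
    data_bytes = json.dumps(training_rows, sort_keys=True).encode()
    return {
        "model": model_name,
        "params": params,
        # Fingerprint of the exact training data, for later matching.
        "training_data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "deployed_on": deployed_on.isoformat(),
    }

record = deployment_record(
    "churn-model-v3",                           # hypothetical model name
    {"max_depth": 6, "learning_rate": 0.1},     # hypothetical parameters
    [{"tenure_months": 14, "churned": False}],  # hypothetical training rows
    datetime.date(2021, 10, 1),
)
print(json.dumps(record, indent=2))
```

Real MLOps platforms do far more than this, of course, but even a record this small is the difference between being able to audit a past decision and not.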
Okay. So I think when we think about MLOps as the discipline and process of operationalizing machine learning, it's a really important thing to think about. The reason it's tricky — and the reason there's this need for the MLOps engineer — is that data science is hard, but data scientists aren't application developers. We don't necessarily know best practices around things like version control and source control, or making code repeatable in the same way you'd want a standard application to be repeatable. It's really important that your machine learning code is completely repeatable and reproducible, particularly in industries where things are audited. For example, if I have to go back and answer "why did I deny this person a mortgage on date X?", then you've got to be able to say: okay, what model was in production on date X? What data was that trained on? What parameters were running for that model? And obviously — perhaps that's the Red Hat in me — that's where containers come in: containerize this with container images, everything's going to be immutable, and that really helps the story. I just like this slide because it captures many of the challenges of reproducibility. So I think if you have the skill of producing these APIs — putting models into production and interacting with them — but you're also aware of the underlying data science, then you're in really good stead to make sure we don't make any silly decisions going forward.

How much data science have you seen in medicine lately? I'm just curious from a personal perspective, having a couple of different injuries. Is it okay to ask?
Good, because I would love some kind of model that could tell me whether I'm going to have a high-pain day or a low-pain day. Something like that would be a life-changer for me: environmental conditions, activity the day before, that kind of thing — being able to tell me, hey, you should take it easy. There are all these stats out there that people look at, like that's the magic number for disabled people: if this number is low, you're good; if this number is high, you should probably take a break. We're all searching for the holy grail of data, essentially, to give us an idea of how our bodies are going to respond. So I'm super curious about the medical sector in general, because I know gas, oil, and natural resources all do this; I know shipping does it, and logistics is doing a ton with data. But medicine feels like this area that's ripe for a lot of improvement.

Medicine, particularly in the last couple of years — they've always done a lot with imaging. Not that imaging is the only place that's leapt into data science, but they've done a very good job of taking, say, X-rays, CT scans, or even the video used in colonoscopies, and doing anomaly detection — looking for cancerous or benign tumors. That's been very useful. And with COVID, they've taken X-rays of normal chests, chests with pneumonia, and chests with COVID, and asked: can we build a model that detects COVID very quickly? In that respect they've done not too bad a job with the data they have. But within medicine it's not just these X-rays — they're using it for DNA analysis, for heart analysis.
Even ten years ago there was a company I was talking to that was monitoring people's heart rates from home. A person would wear a monitor, and if there was an irregular heartbeat or something, somebody could check in with them to make sure they were okay. So data science has been around in medicine for a long time, and the things we're seeing right now are great — plus the health trackers you're talking about; there are tons of apps out there that you can just download and use.

Now, that being said, I think everybody can think of an example where AI has been ridiculously wrong. For example, if you train an AI to identify different types of footwear — high heels, sneakers, football boots, et cetera — you could do that with photos; we've got object detection, so we can train a model. But it's really easy to confuse it: all you've got to do is wear your high heels on grass, and it thinks they're football boots, because it's picking up the grass that's usually in the background of football-boot photos rather than picking up anything about the shoe itself. And that's just a benign example — there are chatbots that have gone very wrong very quickly. So I wouldn't put my life in the hands of an AI diagnosis. But there is information we can use, like you said, Chris — "don't go outside today because it's awful out there" is useful information, and it's not detrimental to us as humans. Versus if an AI says you need extreme surgery, but a doctor says, no, I don't think you do — I trust the doctor.

Hey, Carl. Welcome!

Oh no, we can't hear you. There are magic buttons to be pressed — if only we had some AI that could tell you, "you're talking and you're on mute."

I'm still on mute.
No, you're good now — we can hear you.

Not only do I show up late, I show up with technical difficulties. Don't trust my data science model right now, I guess.

Hey, it's just a good reality check.

Yeah. So it sounds like we're talking about the appropriate use of models in certain situations. Is that right, or am I missing the beat here?

I think that works pretty well. Chris was asking about healthcare use cases, and Audrey was talking about the really exciting work going on with medical imaging and with forecasting and prediction. And then I was just being a natural pessimist and saying: don't get ahead of yourselves, folks.

And that's the thing, too: sometimes the models somebody comes up with, or some of the analysis, are based on such a small subset of test data that when you use them on a larger group, they get it totally wrong.

Right — and I'm not going to get on my soapbox, but there are a lot of scenarios that just aren't accounted for in the collection of data. Is the person in a wheelchair, for example, and they can't reach a thing, and that's why they're not using it? Stuff like that. Environmental factors aren't taken into account, and if you drill too far down you get data that's very skewed — it works for 80 percent of the world but not the other 20 percent. I feel like that's going to be a continuing problem in the industry. I mean, the Theranos trial just started — I'm going to throw that out there. There's only so much you can do with AI and ML right now.
You can't change a whole industry with it unless you truly are discovering some new way of doing things. That being said: are we in the right environment for data science and machine learning to flourish, or is it, in your opinion, a little too early to put full faith in our AI and ML? It sounds like, from the Sophie perspective, don't trust all the algorithms just yet. Obviously the music suggestion engine works for you, Sophie, but it doesn't necessarily work for me.

It's got an easy job with me, really — it just has to recommend Taylor Swift songs on repeat.

A low bar to clear. Got it.

I think people in general should be able to trust AI to a greater extent than they realize. You may not know it, but on the highways we do have a lot of autonomous trucks. Especially in the Tucson region, there's a company that produces autonomous freight haulers, and they test those vehicles. They haven't had any accidents in the last several years, and they've done testing through various weather conditions — of course, weather in Arizona is mostly sunny, but you can get dust storms, you can get rain, and you've got mountains to go find some weather in. So there may actually be situations where you're not even aware that AI is working well. And also, within the Imperial Valley in California, and here just outside of Yuma, we have a lot of agriculture that relies on AI for moisture detection: do we need to water some of the crops?
So there are a lot of things that AI is doing very, very well. But I think with any new technology, as soon as you start looking at something, there are going to be a few hiccups. So if somebody asks you, do you want to participate in this AI study, and you have heart issues or something like that, and they say, oh, by the way, you're going to be our first test candidate: that's when I would be pretty leery. Right. Yeah, like, read all the warning signs. Yeah. Having just spent my weekend, or at least a day of it, at a roller coaster park: past a thrill level of two, I shouldn't ride the rides, according to the warning signs. But that's just because, you know, are you pregnant? Do you have spinal issues? It's this list of things you should not ride with, and it's literally on every sign on every roller coaster. So it's one of those things where I'm glad they put that out there, because otherwise I'm just going off doctor's orders at that point, which were: don't ride anything. So yeah, you can do the kiddie rides, Chris. That's okay. Kiddie rides are fine, kiddie rides are fine. I just don't fit in them very well. So, Carl, I haven't asked you many questions because you just got here. Day-to-day data science: where are you seeing actual things happening with data science in the real world right now? Yeah, no, that's a good question. As Audrey mentioned, there are fantastic examples of AI and ML happening all around us. Some of them are big examples, physically big, like trucks on a highway, which is pretty interesting. But a lot of it too is, if you think about customer interactions, like interactions you have with shopping. You have, I guess, the music recommendation algorithm,
which you guys have already discussed, but there are also next-best-action type models out there: what products can we recommend to a customer, and how can we recommend things in the moment? It's not, how can we recommend things and then interact with a sales associate who calls you later that afternoon and says, oh, the item you're actually interested in is in stock, please come back to our store to purchase it. By then the opportunity is already gone, right? So a lot of the... I mean, I don't want to call anything easy, because it's not. The only thing that's easy about data science is introducing bias, and we don't want that. Maybe the more straightforward things, or the models I've seen a lot of interaction with, are those that exist in internet shopping: surfacing recommendations, recommendation engines, and next best actions. Those are the good examples, because it's an in-the-moment decision: your model will take the information it has, create an inference, and present it back to you. And, you know, Sophie mentioned shoes. If I'm shopping and I see an advertisement for shoes, it's not that big a deal. It's not going to upset me, even though I have no interest in shoes. It's a benign example. So we see a lot of these, where the model can act quickly, and even when it's wrong the downsides are limited, right?
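To make that "take the information it has, create an inference, present it back" loop concrete, here's a deliberately tiny sketch of a next-best-action scorer. Nothing like this was shown on the stream; the interest categories, weights, and offer names are all invented for illustration, and real systems would use learned models rather than hand-set weights.

```python
# Toy "next best action" recommender: score each candidate offer against a
# user's interest profile and surface the single best one in the moment,
# instead of queueing it for a follow-up call. All names/weights are made up.

user_profile = {"running": 0.9, "music": 0.4, "gambling": 0.0}

offers = {
    "running-shoes": {"running": 1.0},
    "concert-tickets": {"music": 0.8},
    "sportsbook-promo": {"gambling": 1.0},
}

def score(offer_features, profile):
    # Simple dot product between offer features and user interests.
    return sum(profile.get(k, 0.0) * w for k, w in offer_features.items())

def next_best_action(offers, profile):
    # Rank all candidate offers and return the highest-scoring one.
    return max(offers, key=lambda name: score(offers[name], profile))

print(next_best_action(offers, user_profile))  # running-shoes scores highest
```

The point of the sketch is the latency trade-off discussed above: a cheap scoring pass like this can answer while the customer is still on the page, which is the whole value of the recommendation.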
So it's a very benign example. The models that are getting all the headlines are the really big, hard-to-solve problems, like you've already discussed, in healthcare, where it touches lives pretty directly. Yeah, I mean, there was just a study I read that used a ton of data to break down kidney function in patients who had COVID, and the long-term effects of COVID. The data is starting to indicate that it affects long-term kidney function: if you get COVID at a younger age, you might need a kidney replacement at a later age, is what it's starting to look like. So that data, to me, is fascinating. The long-term effects of a novel coronavirus, that is truly interesting, because this is something that's going to be with us forever. We're not going to eradicate it overnight. So learning the long-term effects of getting it is vitally important, because there's a whole class of people now who are, quote, COVID survivors, and they might have long-term issues if we don't start looking at the science now and seeing what's changing in the people who naturally contracted the coronavirus. Right, and that brings us back to the two different types of data scientists, Chris, in my opinion. The thing you're talking about is taking that data, analyzing it with no sort of time pressure, making some conclusions in a report, and feeding that back to somebody, versus Carl talking about the AI that's ingrained in these systems.
So as a data scientist, if we were trying to solve this problem, we'd have to think: okay, which algorithms can I use? Because I could probably make a better recommendation if I had three weeks to churn through all of this customer's data, and every other customer's data, and pass it through a really deep neural net, and then flip it and reverse it and do something else with it, and then come up with this recommendation for them. But by that point, they're gone. They've had their strategy-changing sessions and all that other stuff, right? Yeah. So it comes back to sitting down with the stakeholders, understanding what's important, how you'll define success in the project, and then figuring out where to go next. Yeah, I mean, in both of these cases it comes down to the quality of the data and being able to ask the right questions. You've probably already talked about this, since I'm late, but the real mark of a data scientist is coming up with appropriate hypotheses, being able to test them, and then understanding the impact for the inevitable iteration that's going to happen as we continuously improve these models. And with something like COVID and long COVID, and all of the unknowns that are out there, we simply don't have enough data. We have intelligent people who can ask the right questions, so we're moving in the right direction. But if you think about longitudinal studies, a lot of these studies happen over 20, 30, 40 years. We're talking decades, right? So we're only scratching the surface on long COVID. Yeah, and the systems that are put in place today to start studying these things will still be churning away five years from now, continuing to study them, right? Absolutely. Yeah, it's pretty wild. So we've got about 10 minutes left.
Is there anything we want to share for that data scientist out there who's trying to break through in their work today and find something awesome in the data? Ask for more data. Is that just a stall tactic? No, come on, more data. I'm just kidding. Some of the things that I've seen, like in the financial sector, with ML models, or just models in general, I should say, because I don't really know what I'm talking about: person X has these accounts, so they probably would appreciate this product, kind of thing. Or person Y is applying for a mortgage, so they need to make sure their credit score is as high as they can get it, so here's what they recommend doing or not doing while you're going through the mortgage process. Like, don't buy a car when you're trying to get a mortgage. That's great advice, because it raises red flags to the people in the mortgage business, right? So I've seen those examples out there in my day-to-day work. What other examples do you think are helpful, that data scientists have created over time? Like, the annals of data science history. I don't know what would be, but what I find interesting is, just on our social media, how they've been changing their algorithms in the past couple of years due to a lot of the unrest or disruption. They're taking a harder look at the emotion and the context of various conversations that are happening, and flagging them. So whatever natural language processing they're using for that, or whatever algorithms they've developed,
I think those have been very interesting just over the past couple of years, and that's allowing a lot of the individuals within those social media companies to actually ban people or ban groups. I would say those are more visible, and actually more interesting. And I think it's good for a lot of them, the ones where they're actually banning a lot of people or groups, to provide just overall more stability and safety, though other people will say, well, what about freedom of expression? Right. Well, safety first is what I always say. Safety first. But yeah, it's an interesting discussion. I like the way it's going right now. It's very interesting. Yeah, I mean, what I'm hearing you say, Audrey, is that as data science, artificial intelligence, and machine learning become more prevalent, we have the interaction of data and inference and modeling with society as a whole, and that's a whole other topic we could spend hours discussing. And that's a really cool thing. It's a dynamic system, too. As society changes with data science and artificial intelligence, we're going to have to go back and update these models, and it's a continuously iterative process, given the interactions between everything here. So, it's funny that you mention social media, because I have a different, an opposing view. I like sports, so I get nothing but ads for sportsbooks and casinos all day long. That's all I see on Twitter for ads, when I'm using the native Twitter app: it's just nonstop casinos and sportsbooks. I am very much anti-gambling. Gambling is an addiction, and the models aren't addressing that. A lot of sportsbooks are throwing money at ads, so the ads are getting thrown up to people, and it's like, how do you say, this ad is not healthy for me?
Right? Like, that is the problem I have: there's no way to report an ad as, oh, this is actually bad for me, or, I'm not allowed to do this by law, some judge has made a ruling, kind of thing. So... Yeah, sorry for interrupting: sometimes, when you click off an ad, they'll ask what you don't like about this ad, or why you're not interested, so you can put that information there, and it will get back to the company, whether it's Google or whoever is popping that ad up on whatever device or browser you're using. It's only Twitter. Let's face it, the only site I really use is Twitter. But that is a really good point. I mean, what happens when those ads come up for somebody who's fighting an addiction? That's harmful. Yeah, it's not easy coping with that. I don't have a good answer for that. Yeah, no, I mean, maybe it has been defined, but it's really a question of whose responsibility this is, right? Is it the person serving the ad? Is it the infrastructure? I don't know if we have an answer for that. Maybe we do; I'd love to hear a comment in that regard. But to me, that strikes me as an incomplete solution to the advertising problem. They aren't considering the new flow of data as you interact with the system. Sorry, Audrey. Yeah, I was just going to mention, you have to remember too, with a lot of these companies, the reason some of their ads float to the top is that they've actually paid for that content to float to the top. So whether it's a social media company or just a web page, remember, there is payment for certain vendors to offer their services or their products. So there is a balance that the company is striking, and sometimes it's the bottom line, right?
No, I get it, trust me, I completely get it. I know that this wonderful service I use routinely, I don't pay money for, so it has to make money somehow. I get that. But there has to be a better way, right? That's what I'm trying to say. If someone from Twitter is out there watching, feel free to DM me. They're just gonna block you, that's what they're gonna do. Yeah, and you know what, that's fine. I have a website. I can be de-platformed. I don't care, it's fine. It's just one of those things where I wish there were a better way for the, you know, ad-deterministic things to happen. Like, oh, he's literally blocked every single sportsbook account on Twitter; we probably shouldn't show him any more sportsbook stuff. Because they pop up every day. It's just a particular pet peeve of mine. And so I think that goes back to this need for MLOps engineers, people who understand what the machine learning algorithm can do and what other functionality we need to bring in from other aspects of the system. You're essentially talking about encoding a rule, right? Do not show Chris X. And so then it's filtering those out of the recommendations, and going back to the stakeholders and thinking, how will we know if we've done a good job? I think everyone will know they've done a good job when Chris is happy. Yeah, but how do they know I'm happy, though? Yeah, let them know what types of ads you would prefer to see. So I have gone through that data, and yeah, I've cleaned up some of it, because it was just way off. Like, show me ads about the San Francisco 49ers: I do not care about the San Francisco 49ers. No offense to anybody who's a 49ers fan; I'm not. So it's weird what it picks up on. Like, it thinks I'm part of the Green Party. Not the Green Party in the UK;
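That "encode a rule: do not show Chris X" idea can be sketched as a hard post-filter applied after the model ranks its ads. This is not anything Twitter or any real ad system was said to do on the stream; the category names, blocklist shape, and user name are all hypothetical, just to show where such a rule would sit relative to the model.

```python
# Sketch of a hard business rule applied as a post-filter on model output:
# if a user has blocked or opted out of a category, never serve it,
# no matter how highly the recommendation model scored it.
# All names and categories here are invented for illustration.

ranked_ads = [
    ("sportsbook-promo", "gambling"),   # model's top-scored ad
    ("casino-signup", "gambling"),
    ("trail-shoes", "sports-gear"),
]

user_blocked_categories = {"chris": {"gambling"}}

def filter_recommendations(user, ranked):
    blocked = user_blocked_categories.get(user, set())
    # Preserve the model's ranking, drop anything in a blocked category.
    return [ad for ad, category in ranked if category not in blocked]

print(filter_recommendations("chris", ranked_ads))  # only trail-shoes survives
```

The design choice worth noting is that the rule lives outside the model: you don't have to retrain anything to honor a block, which is exactly the kind of system-level glue the MLOps point above is about.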
I'm in the US, right? It's really weird. But yeah, so yes, there are things I can do, but when companies target specific demographics, that's where I find the problem. Is that the right way to do things? I don't know. For them, so far, yes, and there is a money tree to it, but there has to be a way to give feedback to that model to say, not Chris, or not people like Chris, kind of thing. Right, and again, you're going into that whole topic where Carl is right: we could spend a whole couple of hours discussing the ethics involved, the money flow, and how certain demographics may or may not be targeted. Yeah, and that's the thing with AI: you have to use it ethically. Exactly. All right, any parting thoughts before we sign off here? Well, thank you all for joining, and thank you all for watching out there. Coming up next on the channel, in an hour, we're going to be sitting down with some of our managed service folks to talk about managed cloud service offerings in the cloud, so that should be a nice little conversation, here at 11 a.m. Eastern, 1500 UTC. So feel free to join in, and you can catch this crowd again in a month. And in two weeks we'll have the data services office hour, which will feature our data storage folks. So stay tuned, folks. It's going to be a fun ride. Stay safe out there, everybody.