And we're live. Hello, everybody. Welcome to our live stream today. This is Monika Wahi, and we're here for our data science chat live stream. It's June 4th, and it's at a different time than it normally is, 11am, because we have a guest here, which is Joe. Joe, you're here, right? That's right. Okay, I can't hear you that well. Can you turn it up a little bit? Maybe I can turn you up. Looks like everything's okay. Well, I turned you up on my end. So, yeah, I can hear you. You're just a little quieter than I am. And actually, you're more important than me, because I'm really happy that you're here. So, I put the chat overlay up. When I'm talking, maybe Joe can tell me some of the chat, and vice versa. But what I also want to do is share my screen here and show the slide presentation that we're presenting today. So Joe, you can see that, right? I can. Okay. So welcome there. I'm sorry, I think maybe you're in front of your name there, but let's see here. So the title for today's presentation is Pareto charts in healthcare and the automotive industry. And then let me advance the slide here. It's going to be like where you have to say next slide. All right. So the reason why we're doing this presentation today is that it came out of a conversation Joe and I were having, because I met Joe because I do healthcare statistics and he doesn't. He does statistics in a totally different domain, in automotive and warranty. And also, you'll see on the next slide, what is it? FMEA is what? Failure mode and effects analysis. Right. So I don't even know what that is. So this is amazing, right? So I used to teach a course in statistics to undergraduates in nursing. And some of you have watched my YouTube lectures, which is what I made for them.
And you'll find that in one of them I mention the Pareto chart, because it was in the text. And I was like, this stupid chart, who uses this? So I didn't think it was very useful in healthcare. But when I talked to Joe, he's like, oh, no, you have to use Pareto charts in my industry. You can't figure out what's going on without them. Like, they're so helpful. And I was like, okay, I got to hear this one. So that's what we're going to go over. Well, first of all, what it is, then uses in healthcare, which as you can guess is going to be really short because there aren't that many. And then most of this is going to be Joe talking about how Pareto charts are really used, basically, like a real use case. So first, I'm going to just get myself out of the way here. Well, I'm going to introduce Joe; let me do a better job here. So like I was saying, he's a warranty engineer in the automotive industry, and does quality control and analytics. It says SPC and DOEs. What's that, Joe? Statistical process control and design of experiments. So what this is teaching me is that you can do statistics your whole life and not know what you're doing, because I've never heard of that. Well, that's not fair. Well, I mean, you know, like, I don't know how to do that, right? So I can still learn. And I'm sure I do some stuff that's pretty normal for me that Joe probably never does, you know, because we do different things with healthcare data, even though it's just statistics. Which is kind of the big picture point of this, but we're focusing in a narrow way on Pareto charts as an example. And Joe is also an ASQ professional, and that becomes important on my next slide. So I'm going to, you know, just do my little part, which is Pareto charts and healthcare. But first, let's talk about what a Pareto chart is. And let me turn off the chat overlay for a second so I really can see this. So Brase and Brase was a textbook I was using.
And they defined it as a bar graph in which the bars are arranged in order of decreasing frequencies. Okay, so I added the frequency. Because if you ever take an epidemiology class, the first thing you learn in epi 101 is frequencies are bad. You want rates. Okay. Because if you don't have rates, you can't compare between populations. So if I've got two statistics classes, and one has five people out with COVID-19 and the other has seven people out, those frequencies don't mean anything unless I know the total number of people in each class, right? And so this is beaten into us, and so we're sort of told, don't do stuff with frequencies. So when I encountered the Pareto chart in our statistics textbook, which is not aimed at nurses, it's just general, it's a good textbook and I would recommend using it. But the problem is, I was just immediately like, how do you do this in healthcare? Do you? So when I would teach a class, and if you watch the video, I'll say like, we don't really do this. But then I actually met some people who did do it in healthcare. Now, could you go back to that slide for just a second? Oh, sure, sure. Yeah, yeah, go ahead. One comment: I typically don't worry about frequency. Some of the older software that I used, that was provided for me to use, did incorporate that. But the problem with it is that you could have a failure mode that was so vast that it became pointless keeping track of the percentages, you know, because it would change from month to month. But are you trying to say, let me see if I can rephrase this and tell me if I'm right. Are you trying to say that the denominator, like, you could have made a rate, but the denominator was huge and changed little from month to month percentage-wise, because it was huge to begin with, so that it almost didn't even matter? Exactly. Because I would only be worried about maybe the five worst things. Right.
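To make the frequency-versus-rate point concrete, here is a minimal sketch in Python. The counts of five and seven people out come from the example above; the class sizes (20 and 70) are purely hypothetical assumptions added for illustration, since the discussion never states them.

```python
def rate_per_100(cases, population):
    """Return cases per 100 people in the population."""
    return 100.0 * cases / population

# Counts from the discussion: five out in one class, seven in the other.
# Class sizes are hypothetical; the talk does not give them.
class_a = {"out_sick": 5, "enrolled": 20}
class_b = {"out_sick": 7, "enrolled": 70}

rate_a = rate_per_100(class_a["out_sick"], class_a["enrolled"])
rate_b = rate_per_100(class_b["out_sick"], class_b["enrolled"])

# By raw frequency class B looks worse (7 > 5), but by rate class A is worse.
print(f"Class A: {rate_a:.1f} per 100, Class B: {rate_b:.1f} per 100")
```

Under these assumed class sizes, the class with the smaller count has the higher rate, which is why the raw frequencies alone cannot be compared between populations.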
And so what you're talking about is what we see in epidemiology when we're talking about things like, if you have the rate of car accidents somewhere, or the rate of deaths from car accidents or something. Well, when I worked at the Army, actually, you know, that's maybe a weird connection we have, because you were in the military, right, and I worked in the Army. One of the things I looked at was rates of ankle injury; we even have an article on it. Actually, I remember rhabdomyolysis; that wasn't that common. That was per 100,000, so we'd say eight per 100,000. So in this year it was eight per 100,000 and that year it was 10 per 100,000. And so we were kind of like you; we could have just said 10 or whatever, you know, because it didn't really matter. Right. Because we were using 100,000. But what you're saying is slightly different; it's like, every year it would be something like 100,000, so it didn't really matter what it was. Right. Right. Exactly. So that sort of gets to the first domain difference that we're talking about, you know, which is that in epidemiology, as you can see on my chart, sometimes it's 100,000, right? But when I was doing knee injury, knee injuries are unfortunately very common in the Army because you're running around and stuff, I would use a different denominator. So this is the issue, right, that our data are different. Like, mine are less predictable. So the first time I ever encountered a Pareto chart in healthcare was when some customers came to me and they were crying. I mean, not literally crying, but they were miserable, because they had implemented the PDSA model, which I put on the slide, which is derived from engineering quality control principles, which is probably what you do. And they completely adopted it and their healthcare was not getting better.
It was not improving in quality, and they had invested a lot of resources. So it was really upsetting to them that they had done all this and adopted this whole model and nothing was getting better. I said, well, show me what you're doing. And one of the things they showed me was a Pareto chart. So they were keeping track of incidents like falls and needle sticks. These are bad things that happen in healthcare. And they were making these frequencies. But the thing is, think about a needle stick. And just to be clear, so everybody knows what I'm talking about, a needle stick is when you're a healthcare provider and you're trying to inject someone else or use a needle on someone else and you accidentally stick yourself, okay? Which can be really bad if, you know, you get something on you from that person and then you get infected and all that, right? Well, think about it. We just had COVID-19. We just had big vaccination drives and stuff. Our denominators are always changing for things like this. Like, how many injections are you giving? Well, you're gonna increase the numerator of needle sticks. So this is part of why that customer that came to me (I gave them a lot of services) was not succeeding at improving their healthcare by using the PDSA, because the PDSA is based on what Joe does, which is really different. And now let's hear what Joe does, okay? So I'm gonna go to the next slide and stop talking, and we're gonna hear about how to really use Pareto charts. So what's really interesting, and in fact, this is where Monika and I really do the same thing, is that I worked on the warranty side. There's the quality side, which is just gathering data. And, you know, the only thing you're trying to detect with data is whether it's useful data, legitimate data, and you know that because it complies with the central limit theorem, the bell curve, right?
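The changing-denominator problem with needle sticks can be sketched as follows. All monthly numbers below are invented for illustration, and the choice of "per 1,000 injections" as the rate is an assumption, not something stated in the discussion.

```python
# Hypothetical monthly data: needle sticks and injections given.
# During a vaccination drive the number of injections rises sharply.
months = ["Month 1", "Month 2", "Month 3"]
needle_sticks = [4, 8, 12]
injections = [2000, 4000, 6000]

# Rate per 1,000 injections (an assumed choice of denominator).
rates = [1000.0 * sticks / given for sticks, given in zip(needle_sticks, injections)]

for month, sticks, rate in zip(months, needle_sticks, rates):
    print(f"{month}: {sticks} needle sticks, {rate:.1f} per 1,000 injections")
```

In this made-up series the raw count triples, so a frequency-based Pareto bar for needle sticks would grow, while the rate per 1,000 injections stays flat. A chart of counts alone would suggest quality is getting worse when the underlying risk has not changed.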
Well, warranty data is a lot like healthcare data. It's random because it just is, but it's not necessarily unexpected either, right? And a lot of the issues I dealt with in automotive involve subjective data rather than objective data. And I think, again, that's where Monika used to provide surveys to people who would answer questions and, you know, there were maybe levels of answers. Sometimes it's a yes or a no. But there's also, like, pain, you know, people talking about their pain; that's subjective data. And the only thing you can do is, well, prescribe more painkillers, right? If somebody's in pain. Welcome to the opioid crisis, for God's sake. That's right, that's right. And so, again, that's where, you know, she and I kind of cross paths, so to speak. So the reason why I got so interested in warranty is an article I read in Wired Magazine many years ago, about 10 years ago, and in here is a failure curve. And you can see that's a nice little Pareto chart. And the reason why it's, well, linear, so to speak, is because they've changed the fatigue cycles into a logarithmic value. So you get distinct columns. What's a fatigue cycle? So I bend a piece of wire a million times, or. Like a hanger? Like I'm trying to bend a hanger, straighten it out. Exactly, exactly. So this was probably, maybe, on a piston rod or some kind of metal where you would see a regular, minor load, even high frequencies or something. There's a lot of frequencies that travel in a vehicle that cause fatigue, cause noise. And they didn't provide exactly what they're looking at here, but there's just all kinds of opportunity. It's like metal fatigue. It's like the metal is getting worn out, from vibrations or being bent or something. Yeah, and I give the example, everybody's taken a copper wire and bent it a few times and it breaks, right? That's fatigue. Okay. So, I thought, well, that's pretty interesting. And I liked it because of the way the data is displayed.
It's very intuitive. So what is this? I see the red one. I see cycles on the X axis and failures on the Y. And is that the frequency of failures? And this is where I'm saying, you could plot it that way for sure, but it would be kind of confusing, because you started out with the definition of Pareto: it's decreasing. It could be increasing too, but I tend to line it up, and you'll see in the next slides where obviously that's kind of the way you want to do something anyways, just to give it a little bit more information. But this is... Yeah, to put it in some order, either increasing or decreasing, right? Right. But it says very, very distinctly in here that this is a Weibull distribution. And that's a whole... Yeah, you're right, that is a Weibull distribution. I was like, look at the little bump at the end. That's what this is, you know? And the one thing I learned: everybody wants to know when something's gonna stop, right? We wanted to know when COVID was gonna stop. And the CDC is constantly giving vague answers, and I've been like, I understand that. They don't have enough information to figure out when it's gonna stop. They know what's gonna help speed it to an end, but they really don't know when it's gonna stop, right? That's a very complex function, unfortunately. And the way I learned about doing root cause analysis with Paretos is what we'll talk about in the next slide. And then below are the yellow slides, or graphs. I started looking at this site called Warranty Week, and they provide a lot of free data to look at. And what's interesting in the automotive industry is that they manage warranty costs by accruing a certain percentage every year. So let's say your quality of performance is such and such a thing; your company will pay a liability of some percentage of that cost, right? Because the more of it that's your fault, the more you're gonna pay, basically, is how it's solved. So...
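For readers unfamiliar with the Weibull distribution mentioned for the fatigue-failure curve, here is a minimal sketch of its cumulative distribution function evaluated at log-spaced cycle counts, similar to the logarithmic cycle axis described above. The shape and scale values are hypothetical; the article's actual parameters are not given in the discussion.

```python
import math

def weibull_cdf(cycles, shape, scale):
    """Fraction of parts expected to have failed by `cycles` under a Weibull model."""
    return 1.0 - math.exp(-((cycles / scale) ** shape))

# Hypothetical parameters chosen only for illustration.
SHAPE = 2.0          # shape > 1 means the failure rate rises as fatigue accumulates
SCALE = 1_000_000.0  # characteristic life in cycles (about 63.2% failed by this point)

# Log-spaced cycle counts: 10^3, 10^4, ..., 10^7.
cycle_counts = [10.0 ** exponent for exponent in range(3, 8)]
failed_fraction = [weibull_cdf(c, SHAPE, SCALE) for c in cycle_counts]

for cycles, fraction in zip(cycle_counts, failed_fraction):
    print(f"{cycles:>12,.0f} cycles: {fraction:.4f} failed")
```

Binning on the logarithm of cycles is what turns a long-tailed fatigue curve into a small number of distinct columns.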
So as an automotive company, you wanna reduce your liability so you end up paying a lower proportion. Right, and that's, some companies will say, no, you're always gonna pay this percentage. It doesn't really matter. Your fault, our fault doesn't matter. And any of those financially can be worked into like basically a selling price for the replacement part. And that's either it'll be cheaper or more expensive, whatever. And that's the whole purpose of warranty in automotive more or less. And of course, safety. So that's where that first one's very important, right? If I know the durability of something. Let's look, I'm gonna try and see if I can make the slide a little bigger so we can see these yellow ones you have. I'm wondering if that worked. Let's see here. Oh, of course I'm in the way here. I don't know how to move these around. Maybe I can move this around. Oops. No. What just happened here? Let me see. I bet if you open, just open up the PDF and edit it, you could, what would it be? Well, I'm trying to like... Yeah, put it in. I'm trying to like make it be, obviously I don't know what I'm doing here. Getting pretty close though. Let's see here. I wonder what happens. See, I'm afraid to toggle you off because I'm afraid that'll turn you off. But let's see here. Let me, I wish I was better with the software. See here, this one doesn't do it. That one kinda does it. Let me see if I can make this bigger. Just a second here. Okay, so now we can see better what those images are on the slide. Across the bot, on the left, the left one says Microsoft Corporation and the right one says Ford Motor Company. Yes. And both of them across the X-axis have the years, like 2003 to, the Microsoft one ends at 2011 and the Ford Motor Company one on the right ends at 2010. 
And along that Y-axis, it says percent of sales, but it's a little confusing, because the Microsoft Y-axis ends at 20% at the top, and the red line is the accrual rate and the black line is the claims rate. And you can kinda see them zigzagging around between 2007 and 2011 and then finally kinda calming down. Whereas the Ford one, that Y-axis ends only at 3% of sales at the top. And between 2003 and 2010, you see the accrual rate is sort of parallel to the claims rate, but lower. And so what is the point you were trying to make about those? This is good information, in that what this shows is that the accrual rate, or the liability, is decreasing, basically. So as the claims go down, their accrual rate goes down, and so they pass that of course along to the suppliers, or internally, or wherever those claims are coming from, right? Could be internal. So they're passing it along. Let me make sure I got this right. The red line is what they're passing along and the black line is what they have to pay. Yes. Is that it? And so they want the red line to be above the black line, because they want other people to pay for them, but it probably won't be, right? And again, that's a case of what a company is willing to assume for financial responsibility, right? They may take more financial responsibility if it's their design, for instance. I see, so it just might be a choice of a company, of their reputation, of how high quality they are, like Bose speakers versus regular old speakers or something like that. All right, okay, so let's go to the next slide here. Let's see if I can unscrew up my screen now. I don't even know what I did wrong. Let's see here. Back to 100% here, okay, next slide. All right, let's see. That's not really the best display here. Well, let's try this display. Well, it'd probably be easier. Can you go back to the full screen? It'll just be easier. I don't know how to do it.
I think. Let's see here. Is there a way to make this the full screen? No. Definitely UI issues here. I don't know how to bring it back to the default. Let's see here. I'm sorry for all this. It's okay. I just wish I could make this, like, I'm afraid I can't make us go away. Let's see, it says drag to move. Oh, look at that. Amazing. Perfect, okay. Okay, go ahead, sorry about that. Okay, so I use an example that's just kind of fun, because often, when I'm looking for data, I really don't know what kind of questions I need to start with. And this example is, I walk into a restaurant where the owner's confused about what to do: what soup should he make more of and what should he get rid of, that sort of question. It's just fundamental. And so the question is how to sell the right soup. So I made up some data. No big deal. And I pretty much stuck to the premise of what a Pareto is supposed to represent. Okay, so let's look at your first data. So the soup, this one here, right? So you say to the soup vendor, show me your sales for, we'll pretend July kind of happened recently, right? And you don't really care how many sales they did in July all told or anything like that. You just care about how many sales of soup; even if they sell other things or whatever, you just care about their soups. And then you just count it up. Now, let's say that most restaurants you go to, they offer you two sizes, a cup and a bowl. Would you just throw them all together or would you just choose, oh, I see, 16 ounce servings. So you'd pick a serving, right? Well, and if you notice up on the sign, don't move it around or anything, but there's like a 32 ounce one and a 16 ounce one. So you just lump it all into 16 ounce servings, right? Oh, I see. Because it doesn't matter.
So in other words, let's say that you sold it in eight ounces; if you had two transactions for eight ounces and one transaction for 16 ounces, you would just consider that 32 ounces. Yeah. You just mush it all together. So you don't care about transactions, you just care about the volume of soup you sold. Right, right, because that sort of cleans the slate, so to speak. Because that's very businessy, because the person buys volumes of inputs, right? Like they buy mushrooms in bulk and they buy, you know, sage or chicken or whatever. So they may price it differently if they give you the bowl or whatever, but they just care about the amount; you know, the soup is the good they make. So they just care about how much they're moving. So more than likely, the only place I'm gonna get this information, I'm not gonna get this from the cook, I'm not gonna get this from the cashier, I'm gonna get this from the accountant. And to the accountant, I'll say, hey, just pull up all the data from accounts receivable. And the other thing that's nice, and hopefully they're doing, right? You know, we'll talk about that in a second, but let's kind of focus back on this. So you can see, you know, a trend here; it looks like you can sell any kind of soup, right? I mean, that's all it tells me. Yeah, but even just from my point of view, if you're in the business of trying to sell soup, right? And this is the soup you moved in July. So it kind of represents the amount of soup you sell, like the amount of soup that's comfortable to sell or whatever. You would rather only sell maybe five kinds than all these, especially if you're hardly moving the clam bisque and the turkey, you know. Oh, you have two clam bisques. Well, we have that problem in healthcare too. Well, actually, one is crab and one is clam. Oh, I'm sorry, crab bisque and clam. So there, it's a great example.
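The step described above, lumping every sale into 16-ounce servings and then ordering the soups from largest to smallest for a Pareto chart, can be sketched like this. The transaction list is made up for illustration, and the function names are hypothetical; only the 16-ounce serving size and the soup names come from the discussion.

```python
from collections import defaultdict

SERVING_OZ = 16  # everything is lumped into 16-ounce servings

# Hypothetical transactions: (soup, ounces sold). Not real data.
transactions = [
    ("tomato rice", 8),
    ("tomato rice", 8),
    ("tomato rice", 16),
    ("black bean", 32),
    ("black bean", 16),
    ("crab bisque", 16),
]

def servings_by_soup(sales):
    """Ignore transaction counts; total the ounces and convert to servings."""
    ounces = defaultdict(float)
    for soup, oz in sales:
        ounces[soup] += oz
    return {soup: total / SERVING_OZ for soup, total in ounces.items()}

def pareto_order(totals):
    """Sort categories by decreasing size, the defining feature of a Pareto chart."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

servings = servings_by_soup(transactions)
ranked = pareto_order(servings)
for soup, count in ranked:
    print(f"{soup}: {count:.1f} servings")
```

Here the two 8-ounce sales and one 16-ounce sale of tomato rice add up to 32 ounces, or two servings, matching the worked example in the conversation.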
Like, it's a pain to get good crab, it's a pain to get good clam. Maybe you wanna dump those. I mean, that's the way I'm looking at it. Am I looking at it wrong? No, but you know, the other thing, just because I know about soup, is that you really have families up here. The turkey chili kind of stands by itself, and that might even be a seasonal offering, because turkey might get really cheap in November. I see. And really, you know, the best crab and clam available is probably gonna be in the fall. So even though your Pareto chart for this hypothetical soup place might look very similar every month, there would be sales and seasonal changes that might make, like for instance, this turkey chili, if it's special in November, end up being more where the tomato rice is right now. Exactly, exactly. And so, you know, after I just created this, I'm like, well, you know, really, if you make tomato rice, you're kind of making the base for bisque. I see. And so there's a certain amount of optimization in this menu already. Right. You know, the only one that kind of stands by itself, even mushroom barley and chicken barley and mulligatawny have very similar bases. Black bean, probably, they are all basically the same thing except whether you use vegetable broth or beef broth. You know, you're inadvertently explaining why I, as a vegetarian, almost never could buy soup anywhere. And I'm glad you brought that up. Because, you know, there's gonna be people who say, hey, I can't have that because, you know, it doesn't have... Well, it's the base, right? So if there's a place that only uses a vegetable base, just look at all these soups on the slide that suddenly open up to me, you know, like tomato rice. But if that's a chicken base and you're vegetarian, like I am, you gotta say no to that. But yeah, no, I see there's a certain optimization behind this menu.
Okay, well, let's go down and then, oops, I'm trying to move over to the one on the left here. Okay, so you see the same soups on this one, but on the y-axis, you've got these... You don't have a label on the y-axis. But I'm... You got cut off, but yeah. Oh, I'm sorry. What was the label on the y-axis? Servings, it's just servings. Servings, so is it the 16 ounce serving? So, you know, you saw 1600 on the first column on the other chart. This is 1600 servings. Okay, so you're using the same thing. Yeah. Okay. Also, what I added was scrap. Oh, I see. So these blue bars are the same blue bars we just saw. It's just that what is added is this orange bar, right? Right. And that's scrap. So what is scrap? Scrap is something that, yeah, let's say it's driven by the fact that they can't store it. So scrap is they throw it away, or what is it? They could throw it away or maybe they donate it. Okay, so for example, it's like Panera, right? And it's the end of the day, and everybody bought all of their black bean and all of their mulligatawny, so they have nothing left, right? Right. The crab bisque and the clam bisque go bad right away. So they just throw it away. They throw away the orange part. But the chicken broccoli and the mushroom barley, they can store, and the homeless shelter picks it up. And so that's still scrap, but it's going to the homeless shelter. Yeah, and maybe they take a tax write-off for it. Who knows, but. Okay, I see. So, okay, so that's what the scrap is. So I guess, is the goal to have no scrap? Well, if you're trying to sell the right soup, the answer would be yes, right? If they're donating soup, and it may not even be literally donated, it may be like people want to try a sample or whatever, and say, I don't really like this. So that becomes... Oh, I see, scrap can be a very useful thing, so maybe it's not bad. Exactly.
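One way to read the blue (sold) and orange (scrap) bars together is to compute, for each soup, what fraction of what was made ended up as scrap. This is a minimal sketch with invented numbers; the actual values on the slide are not given in the discussion, and treating "made" as sold plus scrap is an assumption.

```python
# Hypothetical servings sold and scrapped per soup (not the slide's data).
soups = {
    "mushroom barley": {"sold": 1200, "scrap": 400},
    "black bean": {"sold": 900, "scrap": 0},
    "crab bisque": {"sold": 300, "scrap": 200},
}

def scrap_fraction(sold, scrap):
    """Share of servings made (assumed to be sold + scrap) that was scrapped."""
    made = sold + scrap
    return scrap / made if made else 0.0

fractions = {
    name: scrap_fraction(row["sold"], row["scrap"]) for name, row in soups.items()
}

# Rank by scrap fraction, worst first, Pareto style.
worst_first = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
for name, fraction in worst_first:
    print(f"{name}: {fraction:.0%} of servings made were scrapped")
```

In this made-up data, mushroom barley has the most scrap in absolute servings but crab bisque has the highest scrap fraction, so the two views can point at different soups.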
But it'll also tell you, it could tell you that you're making too much, right? So yeah, you should keep selling that, just don't make as much. Yeah, because when I look at mushroom barley, I go, okay, everybody loves that, everybody's buying that, but these people are making a bad prediction. You know, it's interesting, there's a small restaurant near me that's really good. Now, what are they? They're called the Cornish pasties place, nearby. And that guy who runs it, they're just so nice. I'll see him on his Excel, right? Like just like you, Joe, on your Excel. And he was trying to figure out how many pasties of different types to make so they didn't have any scrap, basically. Yeah. Yeah, and again, this is a useful way to use a Pareto for anything, right? I wouldn't, I would be less likely to use it for survey-based subjective responses. You know what I mean? Well, tell me, now I've advanced the slide here. Help me use, this is the scrap one. Help me use this to do something, right? To tell the soup restaurant person what to do. Oh, let me see. Can you see it? Do you want me to make it? Yeah, no, I can see it. I'm just trying to... Or seeing what you wrote for. Yeah, yeah, exactly. Oh, okay, well, I'll read what's on the slide. Like include additional information. Include all information that could be useful and note to include scrap, okay? So this is when you make your Pareto chart. So if you've got the same problem Joe had where he went into a restaurant and they said, what is the right soup to sell? What do I sell? You know, like you go, you're in any retail situation, they're saying, what do I sell? You can start by making a chart like this, choosing what you're charting, just making a chart like this and including all the information that could be useful, but also making sure you get scrap. And I'm gonna take a detour for a second. 
You said, Joe, that theoretically, if this was really happening, with the restaurant owner, like my Cornish pasty restaurant, if you were working for them, you'd have to probably go to their accountant to find how much mushrooms and what they're buying. Who do you go to to find out how much scrap they have? Well, ideally it would be the accountant, because if the... They should be telling the accountant how much they're moving, either throwing out or moving to the homeless shelter or using as... They should, yeah, absolutely. Theoretically. Let's say it's very expensive stuff, right? They're gonna wanna know how much expensive stuff you're scrapping. Yeah, I suppose like purses, like Gucci purses, like expensive, yeah, they would wanna know what's damaged and what you're putting out, okay. And then, okay, so then the second point you write is, it's gonna be hard for first engagements to discuss data. In the case of soup, there's probably a seasonal spike due to availability of ingredients. I'm sure there's seasonal spikes for a lot of things; we have that in healthcare. One month isn't going to tell very much, so squeeze whatever information you can. So then, I guess my bigger question is this: my inclination as an epidemiologist, if I had the soup problem, right, and I'm encountering the soup restaurant owner in August, would be to get data all the way back to the beginning of the year and to look at January, February, March, or even get a whole year of data and look and see what might be seasonal going on. Would you do that? Or would you squeeze more information? I suppose it's not illegal. I mean, we're talking about Pareto charts, but it's not illegal to break out and make some time series graphs, right? Like maybe just mushroom barley or something that you want to look at. Okay, so basically we're in data science here, right? We would do different things with the data.
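The idea of pulling a full year of data to look for seasonality before trusting a single month's Pareto chart could be sketched as below. The monthly servings for turkey chili are entirely hypothetical, and the "1.5 times the yearly average" cutoff is an arbitrary assumption chosen for illustration, not a rule from the discussion.

```python
# Hypothetical monthly servings of turkey chili over one year (not real data).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
turkey_chili = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 400, 100]

SPIKE_FACTOR = 1.5  # arbitrary assumed cutoff for calling a month a seasonal spike

average = sum(turkey_chili) / len(turkey_chili)
spike_months = [
    month
    for month, servings in zip(MONTHS, turkey_chili)
    if servings > SPIKE_FACTOR * average
]

print(f"Average servings per month: {average:.0f}")
print(f"Months flagged as seasonal spikes: {spike_months}")
```

A July-only chart built from this made-up series would show turkey chili as a small bar and would miss the November spike entirely, which is the risk of drawing conclusions from one month.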
Okay, and then your last point is break-even cost. Is it amortized, and the profit is driven down by cost losers? For example, I think you mean crab bisque, not crap bisque, right? You're Freudian. That's what we would have called it in a college dormitory cafeteria. That's what I was saying. Oh, it's a crab bisque. Your college days are showing. Okay, so when we're done, that says crab bisque. You're killing me, man. Crab bisque is very time consuming to make, but chicken broccoli is easy and the ingredients are inexpensive, which is sort of like back when I was talking about the incident reports. Falls in healthcare are way more important than a lot of other things we keep track of, like missed appointments, right? Because in a fall, you could break your hip, and then you're disabled and your life totally changes. So what exactly did you mean by break-even cost? Is it amortized, and the profit is driven down by cost losers? In other words, what is the gist behind the third suggestion? How would I look at this graph and take your third suggestion? So for instance, you would wanna take a look at your accounts payable: how much are we paying for all the stuff that we buy? Crab is very expensive, clams are very expensive. So what's missing is what is expensive, what is more expensive to make, basically? Yeah. Okay. But you're willing to make those things because, as I said, this is where the families of soup come in; basically soup starts out the same, and then it starts to blossom in different directions. So like tomato rice, you just take the rice out and then you make bisque, sort of, right? Yeah, exactly. So you gotta think about that. You can't just go, oh, I guess we'll get rid of black bean, it's not very big. Maybe that little bit of black bean is the easiest thing you can make and the cheapest thing you can make. And it's an easy win.
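The point that a small bar on the Pareto chart can still be an easy win once cost is considered can be sketched with a simple per-soup contribution calculation. Every price, cost, and serving count below is invented for illustration; the discussion only says that crab and clams are expensive and that black bean may be cheap and easy.

```python
# Hypothetical per-serving price and ingredient cost, plus servings sold.
menu = {
    "crab bisque": {"price": 6.0, "cost": 5.5, "servings": 300},
    "black bean": {"price": 4.0, "cost": 1.0, "servings": 200},
}

def contribution(price, cost, servings):
    """Money left after ingredient cost: (price - cost) per serving times servings."""
    return (price - cost) * servings

totals = {
    soup: contribution(row["price"], row["cost"], row["servings"])
    for soup, row in menu.items()
}

for soup, amount in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{soup}: {menu[soup]['servings']} servings, contribution {amount:.2f}")
```

With these assumed numbers, black bean sells fewer servings than crab bisque yet contributes four times as much, so dropping it because its bar is short would be the wrong call. Labor time and storage limits are not modeled here.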
Whereas the same amount of crab bisque might be more expensive to make or harder to make or have more caveats, like you can't store it. Yep. Yeah, you know, because it's got cream in it, typically, right? Yeah, and also just crab. Like, now I'm turning into an epidemiologist again, but you know, when I look at this list, to me, the easiest to store are gonna be like mushroom barley, French onion, tomato rice, black bean, split pea. The ones with actual chicken, even if they're using chicken base or beef broth base, because of actual chunks of chicken in chicken broccoli, or like in crab bisque. I'm not sure what's in mulligatawny, because it's usually got some sort of meat base and so I've never eaten it. But when you have clam, you know, turkey, crab and clam, I worry about storage, you know. And also this is July, I see July on the slide; you know, it's harder to keep things stored right in restaurants in the summer. I'm totally being an epidemiologist about food safety and all that here. Yeah, and health code violations, right? Yeah, health code. Well, where did the health code come from? Well, because you get sick. You know, when you learn about the different kinds of things you can get from the different mistakes you can make in restauranting, you start to become vegetarian very quickly, because there's about two of them that vegetarians can get. Yeah, you can get us sick too. But there's a whole lot of other ones you can get from seafood and different kinds of meat. So, yeah, in other words, you're saying that if you were using this to advise the soup restaurant guy, I think I hear what you're saying. It's like, maybe you would give him an infographic with this on it, and also information about what the families of the soups are. You know, you could color code that.
Those of you who are listening, if you take my data curation class, it helps you do that. And you could, you might even have a way of indicating which ones are closer to the break-even cost and which ones are being expensive on you for various reasons. I see what you mean by trying to include all information, not necessarily on the graph, but just nearby, like maybe in an infographic or something so that they can make decisions, look at this graph and have other information and make decisions. Exactly. Yeah, you know, and it stimulates conversation too, right? Like, maybe the accountant didn't realize that those things are families. But, you know, maybe I didn't either. Yeah, well, that's why my data curation course is so helpful because I found that actually, like when I'm with some people and we're trying to do something, sometimes I just start making diagrams with them. I'm like, okay, I'm getting confused. Is this in the family of soups? Like I'll make a diagram of the family of soups, right? It's also kind of funny. You get to use little emojis and things. But, you know, if you're creative, but I will do that on a slide, like a PowerPoint slide. You know, I'll start just doing that in our meeting because we're so confused. And then I start saving it. And pretty soon it turns into like a real image that we're using to make decisions. Like this actually happened kind of recently where I was helping this person study some learners who are taking different courses at a language institute. They actually had to make a diagram of the different groups of courses there were. So that would be maybe something you could make as the families of soups and all that. But I guess getting back to the Pareto-ness of it. One thing that you're teaching me in this discussion is that in healthcare, we don't have break-even costs. Like we don't even have anything close to it. And so I'm gonna tell you what I'm talking about exactly. I think I know. Well, I have a paper. 
I don't know if you've read it, Joe, but it's about error in data entry, okay? Yeah, yeah. Error in data entry. I mean, let me tell you, I see the same thing. And that's one of the first rules of warranty: is this data even usable, right? Can I use it? Right, but let's just think about it. I just said error rates in data entry. And one of the things you and I are arguing for the rest of our life is that these rates don't change much. They're sort of stable. If you're, especially if your system is stable. Well, what I found is that rates of error in data entry are pretty low, actually. Yeah, they are. I would agree. And we, in healthcare then, so okay. So rates of data entry in healthcare, or at least in the study I did. The study I did pitted an old version of TeleForm software. I say old, this was a long time ago, right? I'm sure it's better now. That was doing like optical recognition, versus Jonathan, my assistant who was a gamer. He was really good at data entry, right? So he was single entry and he had a very low error rate. And the point of the paper was to say, single entry does have a low error rate. I thought he'd have a lower error rate than he did. The thing is the end of the paper says, you're gonna have an error rate for data entry in your research study, just set it. Just set it. And if you have people who, and test people, and if they're above it, just don't have them do your data entry. Don't do the alternative, which is dual entry, where you have two people enter the same stuff. You can imagine, Joe, doing this. I mean, two people enter the same things into two different spreadsheets and doing a PROC COMPARE on it. Like you'll spend the rest of your life doing that. So this was to get out of having to do that, okay? But the reality is we don't do that in healthcare. We don't set error rates and say this is our tolerable error rate. This is how many falls we'll tolerate. This is how many needle sticks we'll tolerate, right? 
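As a rough illustration of the "just set it" idea above, here is a minimal Python sketch. The 1% tolerable rate, the operator names, and the counts are all hypothetical and are not taken from the paper mentioned in the stream; the point is only to show what screening data-entry staff against a preset error rate could look like.

```python
from dataclasses import dataclass

# Hypothetical tolerable error rate for the study (an assumption, not from the paper).
TOLERABLE_ERROR_RATE = 0.01


@dataclass
class EntryTest:
    """Result of a data-entry test for one person (all values hypothetical)."""
    person: str
    fields_entered: int
    fields_wrong: int

    @property
    def error_rate(self) -> float:
        # Share of entered fields that did not match the source form.
        return self.fields_wrong / self.fields_entered


def qualifies(test: EntryTest, tolerable: float = TOLERABLE_ERROR_RATE) -> bool:
    """True if the person's tested error rate is at or below the preset rate."""
    return test.error_rate <= tolerable


tests = [
    EntryTest("operator_a", fields_entered=2000, fields_wrong=6),
    EntryTest("operator_b", fields_entered=2000, fields_wrong=45),
]

for t in tests:
    verdict = "can do single entry" if qualifies(t) else "should not do data entry"
    print(f"{t.person}: error rate {t.error_rate:.4f}, {verdict}")
```

The design choice is the one argued for in the stream: decide the tolerable rate up front and screen people against it, rather than paying for dual entry and a record-by-record comparison.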
And so we can't, we don't have anything like, here's our tolerable number that we could put there. You know what I mean? Yeah, and I think I absolutely agree. I, and I was gonna say something to that effect, but here it is. I mean, it would be unethical at the end of the day. Well, wait a second, wait a second. Are you saying my field is unethical? You gotta break the eggs to make an omelet. No, I'm joking because my field is about mortality, morbidity, right? It's not gonna go away. I gotta predict it, right? So you're like, it would be unethical. And you know, that's what most people think, but the reality is it is ethical. We have to plan. People are gonna die, you know? People are gonna get in car accidents. Do you wanna make car accidents go away? Make cars go away, right? Like this is the issue with guns, is right? Like in Canada, there's a gun rate. There's a gun murder rate. There's a murder rate in Saudi Arabia where guns are illegal. You know what I mean? Like there's, I mean, they're not illegal. Like you have to have permits and stuff, but there's always going to be a murder rate by guns wherever guns are available, right? And it's just like, well, what's a tolerable murder rate by guns, by civilians? What is it? And no one can actually even like, everybody's like, oh, that's unethical. Well, it's not unethical because we have to make a decision. And how do we make a decision with no data? That's the issue, right? Because, you know, the CDC has made it so that we couldn't, and I'm not, the CDC wasn't alone, like Congress worked with them, but made it so we couldn't just collect data. There's a few, I think Wisconsin, there are departments of public health who would collect state data. And what they would do is like, it was really hard to make a case definition. Like they would collect data on, every time they knew a gun went off or something like that, right? 
But the bottom line is, no matter how much patchy data you collect on guns, there's the basic idea is that the more you have, the bigger your rate is going to be, right? And so like the, so the more you have, like I used to be married, and my ex and I, we had a pinball machine, the more moving parts you have, the higher the likelihood, you're not going to be able to make your pinball machine work, right? And so it's like the more guns you have out there, the higher your murder rate's going to be, even if everybody's responsible and everything, it's just the way it works, right? And so, so yeah, so I guess that's why with me, I have to worry about denominators and all that. I'm living in unstable territories here, where it sounds like, if you're dealing with something like restaurant soup, like even if you've got seasonal things, you don't have whack-a-doodle things like pandemics, right? I guess even now. A huge drop in price of chicken broth. Well, yeah, I guess now maybe, maybe we're all going to have to revise our feelings about Pareto charts or anything else, because look at these supply chain issues. I'm complaining about COVID screwing up our health data, but I'm sure supply chain issues, you're going to see, you could theoretically just see mushroom barley be very low just because they couldn't make it, because they couldn't get barley or something. Exactly, exactly. And that's a good point too. Wow, yeah, that's interesting. It's, you know, part of my other title is of course supply chain. Well, yeah, yeah, so I was just going to throw this up here for more information, because you had talked about the ASQ website, which I put a screenshot from there. So I don't know who's listening to this. You're probably interested in data science. I don't know what field you're in, but if you're interested in quality control, that's not in healthcare, that really works. You want to go to the ASQ website because it's got tools. 
I mean, Joe, why don't you talk about, you said the ASQ website's really good, huh? Yeah, I'm very interested, of course, in reliability engineering. That's kind of where my foot is solid, you know. When you say reliability engineering, what exactly is that? It supports warranty basically. I see, I see. You know, that's where I get back to Weibull. So what a Weibull distribution is for is determining the life of something, right? If you need a ball bearing to go a million turns, then you produce a ball bearing, you test it, and then as long as you make it the same way with the same material, you expect it to go a million cycles. Oh my God, what a dream. I wish healthcare was like that. I know, I know. So that's where you can kind of cross your arms as an engineer and say, look, I know that's not what broke, you know. Yeah. I was having a discussion like that the other day with somebody about, they were complaining about something. Well, you know, how many cycles does it take to, I'm like, you didn't cycle it at that point in the life of that product to the point where it would break, you know? I mean, that's- Oh, you didn't run your study long enough. Yeah, like for five minutes, basically. You didn't give enough heart attacks in your study. Yeah, yeah, exactly. How do you induce heart attacks? Well, what your problem is, Joe, is you got to get people with all the risk factors, ready to pop, you know, and we need to do that. And I'm joking, but I'm not. That's what we're taught, is if you want to study like incident diabetes, like you want to do a cohort study where you get a group of people who don't have diabetes and you want to watch rate of diabetes and see if it's different in like people who eat, for instance, the Atkins diet versus people who don't, whatever, then that's what you're doing is you're getting people who are ready to get diabetes, you know, almost at that point. So, but if you get young, healthy people, they might get diabetes. 
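To make the Weibull point above concrete, here is a small Python sketch of the standard two-parameter Weibull reliability function, R(t) = exp(-(t/scale)^shape), and its inverse. The shape and scale values are made-up illustration numbers, not figures from Joe's work; in practice they would be estimated from life-test data.

```python
import math


def weibull_reliability(cycles: float, shape: float, scale: float) -> float:
    """Probability that a part survives past `cycles` under a 2-parameter Weibull."""
    return math.exp(-((cycles / scale) ** shape))


def weibull_life_at_fraction_failed(fraction_failed: float, shape: float, scale: float) -> float:
    """Cycles by which the given fraction of parts is expected to have failed."""
    return scale * (-math.log(1.0 - fraction_failed)) ** (1.0 / shape)


# Hypothetical parameters for a ball bearing (assumptions for illustration only).
SHAPE = 2.0          # values above 1 mean the failure rate rises with wear
SCALE = 3_000_000.0  # characteristic life in cycles

target_cycles = 1_000_000
r = weibull_reliability(target_cycles, SHAPE, SCALE)
b10 = weibull_life_at_fraction_failed(0.10, SHAPE, SCALE)

print(f"Share expected to survive {target_cycles:,} cycles: {r:.3f}")
print(f"Cycles by which 10% are expected to fail: {b10:,.0f}")
```

The same functions also show the "you didn't run your study long enough" problem: with these parameters, almost nothing is expected to fail in the first few thousand cycles, so a short test would observe no failures at all.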
You just have to follow them a lot longer, which is what you were saying is like, well, if you don't follow it very long, you won't get any failures, right? Right, right. And the nice thing about inanimate objects that don't have, what is it called, sentience. You can beat the heck out of it. You can beat the heck out of it until it breaks. That's the cool thing. Well, that, you know, it's interesting because when I was learning the Cox proportional hazards model, I was just like, Cox proportional hazards? I was like looking at these words. And I love my professor, Dr. Zue at USF. I said to him, I'm like, what is this? And he goes, Monika, please imagine this scenario. You have an ignition and you have a key, you know, like in a car and you want to see how many times you can start it before it fails. So you start it, start it, start it, start it. Oh, maybe it fails at the 1000th one. Then you do another one, but you have machines do it and I go, oh yeah, you know, because I used to be a fashion designer. We had machines that would go to your fabric and see, you know, the tensile strength and when it broke and stuff. But then after he explained it, I was like, what does that have to do with health care, right? Yeah. Yeah, and, you know, the interesting thing about statistics, it comes from farming. It's an agricultural, you know, everybody thinks of statistics. Statistics started in farming and ag. Well, you know, the t-test. T-test, all that's from farming. Well, the t-test actually, I put it on one of my videos, is from a guy who worked at Guinness who was testing samples of small batches, you know, for yeast or something. Yeah, yeah, exactly. Yeah, so I guess in a way, I mean, this kind of rounds out our talk today is that we're talking about how there's all, you know, statistics is statistics. The t-test is the t-test, you know, the Pareto chart is the Pareto chart. 
But when you move these things from domain to domain, you can be torturing the data if you use the data from this new domain in the same way as the old domain. You know, which is actually kind of the argument. I made a little article today to promote our live stream sort of, but to also talk about the related issue, like I just mentioned survival analysis, is that one of the things you said, Joe, that I wanna highlight is there's a way to have accurate data when you're doing what you're doing. You know, like if you go to the accountant, you'll know how much mushrooms were bought, like there's a way to have accurate data. You know how much you paid. And, you know, like that there's a way to know how many times this, you know, ignition went. In healthcare, there's really not a way to know when that tumor started growing. You know what I mean? There's a way to know when we diagnosed it or when we decided that you had a clinical tumor, but there's not a way to know when it started growing. And there's not a way to know when a lot of things occur in healthcare. And so, like when exactly? And that was kind of the point of my article is if you don't have good accurate, like super accurate time data, how could you do time-to-event stuff? Because that's time, right? And so like one of the main things in healthcare data is convincing yourself that you believe the time variables, whereas I have this fantasy in my head and tell me, Joe, if this is correct, is that on the machinery and stuff that works with automotive things, like, you know, you would need, you know, like a Tesla or whatever has all this data in it. I mean, you'd have time stamps and stuff, you'd have really accurate data if you were trying to do warranty engineering, right? Yes, and the inaccuracy comes from whoever's doing the data entry. And typically- You don't have a lot of data that's not really entered. It's just like from sensors or from- Yes, true. Yeah, there's plenty of transducers collecting data. 
That's true. But the warranty data I dealt with had nothing to do with collecting that data. That's a quality function, not a warranty function. So I get a broken part and I look at it and they're like, well, what's going on with this part? You know, well, the failure appears to be a type of fatigue, you know, and they're like, well, what would cause that fatigue? Well, it's a cycling fatigue, you can tell because of the way it started and then an impulse force caused the final fatigue, but it was- So you do forensics on the part, basically. Right, right. Or it could be, you know, a bad or an operator that didn't really know what they were doing and something wasn't torqued down properly. Even though maybe the apparatus that they were using to drive the fastener in said that it was good, but it wasn't, you know, maybe there's just more. But this is something you could tell from just looking at it or you have to clear you. Oh, you could tell from just looking at it, so great. So you kind of collect your own data too. Yes, yeah. A lot of it's just like, okay, when was the part made? You know, where was it made? What lot is it from? Oh, just like administrative data about the part as well. Yeah, it's all traceability, right? The more traceability you have, you know, the easier it is to get to, oh, wow, yeah, I remember that problem, right? And that's again where Pareto can be very handy. Like for instance, if you know something about some upper level part of that part, you know. I see, so when you're doing an investigation of one particular situation, it's really helpful to have a Pareto chart of what's going on in the background. Yeah, let's say it's not, well, let's say it's a fastener. So the fastener has to case harden, for instance. And you can tell whether it's been properly case hardened or not by looking at it. 
And I don't mean, you know, you got to look at it under a microscope or you've got to do some chemistry analysis, like, oh, this thing wasn't case hardened for instance. And once you realize that that's what happened or, you know, it wasn't properly case hardened, then you can say, oh, yeah, well, that's why it didn't drive or whatever. But you know, it reminds me a lot of outbreak investigations. Of what? Outbreak investigations where like, a whole bunch of people got sick at this wedding. So we know something went wrong at the wedding. So what they did? Yeah, see, remember when I told you from the very beginning last year when you and I first started talking, I said, this is exactly what I do. Oh, I get it now that you kind of do case control studies, but you often only have one case, whereas we'll have an outbreak. So like, let's say 20 people at a wedding of, you know, a hundred people get sick. What we'll do is we'll take, I've never done it, but this is how you do it, is you take the 20 people and other people that came with them from their family, whatever, who didn't get sick, and you ask them all the same zillions of questions, which of course the sick people love answering about, and you get the list of whatever was provided at the wedding. And then you do case control analysis, you do like odds ratios. But if you only have one person, like somebody got monkeypox in Massachusetts. So that was kind of like the situation you had, is you just had to figure out, you know, why did that part fail? Why did this accident occur or whatever it was? Right? Exactly, exactly. And, but yeah, very, like I said, everything that you do is very similar to what I do. Well, but I guess, yeah, but you know, it really matters, you know, one of the things, I guess, well, it's time to wrap it up now. Thank you so much, Joe, for coming. 
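The case-control step in the outbreak investigation described above comes down to one odds ratio per food item. Here is a minimal Python sketch of that calculation; the food names and counts are invented for illustration (20 sick guests as cases and 20 well companions as controls) and do not come from any real outbreak.

```python
def odds_ratio(exposed_cases: int, unexposed_cases: int,
               exposed_controls: int, unexposed_controls: int) -> float:
    """Odds ratio from a 2x2 table: (a * d) / (b * c)."""
    denominator = unexposed_cases * exposed_controls
    if denominator == 0:
        # A zero cell makes the plain odds ratio undefined; a real analysis
        # would need a continuity correction or an exact method.
        raise ValueError("odds ratio undefined when b or c is zero")
    return (exposed_cases * unexposed_controls) / denominator


# Hypothetical wedding menu. Each tuple is
# (cases who ate it, cases who did not, controls who ate it, controls who did not).
menu = {
    "crab_dip": (15, 5, 5, 15),
    "green_salad": (10, 10, 10, 10),
}

for food, (a, b, c, d) in menu.items():
    print(f"{food}: odds ratio = {odds_ratio(a, b, c, d):.2f}")
```

An odds ratio well above 1 flags a food that sick guests were much more likely to have eaten, while a value near 1 suggests no association. With a single failed part, as in the warranty setting, there is no table to fill in, which is the contrast drawn in the conversation.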
I guess one of the last things I can say here, let me go back to the slide about you, is that I have a blog post for people who are just starting out on their, here it is, this is the picture. Just starting out on their data science journey, which I'm gonna go back on YouTube and I'll link to it in the description. That blog post is not, it's kind of about healthcare, but it's got a video on it that's not focused on healthcare. But it basically says sort of four steps to your data science journey, like big picture steps. And the first step really is to choose a domain, like choose where you wanna work. Do you wanna work in healthcare? Do you wanna work in automotive or warranty? You know, because I'll tell you, first learn the data and what they do with it in that field. Don't start by going, oh, I wanna learn Python, you know, because maybe they don't use Python in that field. And don't start by going, I wanna learn machine learning because maybe they don't do that, or even if they do that, you have to understand the data before you're making a machine learning algorithm. Exactly, exactly. Yeah, so listen to people who've been around the block for a while. That, we agree. Well, thanks so much Joe for showing up and sharing your wonderful knowledge with us. I learned so much from you. I really appreciate it. Maybe you'll come back, right? You'll come back. Oh, of course. I love it, I love it. Okay, well, I'll let you go. I'll let you have a good weekend. Thanks everybody else for showing up. If you liked this, go ahead and put a like, put a comment. I've got it on YouTube, I've got it on LinkedIn. I come back, I answer my comments. If you've got a question for Joe, go ahead and put it, we'll try to get back to you. All right. Oh, there's CJ here. We gotta say hi to CJ. CJ's my favorite fan, right? Besides Joe. All right, everybody. 
Oh, this is why I have to say, CJ's my favorite fan because he gives me free original music that he makes. That's for my video. So please check out him on Spotify, YouTube, whatever. He's got great music. You like it, Joe. He does covers and stuff, but it's got that sort of post-punk grunge garage sound. You'd like it, Joe. All right, everybody, have a good weekend. Thanks a lot, Joe. Bye. Bye-bye.