A little bit about myself: I am a professor at INSOFE, the International School of Engineering. I am based here in Bangalore; we have offices in both Hyderabad and Bangalore. INSOFE is an education institute focused mainly on corporate training: we run weekend programs as well as corporate training engagements, and we also do consulting and R&D. Prior to joining INSOFE I was in consulting with Ernst & Young; before that I was with IBM Research; and early in my career I also worked with startups. This talk is my attempt to share what I have learned over the years working with various clients and seeing how human factors impact data science projects.

You might be wondering about the title of this talk: behavioral biases. I'm using the term "behavioral biases" to distinguish these from biases in machine learning. In machine learning we have the concept of bias and variance; I'm not talking about that kind of bias. I'm talking about human factors.

With that, let me get started. Let's look at what's happening in data science these days, and why this topic is becoming more important. Initially, data science was limited to a few companies: the Googles, the Facebooks, the Netflixes. What's happening today is that data science is moving down the ladder. Even a decade ago, the main users were traditional data-driven businesses: insurance and risk, logistics, e-commerce, digital marketing. Today that's no longer true. Industries like healthcare, physical retail, and manufacturing, even small businesses, as well as support functions like customer support, finance, procurement, and HR, are all looking at using data science.

What's different about the organizations in the lower two boxes versus the upper two boxes is the kind of people you talk to. The people in the upper two boxes understand data and data science. The people in the lower two boxes are good at running a traditional business, but they don't really understand data science, and often they don't understand probability either. You have to talk to them on a different wavelength, and the gap between their understanding and that of data science folks creates a mismatch of expectations.

So in this lecture I'm going to highlight some of the pitfalls that come up, the reasons they show up, and how we can avoid them. Just being aware of these pitfalls upfront will improve your chances of success, or let you conclude, "This project has a low probability of success; let's not do it." So let's look at how a typical data science project gets started in an organization in the lower two boxes.
It might start, let's say, with the CEO or a senior VP, somebody at that level, who read something in the latest journal or online: "This organization in my industry is using data science; I should also do something with data science." That's how it starts. I've called this person the HiPPO, the Highest Paid Person's Opinion, if you've heard that term. These are people who may be thinking, "If I do data science, AI, machine learning, and talk about it, my stock price may go up." That could be part of the motivation. They're also expecting an ROI, but they're not sure what ROI to expect.

So they catch one of their line-of-business managers and say, "I want to do a data science project in this company. Do you have some projects where data science would help? And I'm going to hold you accountable for the benefits." The line-of-business person also doesn't know much data science, so he goes to his favorite IT vendor: "Can you help me get some consultants on board and staff up a data science project?" Well, you never ask a barber whether you need a haircut, so don't ask a consultant whether you need help. Some data scientists come on board, and he tells them, "Go and talk to the end users." Each of these people has their own agenda for the project. Let's see what pitfalls show up in this kind of project.

The first pitfall is failure to set the right expectations. Let me take an example. You've heard of IBM Watson: IBM's AI platform, on top of which they have developed specific solutions in certain verticals. At the end of July, an article appeared in STAT, an online site for medical news that is quite widely followed, saying that IBM Watson gave incorrect cancer treatment advice. The data scientists here will think, "So what's the big deal? Machine learning is not supposed to be 100% correct. If it can recommend the right cancer treatment 80% of the time, maybe that's good enough." But that's not the way medical professionals see it. And once it appeared in STAT, it reached the Wall Street Journal: Big Blue spent billions on its Watson AI product with a focus on cancer care; sometimes it didn't say more than what doctors already knew.

Sometimes? How can you expect machine learning to tell you something new, beyond what doctors already know, every time? You can't. Most of the time it's going to tell you what the doctor already knew and confirm that belief, which is a good thing: a second opinion agreeing with yours. And if sometimes it tells you something new, that's also great.
But how do you expect it to always say something new? The article goes on: in some cases Watson was tripped up by a lack of data on rare or recurring cancers and rapidly evolving treatments. Of course that's going to happen with machine learning. For rare cases you're never going to have enough data; and if treatments are evolving, if your context changes faster than you can collect data on it, obviously machine learning is not going to give you the right result. Nothing to be surprised about. But the way a data scientist looks at it, the way IBM looked at it, and the way the medical professionals looked at it were totally different: a big mismatch of expectations. IBM responded to this, and there's an article by the VP of Watson online. There was also a Twitter post by Andrew Ng, essentially saying that it's important to set realistic expectations. That's the key phrase: setting realistic expectations. This is really the first pitfall: data science is about making statistical guesses, probabilistic guesses, but sometimes stakeholders expect a sure thing.

Let's try to understand this a little more. There's a behavioral bias called the certainty preference. Why does it happen? In a traditional business, the people we speak to are used to rule-based software applications. Rule-based software is programmed to always give a consistent output: some input comes in, some rules are applied, and a decision comes out. That's not machine learning; that's not a statistical output. It's a perfectly consistent output, and that's what they're used to seeing. So when you tell them, "I can predict cancer" or "I can predict customer propensity using machine learning," the moment they hear the word "machine" they picture a piece of software that produces a consistent, correct prediction every time; one that isn't moody like a human and doesn't make wrong decisions because it's sleepy or hungry. They expect nearly perfect decisions. This is what happens with people in traditional lines of business who haven't yet been exposed to machine learning, and as machine learning moves lower down the ladder, we're seeing more of it.

The way to set the expectation, as I see it, is to tell them the output is not going to be 100% correct. A machine learning project is more like an R&D project than a software implementation project, and it needs to be planned as such.
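To make the contrast concrete, here is a minimal sketch (the data, names, and numbers are all illustrative, not from any project discussed here): a rule-based system returns the same answer every time, while a statistical model returns a probability that has to be communicated as an educated guess.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# A rule-based system: same input, same output, every single time.
def rule_based_decision(income, debt):
    return "approve" if income > 3 * debt else "reject"

# A statistical model: the output is a probability, i.e. an educated guess.
X = np.array([[50, 10], [20, 15], [80, 5], [30, 25], [60, 20]])  # [income, debt]
y = np.array([1, 0, 1, 0, 1])                                    # 1 = loan repaid

model = LogisticRegression().fit(X, y)
p_repay = model.predict_proba(np.array([[40, 12]]))[0, 1]

print(rule_based_decision(40, 12))   # always the same answer for this input
print(f"P(repay) = {p_repay:.2f}")   # a guess, never a guarantee
```

Stakeholders who have only ever seen the first kind of system will, by default, expect the second to behave the same way.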
All right, so once you have set this expectation, what next? You need to anticipate what they're going to ask: "How much accuracy can I expect from this?" Either they ask you that, or you figure it out and tell them, "This is how much accuracy you can expect," and then they say either "Okay, go ahead" or "It's not worth it."

So the next pitfall is failure to baseline. For example, if the IBM Watson team had known what the doctors were expecting, they might have tried a different approach. Failure to baseline is one of the major challenges: if you don't know where the line is that you need to cross, how will you know your chances of success? You might end up putting a lot of money into something with a very low expectation of success. You need to know the threshold.

Let's take the example of speech. Many of you attended yesterday morning's keynote by Anand, which talked about how speech recognition today has reached an accuracy of about 95%. When it hit 95%, it reached what is called a tipping point. You may have heard of the book The Tipping Point by Malcolm Gladwell; he describes how, say, a startup runs for two, three, four years without success, then something changes in the external market and it suddenly takes off. The same thing happened with speech. The moment accuracy reached 95%, human adoption took off, because for humans a 10% error rate is just too much to correct, but a 5% error rate I can live with. That 95% was the tipping point.

The same is true in other applications of data science: you have to know the baseline, the tipping point, for your solution to be accepted by the users. And this is very different in consumer and enterprise settings. I'm borrowing here from the 2 PM talk by Om Deshmukh, who referred to this: 80% accuracy might be good enough in some contexts, whereas in others even 95% is not good enough. With consumers, say in marketing, the tipping point is fairly forgiving. For example, say you're doing email marketing, and your marketers are using a rule-based system to send out emails every day and getting a 2% click rate. If using data science you can improve that to 3%, they'll be satisfied; at 4% they'll be happy. In those cases the tipping point is not very strict, but you still have to figure out what it is. This is one of the pitfalls: people don't establish the baseline, and they pour money into things with very little hope of reaching it, because they never tried to identify it.
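Staying with the email example: before celebrating a 2%-to-3% click-rate lift over the rule-based baseline, it's worth checking that the lift is statistically real. A minimal sketch with hypothetical counts (a two-proportion z-test is one simple option):

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: rule-based baseline arm vs. model-targeted arm.
clicks = [200, 300]          # 2% vs. 3% click rate
sends = [10_000, 10_000]     # emails sent in each arm

z, p_value = proportions_ztest(clicks, sends)
print(f"z = {z:.2f}, p = {p_value:.4f}")  # small p: the lift over baseline is real
```

The same habit, stating the baseline and testing against it, applies whether the baseline is a rule-based system, an existing model, or a human expert.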
So now let's say you've identified the baseline. What next? You have to get a team on board, a bunch of data scientists. When we talk about data scientists, we typically look for two skills, a mix of math and programming, but attitude also matters. Let me give you an example you may have heard. A traveler came upon three men working. He asked each of them what they were doing. The first man said he was laying bricks. The second said he was building a wall. The third said he was building a cathedral. Which of the three do you want on your team? The third one. That attitude is very important. I keep saying a data scientist is not just a software engineer plus a mathematician; it's somebody who also has the right attitude to formulate and solve a business problem.

So how do you formulate the business problem? The challenge is that the education of data scientists today largely consists of working with standard data sets, where the decision variables and the target variable are given. With that you can say, "I got 1% higher accuracy than the other person on Kaggle," or "I published a paper at this conference." The real world is different. In the real world, nobody tells you the objective or the target variable. The business problem is "make me more money," "reduce my costs," "reduce my headcount," "help me do more with fewer people." You have to figure out the decision variables. They say, "The data is sitting in the data warehouse; go pick whatever you want." You have to figure out what data is relevant, what the target variable is, what you want to optimize. Nobody hands you any of that.

Here's an example I worked on. We were doing a project for a large telco in India, and they came to us saying, "We want to improve the revenue from prepaid customers." This was about three years back, before Jio came in and upended everything, so in those days there was still money in optimizing these things. They said, "The data is in the database; take what you want." (Sorry, the font on this slide is not very clear.) Essentially, they wanted us to optimize their prepaid plans. They had open-market plans like this one: an STD pack for 14 rupees, valid for 28 days, giving a discounted STD rate of 30 paise per minute. What they asked was: can you come up with customized plans, offered through SMS only to certain users or segments of users, and can you figure out how to target them?

So the decision variables were left to us. You can change the pack price, the call rate, the validity, within certain constraints; you have to figure out what is most effective to change and by how much; what we want at the end is more revenue. Nobody is telling us the decision variable. The team said, "Let's look at the data. We'll focus on the STD rate; that's the lever, the decision variable. And the target variable is whether somebody accepted the plan or not: the propensity to pick up the plan." Does that make sense? No. What the customer wants is a revenue increase. If you make the target variable the propensity to purchase the STD plan, you might end up pushing the plan to the customers who were already taking it; you'd see close to 100% propensity, but because the plan carries a lower rate, you might actually see a drop in revenue. What we really have to target is not propensity but revenue increase. These are the kinds of challenges we see with relatively inexperienced data scientists, and this was one of the pitfalls we had to deal with; a toy illustration of the difference follows.
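Here is a minimal sketch of why the two formulations diverge (all numbers invented, not the telco's data): ranking customers by propensity selects exactly the heavy STD callers who were already paying the full rate, so acceptance looks great while revenue falls.

```python
import pandas as pd

# Hypothetical offer history: monthly ARPU before/after the pack was offered.
df = pd.DataFrame({
    "accepted":    [1, 1, 1, 0, 0, 0],
    "arpu_before": [300, 280, 310, 120, 100, 130],  # heavy STD callers accept...
    "arpu_after":  [270, 255, 280, 118, 102, 128],  # ...then pay the discounted rate
})

# Formulation 1: target = acceptance. Looks wonderful.
print("acceptance rate:", df["accepted"].mean())

# Formulation 2: target = revenue change. The number the client actually cares about.
uplift = (df["arpu_after"] - df["arpu_before"])[df["accepted"] == 1].mean()
print("avg revenue change for acceptors:", round(uplift, 1))  # negative: we discounted existing usage
```

The model that maximizes the first target can be exactly the model that hurts the second.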
I think you saw the same Einstein quote in yesterday evening's talk: you need to spend a lot of time just thinking about the problem and formulating it. That's the hard part. Solving the problem is much easier these days; ten years back we didn't have the nice packages we have today that make solving so efficient. Going forward, the thinking and formulating part will become more and more important, and the solving part easier and easier.

[Audience question.] Yes, good question, but we are talking about this company's revenue; we're not looking at other companies at all. So yes, we had to do an experimental design, and we looked at it segment-wise: within each segment we ran an A/B test, giving the offer to some customers and not to others, and measured the effect from that.

The next pitfall I've noticed is availability bias. Typically the client says, "Look at all the data in the data warehouse; take what you want." But we don't actually get all that we want. Whoever we speak to, the DBA or the marketing person, says, "This is the data I deal with, so this is the data I know about; take this." And what I've seen is that the average data scientist will focus on just that data. They can make sense of the data in front of them, but what they have a very hard time seeing is what's missing: which important variables are absent. It's like the cartoon here: you're heavily influenced by the data you see, and you don't think more broadly about all the other factors that influence the target.
So it's important not just to look at the data you have, but to figure out what else is important. I did a project for the call center of one of the large banks in Australia, where they already had a cross-sell model built with the usual regression and decision trees, and they asked us to help improve it. We looked at the model and felt we could perhaps improve the accuracy, but we couldn't say by how much; they already had a good team, so was it worth taking up? The approach we took instead was: let's figure out what data is missing and make it available to the decision tools. The people who came to us were the call center folks, who make cross-sell offers while the customer is on an inbound or outbound call. So we said: once the agent has the customer on the phone, let's ask the customer two or three questions. These were credit card customers, so we asked, "What do you like about your card? Do you like the rewards program more, or the low interest rate?" That one question tells us a lot about what the customer cares about. Someone who cares about the rewards program probably doesn't worry much about interest because they have enough savings, whereas a customer who cares about the interest rate is more likely to be interested in loan or credit-related offers. That was a very valuable piece of information, and using it we improved the cross-sell rate by 20%. In other words, availability bias is the tendency to focus only on the data you have and neglect other data that is relevant or should be captured.

Another example is from retail. One of the challenges in retail is pricing: how much discount should I give to maximize overall profit? A data scientist might take the data at hand, look at the sales of an item against its price, and fit a regression model. But that's not good enough. Maybe the customers are coming to the store because there's a big sign outside announcing a sale. The store manager's decision to put that sign up is not captured in your data. So how do you infer, indirectly, that something like that is going on? One way is to look at the total footfall into the store, or the total number of transactions that day: an unusually high number suggests some promotion is running, and you can normalize for it. Those are the kinds of things you really need to think of; a sketch follows.
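Here is a minimal sketch of that normalization idea (the data and column names are invented): total store transactions act as a proxy for unrecorded store-wide promotions, so including them stops the price coefficient from soaking up demand that the sign outside actually created.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical daily data for one item. 'store_txns' (total transactions in the
# store that day) proxies for unrecorded promotions driving extra footfall.
daily = pd.DataFrame({
    "units_sold": [32, 30, 28, 95, 90, 88],
    "price":      [99, 99, 99, 79, 79, 79],
    "store_txns": [1200, 1150, 1180, 2600, 2550, 2500],
})

# Naive: sales ~ price. The discount gets credit for promotion-driven traffic.
naive = smf.ols("units_sold ~ price", data=daily).fit()

# Adjusted: control for the footfall proxy.
adjusted = smf.ols("units_sold ~ price + store_txns", data=daily).fit()

print(naive.params)
print(adjusted.params)
```

It's a crude proxy, but it's the kind of indirect inference you can make when the real driver was never logged.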
So let's say you've got all the data you need. What next? Your team is going to build some models. Data scientists today tend to be aware of a broad range of techniques, but even three or four years back we used to hire a lot of people with a statistics background, a BSc or MSc in statistics, and one gap in their training was that they knew regression models and pretty much nothing else. Whatever the problem was, they would apply a regression model. That's what I call the man-with-a-hammer syndrome: you know one or two techniques, so you apply those one or two techniques to everything. It's important to have a good understanding of the full range of techniques; you can't expect one person to know everything, so build that diversity into the team.

So now you've set up your team and you start working on the problem. What next? Your team comes up with some analysis. Going back to the telecom example, where the client asked us to improve revenue by optimizing prepaid plans: my team came back and said, "Customers who opted for the STD pack increased their spend in the following month by 30% compared to those who did not take it up." That was their observation: a correlation. And based on it they recommended that pushing this offer to more customers would increase revenues further. But does the correlation imply that taking up the STD pack causes revenue to increase? It doesn't. Correlation versus causation is taught in every statistics course, yet people sometimes have a lot of difficulty taking what was taught in the classroom and seeing that it applies to the business problem in front of them. What's happening here is some other factor: the customer already intended to make a lot more STD calls next month, because they or their family were traveling, and that's why they bought the STD pack. Because of that other factor they bought the pack, and they were going to call more anyway. Correlation is not causation.

Distinguishing causation from correlation is not easy; you have to look at multiple things. What we did was work at the segment level. We already had customers segmented, some segments as small as a hundred customers, and within each segment we expected the customers to be fairly homogeneous. So within each segment, we made the special offer to some fixed percentage of customers and not to the rest, and recorded the uptake and the spend in each group. Because we compare groups within the same homogeneous segment, the same percentage of people who intend to call long distance is present in both groups, so the confounding is compensated for, normalized away. [Audience question.] Yes, and obviously this was a discussion I had with the team, so it was well understood. A simulated illustration of the trap follows.
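Here is a minimal simulation of the trap (all numbers invented): a hidden intent-to-call factor drives both pack take-up and higher spend, so the naive comparison shows a big "uplift" even though the pack itself adds nothing, while a randomized offer reveals the truth.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hidden confounder: customers who already intend to make more STD calls.
intends_std = rng.random(n) < 0.2

# Observational world: intenders self-select into the pack.
took_pack = intends_std & (rng.random(n) < 0.8)
# Spend depends only on intent; the pack contributes nothing in this toy setup.
spend = 100 + 30 * intends_std + rng.normal(0, 10, n)

naive = spend[took_pack].mean() - spend[~took_pack].mean()

# Randomized world: within a homogeneous segment, offer the pack at random.
offered = rng.random(n) < 0.5
ab = spend[offered].mean() - spend[~offered].mean()

print(f"correlational 'uplift': {naive:6.1f}")  # ~ +29: pure selection effect
print(f"randomized uplift:      {ab:6.1f}")     # ~ 0: the pack adds nothing
```

The spend difference my team observed is analogous to the first number; the segment-level A/B test is how we estimated the second.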
How am I doing on time? Okay. So now you're getting close to solving the problem. What next? There are also human factors to consider. As I said, data science is not just software and math; you also have to look at the human side, at psychology. In fact, I think psychology should be taught to all data scientists, because all of us deal with human decisions. If you're working on the consumer side, at a Facebook or a Twitter, you need to know how consumers behave. If you're working inside an enterprise, you need to know how your enterprise users, the decision makers, will like or dislike what this black box is telling them. Psychology matters.

For example, you've probably heard this one: if you're a retailer with a $20 bottle of wine in your store and you want to increase its sales, the trick is to put a $100 bottle next to it. No data science can tell you that; you need to know psychology and human behavior. The same goes for marketing, because so much of what data scientists do involves consumer behavior. Say you're trying to measure the effect of a promotion, some advertising; this slide shows an ad for Dove soap, a fairly old ad from the US. There's the famous quote where the chief marketer says, "Half of my advertising budget is wasted, but I don't know which half." So the CMO asks you to figure out how effective these ads are. You do the analysis and report that the new campaign has no lift on sales, or a negligible one. But the marketer will say, "It may not have an immediate lift on sales, but it creates brand awareness, and that brand awareness sticks for a long, long time." That, again, is something you only learn from a domain expert.

More broadly, in the enterprise case you might see that users are not following your recommendations. What do you do? Almost a decade ago I worked on an internal project at IBM, building a tool for assigning practitioners to projects. IBM at that time had a large bench, they were interested in minimizing it, and there was a lot of mismatch between the skills available on the bench and the skills needed for the projects. We built a tool to do a global optimization of that assignment. When we launched it, initially we would send an Excel report with the matching recommendations. Users weren't really looking at it: "An Excel file; I don't know how it was created, what the basis for this is." Then we put it behind a web-based interface, and that made a huge difference. People who had hardly looked at the Excel file found the web interface far more engaging; we could add many more features, and when you clicked on a person you could see and understand why that person was being recommended, because we surfaced those explanations interactively. That made a big difference: explainability is very important. A small sketch of the idea follows.
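As a minimal sketch of what those interactive explanations can look like (all names, weights, and features here are hypothetical, not the actual IBM tool): with a linear match-score model, you can surface the top contributing skills alongside each recommendation.

```python
import numpy as np

# Hypothetical match-score model: score = weights . skill_vector.
skills = ["java", "spark", "banking_domain", "onsite_experience"]
weights = np.array([0.8, 1.2, 0.5, 0.3])   # learned coefficients for this project
practitioner = np.array([1, 1, 0, 1])      # skills this practitioner has

contrib = weights * practitioner
top_two = np.argsort(contrib)[::-1][:2]
print("Recommended because:", ", ".join(skills[i] for i in top_two))
# -> Recommended because: spark, java
```

Even a one-line reason next to each recommendation changes how much users trust the black box.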
The third thing: do you understand what the users are measured on? If they're measured on revenue, say, and you're trying to optimize profit, they may not like your recommendations. That's very important to know.

All right, so we've now covered the human side. I've talked about some nine different pitfalls and biases, and many of these come together. When they do, the result is the Dunning-Kruger effect, what in everyday language we call overconfidence: you are more confident than you should be, because you don't know what you don't know. It's very important to know what you don't know. A data science team builds a model, and the end result is not so great, because the team never really thought about the possible limitations.

With that, I've covered all ten, so let me just summarize them, roughly in the order they surface in a project. First, failure to set the right expectations. Second, certainty preference: "I want it 100% correct or I don't want it." Third, failure to baseline: not knowing what accuracy you need to reach for the solution to be acceptable to your users. Fourth, the bricklayer mindset: you think you're solving a regression problem when you really need to think about it more holistically. Fifth, problem formulation, like the telco example: choosing the right target variable. Sixth, availability bias: don't just take the data you're handed; figure out what other data is relevant. Seventh, the man-with-a-hammer syndrome: knowing only one or two models and ending up using the wrong one. Eighth, causation versus correlation. Ninth, failure to appreciate the human side: learn psychology. And tenth, overconfidence bias, especially when several of the others come together. Those are my ten biases and pitfalls.

That was pretty much my last slide, so I'll just say a little about my organization. We are INSOFE, based in Bangalore and Hyderabad. We do training, R&D, and consulting. We have 15 full-time faculty, all with PhDs from top universities, and we work across all these areas. These are some of the clients we have worked with. If you have any requirements, we have a booth outside where you can find out more about us.
And with that, I will take any questions you have.

[Question about getting to the objective.] So the question is, how do I get to the objective? You need to talk to people. The way I would do it: typically I would first have a conversation with the HiPPO, the highest-paid person sponsoring the project, to understand what they want. Then I would talk to the other extreme, the users, to understand their challenges. And then come up with a proposal: try to figure out the meeting point. The CFO or the HiPPO is talking at a high level; the users know the low-level challenges. That intersection becomes important.

[Question about identifying the right features.] When it comes to consumer behavior, you never have enough features; there's always a lot about the customer you don't know. It's basically a matter of speaking to domain experts and understanding what they believe the important variables are. Don't speak to just one person; speak to several. Each will tell you something; put it all together. That's how you figure out what's relevant. The second part is: having figured out what's relevant, is that data captured in the organization? If it's captured, good. If not, maybe you need to talk to the IT team to see how it can be captured; sometimes it might involve doing customer surveys. In the long run it's all worth it, because even if your model fails, the data you collect from surveys or testing is useful for other reasons.

[Audience comment.] Right, very good point. Yes, they will typically talk in the context of their own process and not think outside its limitations, and I haven't figured out a good way around that. There's no formal method; when I talk about this, I'm talking about missing features.

[Question about benchmarks.] What we typically do is look at best practice in the industry. For a retailer, look globally at what retailers worldwide are doing, take that as a benchmark, and starting from that, figure out whether something is missing for this particular, say, Indian context: is there something else I need to capture? For example, with someone like BigBasket delivering to your doorstep, maybe that's an opportunity for the delivery person to ask one or two questions and collect additional information. There are obviously questions of privacy, but they do come to your apartment; they could see, say, how many bedrooms it has, things like that, and collect some more data.

With data science, it's the brains that matter. Pre-existing solutions you buy rarely work. Focus on the brains. Avinash Kaushik, who is kind of the guru of web analytics, put it well.
He said: spend 10% of your budget on tools, and 90% on people.