Hi, am I audible? Yeah? OK, great. So we're going to keep this very interactive, and I don't mean polite interactive, I actually mean interactive, which means, if that's what it takes, BookMyShow tickets on me for fun answers from absolutely everybody. Cool? All right. So, quick show of hands. How many people have heard the phrase fintech? All right, this is a fun audience. What have you guys heard about it? How much in the lending space? Consumer lending. Why is it interesting? I heard Bitcoin. Credit scores. What about credit scores? They don't have any. Yes, that's a great start. We're going to go from there. All right, so we're going to be figuring out how Finomena is building a big-data-driven credit underwriting system. The first few days, we had to recruit a wizard nightmare saying this much, but we're going to figure out what this is about. More so than that, though, this is supposed to be about excitement. Lending has this huge emotional connect with users. It's not a transaction. People need money. Fintech is a need-to-have, not a nice-to-have. Money is a need-to-have, not a nice-to-have. So we're going to be figuring out exactly why all of you, every single one of you, every smart person possible, needs to be thinking about this problem and needs to be figuring out how to solve it. But because there are awesome people in this audience, I will very quickly tell you about myself: undergrad and master's in computer science at Stanford, worked at Microsoft, a VC fund in the Valley, Facebook, built Graph Search, and also was the youngest product manager on HoloLens. But enough of that. All right, who wants to guess what the x-axis is over here? We can see all the sectors, the major sectors, on the y-axis. Any quick guesses for the x-axis? Customers, fraud, both good ones, by the way. What else? Amount of data, yes, we're at a data science conference. Amount of data, even better, amount of data for revenue. If this isn't exciting enough, right? 
So this is a great reason for why we should all be talking about this, but there are a lot more reasons as well. This is just initiation. OK, what's the size of the impact? Why do we care? We've established we're all data geeks, and this is going to be a fun problem to solve. But why does it matter? India's young, credit-starved, mobile-first, data-rich, and just got Aadhaar, with 1 billion people on it. In more detail, what does credit-starved mean? How many people know what percentage of India has credit cards? Random guess. 2%. Yes, a little less than 2%, right? Awesome. About 15% to 20%, like you said, have credit scores, which means 80% have no bureau hit. That's unlike developed countries. We're different today. Half our country is under 25. 65% of our country is under 37. We have never been risk-assessed by traditional scorecards, for whom the average age is about 35, right? We're also getting more and more non-traditional jobs every day. There are more bloggers today who are 19-year-olds, who are somehow earning a good sum just by virtue of creating content, than our parents can imagine. It's actually mind-blowing. We obviously know India has a very large informal sector as well, right? So we're very, very different. We're data-rich, we're mobile-first, just like China. The US is not mobile-first. In fact, a lot of the lending platforms in the US still don't have mobile apps, right? Over here, everything starts from there because we leapfrogged a bunch of stuff. Rising internet access, I'll very quickly just show this. This is mind-boggling. We can even, I don't intend for this to happen, but we can kind of see the inflection point happening somewhere around 2014, right? So from 2010, where we were at about 92 million internet users, we have gone to 45, sorry, 450 million internet users as of July this year, so the year is not even complete. Last year when I started the company, we were at 350 million. 
And I used to say 350 million so much that the 450 million number surprised me today, right? And I just kind of extrapolated for the next five years at 20%, even though last year we grew at 35%, and it still ends up reaching a billion. But we have hit an inflection point, both in terms of internet penetration and smartphone access. And actually, each of them helps the other, right? And now financing comes in, helps more smartphone penetration, which helps internet penetration, which, it's just a really big loop, massive growth awaiting, right? Aadhaar, we have Aadhaar. I met P2P platform owners from Indonesia and China and they were like, we have no system. We suddenly have our own social security number equivalent, but better, with biometrics and a billion strong. If the last six years were about the smartphone wave, we just got another wave started, but with a billion people on it already, right? And small-ticket-size lending: a 60,000 rupee, 80,000 rupee MacBook, right? A furniture loan from Urban Ladder, God, they're expensive. I mean, I love them, but they're expensive. You know, no one's gonna lend you that money even if you actually earn very well, right? But you don't wanna block your credit limit, or you don't have a credit card because only 2% do. No one's gonna be able to lend you that money because of the profit they make from that: your cost of acquisition, your cost of manual underwriting, the way they do it right now, the cost of servicing, God forbid you default. They just don't make money, so it has to be tech-first. So what do we get? We got new-to-credit and non-good credit. Only 20 million people in the entire country, from what I hear, are considered good credit by traditional financial institutions. 20 million versus 1.3 billion, big, big, big difference. We got a target audience because we're younger. It's a huge part of the new-to-credit. Because of that target audience, we have a digital footprint which is massive. 
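As a quick check of the extrapolation mentioned above (450 million internet users growing at 20% a year for five years), here is a minimal sketch of the compounding arithmetic. The function name `project_users` is made up for illustration; only the 450 million starting point and the 20% rate come from the talk.

```python
def project_users(start_millions: float, annual_growth: float, years: int) -> list[float]:
    """Compound a starting user count forward and return one value per year."""
    projection = []
    users = start_millions
    for _ in range(years):
        users *= 1 + annual_growth          # apply one year of growth
        projection.append(round(users, 1))  # keep the list readable, in millions
    return projection


# Figures from the talk: 450 million users, 20% assumed yearly growth, five years.
projection = project_users(450, 0.20, 5)
for year, users in enumerate(projection, start=1):
    print(f"year {year}: {users} million")
```

Under these assumptions the projection stays below a billion through year four and crosses it in year five, which matches the claim that even a conservative growth rate ends up at a billion.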
We have an identity system to prevent frauds and we have a massive white space, right? We're actually, machine learning and technology are strengths not weaknesses, right? Just a couple quotes. A farmer said, hey, you know what? I keep hearing about all these apps. They're developing stuff so I can check crop prices and all that. Can you please help me buy a smartphone, right? A salaried person, again, I get the furniture example already. Another guy, self-employed earns a ton but because it's self-employed, right? And no credit score. We're seeing all over the world in the US and India and China. Millennials don't wanna own credit cards. That's crazy from a traditional perspective because access to credit is supposed to be something we enable for ourselves, like the US, but we're not because we realize our propensity to overspend retail therapy. We all know that happens, right? So, how do we do credit underwriting a scale? Let's break it down. What that get, right? Consumer lending, approvals. More, but this is lending. It's not transaction-based debtor. I can't lend money to everybody. They'll never come back, right? So money needs to come back. So what does the funnel look like? You source it, analyze it, finally make a decision and service it, right? Applying the more and better context to all of them. Very simple, right? For acquisition, I won't go into too much detail, right? You obviously get a lot of volume and it's better than traditional because it's targeted, it's cost efficient, it's more effective. I can run a Facebook script which automatically optimizes 50,000 ads at the same time instead of a manual person trying to maybe max out at 50. Analyzing, this is the meat of what we're gonna cover today. But again, we're only gonna cover a sneak peek of what we're doing because I need to excite all of you about why this is worth doing, right? Obviously there's volume, there's variety, there's velocity. 
We've heard those words, we know exactly where we're going next. Better, how do we do it better? We need to prevent fraud. In China and India, the biggest use cases are preventing fraud. That alone will wipe out your entire loan book. It needs to be personalized and contextual, right? Again, approving, these need to be automatic, machine-learned criteria, but different for every persona. Somehow you're stuck figuring out, well, yes, every person is unique, but they also have to have some similarities, right? And servicing obviously has to be more tech-first. So we've seen the four V's over here, so we know we're gonna try to use data, lots and lots of it, wherever we can find it from possibly, we're gonna be very greedy and we're gonna try to figure out how to solve this. But before that, really fun exercise. Would you give the person next to you a loan? Because you cannot code, but you cannot do it yourselves and I can see a person nodding in the now. But seriously, any fun answers over here? Why or why not? I could start cold calling people, that would not be fun. Exactly, it does depend on how much for, right? Let's say it's a reasonable amount, like 50, 60 K. Probably why? Brilliant, so you already identified spending patterns, right, they spend at the conference. I will not go into the second one for reasons. Great, other fun answers? Come on, look, I'm not kidding, look at the people right next to you. When you observe, you start to figure out what you might base a decision on. No? Last chance. Okay, here's another one. This is a problem statement. Let's say you're running a credit scoring company and you gave a score to this person, Mr. Kunal. He has a credit card, always pays it on time. Yeah, there's lots of other things, the only thing you know of right now, and you score them as score X, right? Now, the new data point you get is that this person goes and closes their credit card down. You increase X or decrease X. Why? 
I heard increase for everybody else, why? You're on fire, by the way, why? Okay, so that was a profit-driven reason, but as a credit scoring person, you have to be very, very ethical. Any other reasons which are non-profit-driven? Uh-huh, so it doesn't require credit, so perhaps might not need credit, cash flows are doing fine. Yeah, so no credit card left anymore, so can default on it. Interesting fun fact, by the way, people don't know this, right? So if you don't get yourself access to credit, you think you're in the safe because you can't default. It actually, the traditional credit system treats you as worse off. They say you're a good credit-worthy person if you keep taking credit in moderate amounts and then keep paying it back and then keep taking it again. So any other reasons for increase or decrease? There are still good reasons for both, yeah. Yeah, increase because the person is presumably managing, like actively managing their expenses and wants to stop overspending, right? Any other reasons? Yeah, first value, yeah? Okay, decrease, see the same reason you said when they were vice versa, right? Maybe the person thinks, let me just close it down. I'm not gonna pay it back, I have no job anymore. Better I got a demotion or I got fired or whatever, right? Okay, so my point was you could kind of argue both ways. The traditional system, though, definitely would decrease. Right, again, for the former reason I mentioned, you're voluntarily choosing to deny yourself of this credit line and it's crazy. I think it's crazy. I like to call myself a millennial. I think it's crazy. Okay, so very quickly, we talked about the credit underwriting part again. Those were just motivational things, right? But basically we can think of so many reasons why we need to start fixing the system. But we're gonna talk about scale for a tiny second. All right, so what does scale mean? Now, in a e-commerce, transportation, delivery context, it just means transactions, right? 
Growth is oxygen, right? Okay, lending, bit different. This is what a transaction-based business looks like, right? There's a visit, there's a bunch of stuff in the middle. Growth and product get the user to purchase, right? Okay, this is what lending looks like. There's a visit, the max growth and sales and product can do is get the person to complete their loan application. Those two arrows in green have to be tech and data and nothing else. Nothing else gets you there, otherwise you just go into the reject file because the money has to come back, right? Big difference over here. It kind of underscores just how much data and tech matter to this, because without that, you're just stuck at being a reject. Here's a fun fact. Raise your hands if you're slightly, you know, shulked in a pleasant or interesting way by it. So if I can make 40 decisions a minute, that's how much I earn, 100 crores a month. Isn't this looking good now? Right, for other reasons. There was the emotional reason, there was the data reason, there the very, very nice lending has inherent interest-built in reason, right? But 40 decisions a minute is crazy. You know why? Because again, think of the very first application you have to press a yes button on and the money goes out of your bank account into someone else's and it might or might not come back. And you're the one responsible for the PNL. How much, how many hours are you gonna take? Forever, how many hours? Roughly, couple of days, okay. So forever, couple of days, I think eventually people get down to like, you know, maybe push yourself two hours, right? One hour tops. It does depend on the data you have. What data gets you to 40 decisions a minute? I would love to know. Yeah, awesome. Yeah, awesome. Sorry, is he an alcoholic? Great. How many of us get disqualified suddenly? Me included. Okay, so yeah, that's the part we're talking about. Let's talk about it more. How do we get to 40 decisions a minute? 
So we talked about spending, we talked about income. By the way, no matter how much data you have and how many metrics you auto-calculate, even going through that will take more than five minutes. You know, just like reading through it, making some sense, being a little bit nervous before pressing that button. Anything else? What else? History, what kind of, oh, history with us. Yeah, oh, repeat cases are awesome. Repeat cases are beautiful. Could you just trust them? What else? Known defaulters, yes, please stay away. Those are easy cases though, what else? Come on, we're at a data science conference. We're the best people in the room. CIBIL score, yes, for those who have one. And for those whose score is good. Why does the person need the money? Amazing. I actually had to fight with my co-founder to put that question in the application. What else? Willing to secure the loan, yes. We are doing unsecured, by the way, because again, we were like, you know, if you wanna solve the affordability crisis in India, those people are not gonna have, I mean, if they need a loan for a smaller amount, less than two lakhs, they're not gonna have much collateral to begin with. But yeah, yeah, if you're doing small business lending, yes, it's a venture profitable art, right? Yes, yes. Consumer lending becomes even more interesting, right? Because you don't have business characteristics to look at. Here's some stuff people often use for examples. I did a lot of this myself as well, right? I'm gonna look at what courses you take on Coursera, and I'm gonna look at which videos you watch on YouTube, huge gasp from half the audience. I'm gonna look at how much you walk every day, how much you sleep every day, how often you recharge your phone and for how much. Very nice to talk about, but without structure, all of this is still meaningless, right? So let's put some structure around it. But are we realizing how complicated this is? A little bit? Come on, please say yes. 
Otherwise, I'm gonna have you solve it. Yeah, okay. And the prices for the solving would be much larger than movie tickets, by the way. All right, so here's some examples, right? We already said some. We said spending, publicly available, web data, of course. I love calling it, you know, private investigators or stalkers, 3.0, right? Just run scrapers while on Google. Government data, data partnerships, mobile phone data is amazing. We are a mobile first country. Biggest advantage ever. And behavioral data. I'm gonna talk about what I mean by behavioral data. It's personally my favorite. I wish more people explored this. But if you have an app on the phone, I can track so much. I don't mean about mobile data. I mean your behavior. How much time you take to fill absolutely everything. Right? Okay, so, but see, here's the thing. People also run into this. Without a structure, without an approach, we just have a big, big haystack. We can be greedy beyond belief, here we are. But we just have a big haystack, right? Good luck finding the golden needles and people keep selling each other data. Well, he's got some money for the haystack, right? Okay, so usually the structure imposed is two things. Actually, the third one is just remove the crazy people and the fraudsters and the known defaulters and any people who suspect of defaulting near future. Remove all of that. Remove identity fraud and other kinds of stuff. But then, capacity to repay and intent to repay. One is extrinsic. Depends on your current cash flows, but also environment variables. In the next, in the term of the loan, right? And it's very hard to predict them, actually. But the other one is intent and no one's been able to figure out intent properly, although we're all trying, right? I will move on. But any ideas for how we do intent or how you would do intent? Again, think of yourself. Why should someone else give you a loan, right? What intent do you have to pay it back or not have? 
Yeah, how will I get access to how, you know, where your dealings are, though? I would love to, but social media. Okay, is someone else screaming at you out with a pitchfork for you? Definitely a bad sign, so negative lists are probably easier to develop than positive ones. There was an answer here? Okay, so start small and then keep extending the credit limit based on the trust that is established. Yes, that's another way to go, right? So most people, I mean, this is where the magic happens, right? Intent, it's really fun, although even for a magician, they know that trick is not that much magic. It's just a lot of practice, a lot of hard work and a lot of thinking. It's the thought process, right? When fields are new, the approach and thought leadership counts. When they're old, we talk about how to do it and how to do it faster and how to do it better and how to do it more efficiently, right? I think of them as Lego blocks and I would love for by the end of this talk for all of you to be thinking about Lego blocks. How many blocks can you build? They're not that hard individually. We just require thought and use case. Think of use cases, think of scenarios. And I'm gonna give you a sneak peek now. Not into everything, but into a good side of it. So here's how we do it, part of how we do it, right? So on the very right, okay, where's it? Here, finally, on the very right is stuff people talk about individually, right? Oh, I'm gonna get behavioral signals. I'm gonna scrape Google. I'm gonna get LinkedIn and Facebook information or Twitter information. I'm gonna parse to your bank statement. We asked for 12 months bank statement. It is lending, it's need, right? I don't need to discount anything. We can parse SMSes, we can parse call logs, we can make a location profile, we can do so much, we can talk about spending patterns. But individually, they're not that effective, right? 
You need to have hypotheses around behaviors and traits and personas and what they actually mean and what they can start telling you. And everybody only thinks about a credit score, right? Either you have one or some vendor company will come and build it for someone, right? Okay, so we started to diversify a little bit. You can't measure intent, but maybe you can measure character. How do you measure character? One way is truth, right? Does the person lie or not? Now let's figure out how we do that. Okay, so I'll give a couple of cute examples over here. I'm gonna spend some time here. I'll give one example, a very petty example, on the bank statement parsing side, right? Because it's not easy. It's really not, right? Mint solved that problem of real-time bank statement parsing and analysis in the US and got sold for $200 million. In India, the landscape is much crappier. It's just non-existent, every bank is different. I know there's some computer system somewhere generating those, but take your own bank statement from seven months ago and read it. And I will give you something if you can recall every single line item on that, right? Because, for some reason, they're not designed to let us recall exactly where we spent our money, which means it makes our jobs very hard, right? Well, I'll give you one example. We don't have to get to 100%. I think that's the other takeaway. But use cases will get you to 5%, 5%, 5%, and they still matter, right? Again, 1.3 billion people to serve. Let's start small, right? Let's start. So what we did is we found one data set somehow, which mapped ATM codes, code IDs, to their addresses all over India. Then we parsed out those codes from every single bank statement for 60-plus banks. So we figured that out. Then we took the address and got the lat-long from Google. Then again, we went to the bank statement where the ATMs were parsed out by day, mapped them back to the lat-long, and got a location profile by day. 
Then we could pseudo-verify residence or work or, if we're lucky, both, because where do you withdraw money from an ATM most often? Either where you work or where you live. And if there's a cluster somewhere else, you have a girlfriend or a boyfriend or a spouse. Your current spouse doesn't know about it, I don't know. But it's very interesting to see those clusters. And again, I use that as an example, because it's a simple example when you think about it. But someone still has to think about it. OK, another example. So, a behavioral signal on the fraud side. OK, so you write your name on the app. Now let's say I'm on the app and I write R-I-S-H-A-B, then backspace, backspace, backspace, backspace, B-D-H-I. If I was only monitoring what comes to the backend after submit, I would miss that. But that's a pretty big indication. Someone's doing something funky. You do not forget your name that much. Right? OK, I'll give another example. I have two dates of birth. Long story. But I think in India it's pretty common, a lot of people have that, this is what I keep hearing. OK, so every single time, irrespective of which form I'm filling, I pause at the date-of-birth field. There's no good reason for any person to be pausing on the date-of-birth field, ever. We all know our dates of birth. That's the one thing we're very proud of. We love being born that day. And I pause. And if someone's actually monitoring that pause, they know something's fishy. That's what I mean by behavioral signals. And that's what I mean by thought and approach more so than analysis. But someone has to think about what blocks to build and how many use cases they can help us with. Right? OK, so the others are sort of, if there's questions, I'd love to answer them. But here's a couple more examples. So on the truth side, I mentioned, oh, OK, here's another interesting one. We mentioned the backspace; now, the back button on the app. Earlier, we had a hypothesis. It's a loan application. It has to be linear, no? 
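The ATM-based location profile described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the ATM codes, coordinates, and withdrawal records are invented, the code-to-coordinate lookup stands in for the address data set plus geocoding step, and rounding coordinates to two decimals is a crude stand-in for whatever clustering is actually used.

```python
from collections import Counter, defaultdict

# Hypothetical lookup built from an ATM-code-to-address data set plus geocoding.
ATM_COORDS = {
    "ATM001": (12.9716, 77.5946),
    "ATM002": (12.9352, 77.6245),
    "ATM003": (13.0827, 80.2707),
}

# Hypothetical (date, ATM code) pairs parsed out of a bank statement.
withdrawals = [
    ("2016-03-01", "ATM001"), ("2016-03-04", "ATM002"),
    ("2016-03-09", "ATM001"), ("2016-03-15", "ATM001"),
    ("2016-03-18", "ATM002"), ("2016-03-22", "ATM003"),
    ("2016-03-28", "ATM001"),
]


def location_profile(withdrawals, atm_coords):
    """Group known ATM coordinates by withdrawal date; unknown codes are skipped."""
    profile = defaultdict(list)
    for date, code in withdrawals:
        coords = atm_coords.get(code)
        if coords is not None:
            profile[date].append(coords)
    return dict(profile)


def top_clusters(profile, precision=2, n=2):
    """Count withdrawals per coarse grid cell and return the n busiest cells."""
    counts = Counter()
    for coords_list in profile.values():
        for lat, lon in coords_list:
            counts[(round(lat, precision), round(lon, precision))] += 1
    return counts.most_common(n)


profile = location_profile(withdrawals, ATM_COORDS)
clusters = top_clusters(profile)
print(clusters)
```

The two busiest cells are candidates for home and work, to be compared against the addresses the applicant declared; a real system would presumably use a proper distance-based clustering rather than grid rounding.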
I mean, you've been used to paper for so long. So there's no reason you choose, say, student once, go and fill out some of the application, then go back, back, back, and try to change your category: no, no, I'm self-employed. You'd think there's no logical reason for that. So we disallowed it. Then apparently, people complained of fat-finger syndrome. So we allowed it and decided to very OCD-wise monitor how many people use it and for what. And we noticed when people were trying to, you know, they choose one category, go all the way to the very end, look at the documents they have to upload, think something. Users are intelligent. They're not static. They're actually learning fast and responding to the product you're putting out and trying to game it. Genuine or not genuine, I'm not even going there right now. But people are still trying, you know, what will increase my chances? What lets me get away with the least amount of stuff uploaded? And they come back and then they try something else, and they come back and try something else. And we're just looking at the data going, what's happening? But it happens all the time. That's interesting. So what we've done now is we allow everything. Just monitor it very sweetly. It's our way at least. Of course, you know, there are other ways of monitoring truth. So it depends on the data type. With the name, if I am actually taking data from so many sources, half the benefit I get is verification, double, triple verification. Most of the important data points I will never trust someone on. I need a second source of verification, very much like journalism. Okay, need score. So how do we start calculating need? Well, we took a couple of stabs. Long list, but here's a couple of examples, right? Time you take to fill out the application. Or when you're chatting with us, how urgently do you want it? How desperate are you? How much are you pinging us? Please, please, please give me a loan. 
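The behavioral monitoring described above (rewriting your own name, pausing at the date-of-birth field, repeatedly going back to change category) depends on logging events inside the app rather than only what arrives at submit. A minimal sketch, assuming a hypothetical event log with a timestamp, a field name, and an event kind; none of these names come from the talk:

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float     # seconds since the form was opened
    field: str   # form field the event belongs to
    kind: str    # "focus", "key", "backspace", or "back_nav"


def field_signals(events, field):
    """Count backspaces in a field and measure the pause before the first keystroke."""
    in_field = [e for e in events if e.field == field]
    backspaces = sum(1 for e in in_field if e.kind == "backspace")
    focus_t = next((e.t for e in in_field if e.kind == "focus"), None)
    first_key_t = next((e.t for e in in_field if e.kind == "key"), None)
    pause = None
    if focus_t is not None and first_key_t is not None:
        pause = first_key_t - focus_t
    return {"backspaces": backspaces, "pause_before_typing": pause}


def count_back_navigation(events):
    """Count how often the applicant used the back button anywhere in the form."""
    return sum(1 for e in events if e.kind == "back_nav")


# Hypothetical session: name typed, partly erased and retyped; long pause at date of birth.
events = (
    [Event(1.0, "name", "focus")]
    + [Event(1.5 + 0.25 * i, "name", "key") for i in range(6)]
    + [Event(4.0 + 0.25 * i, "name", "backspace") for i in range(4)]
    + [Event(6.0 + 0.25 * i, "name", "key") for i in range(4)]
    + [Event(10.0, "date_of_birth", "focus"), Event(16.5, "date_of_birth", "key")]
    + [Event(40.0, "employment_category", "back_nav"),
       Event(75.0, "employment_category", "back_nav")]
)

name_signals = field_signals(events, "name")
dob_signals = field_signals(events, "date_of_birth")
back_navs = count_back_navigation(events)
print(name_signals, dob_signals, back_navs)
```

What counts as suspicious (how many backspaces, how long a pause) is a hypothesis to be tested against outcomes, not something this sketch decides.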
Right, people do that all the time. You see such interesting behavior, it's not even funny. Which, by the way, is why I'm not in a race to take less data. I am lending to the riskiest part of India, which is 80%, yes, but no one's ever lent to them before. I'm gonna take my sweet time and ask for more data and look at their behavior and actually judge it, right? And then try to automate it. They will score, of course, cash flows in real time, but we already discussed why bank statement parsing and automatic analysis is actually very hard. If you're doing anything to do with non-numbers, I mean, you know, you can kinda just say sum all credits, sum all debits, minimum balance every month, that kinda thing. But anything more sophisticated gets harder. Credit score, we all know this. Even for the folks who don't have a credit score, you can kinda calculate things like debt-to-income ratio, timely payment of other bills, right? Phone bills, utility bills, that kinda thing. Hello, SMS, right? Okay, and credit awareness score. What does this mean? Why have I put it over there? Why would I care about this? Anyone? Sorry? Why do I care about that? I mean, I've put it over here, so I clearly care about them knowing about it or not, but why? Yes, with the credit awareness one. So that's more for the credit score, that you could be paying loans on time, but not bills on time. I do that. Self-image, okay, tell me more. So much more than self-image. I'll tell you all the reasons. There are cases where we can see that the person has gone to credit so far, checked their score, they know it's bad, then on the app they'll say, I don't know about it. Might be good. Or, they'll actually have a great credit score, but they'll genuinely not know about it. Or, we end up getting a whiff through multiple data sources. We've covered, you know, how many of these there can be. So they're only to the persona. 
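The simple numeric aggregates mentioned above (sum of credits, sum of debits, minimum balance per month, and a debt-to-income ratio) can be sketched as below. The transaction tuples and amounts are invented for illustration, and the sketch assumes the hard part, parsing the statement into clean rows, has already been done.

```python
from collections import defaultdict

# Hypothetical parsed statement rows: (ISO date, signed amount, balance after transaction).
transactions = [
    ("2016-06-01", 50000.0, 52000.0),   # salary credit
    ("2016-06-05", -12000.0, 40000.0),  # rent
    ("2016-06-20", -30000.0, 10000.0),  # large purchase
    ("2016-07-01", 50000.0, 60000.0),   # salary credit
    ("2016-07-10", -8000.0, 52000.0),   # loan instalment
]


def monthly_summary(transactions):
    """Total credits, total debits, and minimum balance for each calendar month."""
    summary = defaultdict(lambda: {"credits": 0.0, "debits": 0.0, "min_balance": None})
    for date, amount, balance in transactions:
        row = summary[date[:7]]  # "YYYY-MM" prefix identifies the month
        if amount >= 0:
            row["credits"] += amount
        else:
            row["debits"] += -amount
        if row["min_balance"] is None or balance < row["min_balance"]:
            row["min_balance"] = balance
    return dict(summary)


def debt_to_income(monthly_debt_payments, monthly_income):
    """Share of monthly income going to debt repayment; None if income is not positive."""
    if monthly_income <= 0:
        return None
    return monthly_debt_payments / monthly_income


summary = monthly_summary(transactions)
dti = debt_to_income(8000.0, 50000.0)
print(summary)
print(dti)
```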
If by any chance I get a whiff that you don't care about your credit score going down, I'm not giving you a loan. That's my only leverage, you know? Half our parents' generation was like, what will you do if I don't pay back? Right, we have bureaus now. But people still don't know about them. So, we're taking a very financial-literacy-centric and driven approach to lending, because lending has to be responsible. It's either a win-win or a lose-lose. Win-win: I win, the person wins, we're both good. Lose-lose: my money doesn't come back, their credit score gets dinged. Half the time, no one tells them their credit score got dinged. What happens when we all pay minimums on our credit cards? Your credit scores are not doing very fine, right? So, I care about this so much, right? One is: is what they say actually what is? Or even what is, is that a good thing or a bad thing? Do they care about it? Do they want to increase it? Do they not care about it getting decreased? Okay, you know, rich parents, really does not care. I have no leverage against this person. I'm not going to be the bad guy with a gun at their door. So, let me just stay away. Lots of people to lend to. Okay, here's some examples, again, similar, but more categorized by AI category this time. So, image processing, right? Document verification. This would be awesome. I mean, obviously, you know, we're only at like 5, 5, 10, 10 percent each, but I'm showing you just how much work has to be done in absolutely every field possible. Automatic information extraction from the images that are uploaded, right? Auto-generation of reasons for rejection. You want to do tech-first. You want to do small ticket sizes profitably. You need to automate as much of it as you can. And yes, you start from 20 percent, but you keep needing to, you know, increase it very, very steadily, right? Video processing. So, we actually take these 30 second videos on the app I can send. 
We ask them to upload it. Mostly because even with a selfie or a photo, you could still be defrauding me, right? Your friend might not even know you just took their picture, and you're halfway to a loan which you don't intend on paying back. But with an entire video of 30 seconds, where you're supposed to be saying exactly why you need it, it's harder to actually commit fraud, right? We are present offline as well, at the points of sale. Let's say at home appliance stores. Over there, we actually take a video just to make sure the dealer is not defrauding us by trying to just, you know, get random people or no people, fake people, and trying to fill out applications and just get loans, right? Internally, you know, they're just showing growth. So that used to happen a lot, which is why traditional lenders never went for a feet-on-street model. I'm a tech company, I need to solve for it. I don't need to encourage more random people just, you know, being at offline stores, waiting for people to come in and then selling them loans. NLP, lots and lots of stuff, right? Not only stuff like the purpose, right? So that's the field. Ideally, let's analyze it slightly automatically in the future, because there's lots of interesting stuff that comes over here. But the chats are a pure gold mine. The kinds of questions they ask, right? And actually, I'm stopping myself because I was gonna share two examples of, we've sort of shown, if someone is asking these two questions, they're very serious about the loan in a good way and have a very high probability of paying it back. But then, you know, so many potential loan takers, so you will excuse me and forgive me if I don't, right? Their desperation to get a loan, tone analysis, that kind of thing, right? Okay, speech to text, again. So we have a lot of call information from when we call them, just in case, you know, something's amiss and we need to double check. 
One of the philosophies we have at least so far in this stage, not in the growth stage, but in this stage is because we're learning, we need to not assume. So if there's any discrepancy at all, give that person a reason to clarify, right? So we're very, very much on the, just we'll still give you a call and tell you what we think is amiss and just to let you clarify. You deserve that chance, right? I mean, I remember my college admissions time and doesn't matter how many exceptions you're always like, no one likes getting rejected. Yeah, so in this rate, so what I'm doing over here kind of all combines into verification, risk assessment and risk pricing, which we're not doing right now, but we'll start very soon, right? Risk pricing is easier when you start getting repeats, which again, I thought I was months away from multiple loan functionality, but I'm having to do it already. So, okay, challenges, so lots of challenges. That sounded like a very rosy picture. Did it? Lots and lots of challenges. All right, so one is absence of organized data sets. Next 10 minutes left. Okay, great. So we're in time, awesome. Absence of organized data sets, but this is awesome, yeah. In the US, there's just so many of them, so you get started quicker, but over here, it's a beautiful moat to build out. You wanna pull your hair, but it's a beautiful moat to build out, so I'm fine with that. Absence of labeled data. Can you imagine we're trying to learn, we're trying to machine learn the correlations on something where we have no data set, not training set, we just have to make those mistakes ourselves, then learn from them, then not make too many of them, but make enough diversified mistakes, control our portfolio risk, our experimentation end to end, personas, that's crazy, right? So the absence of labeled data makes it a very interesting problem, which is exactly why the first approaches have to be hypothesis-driven, right? Okay, we, okay, here's the something. 
So our average loan tenure is 10 months, but we still need to lend more every month, right? Even before we hit the 10-month mark, and that's when we're only gonna know about the first month's cohort. So I'm slightly worried about what might happen a few months from now. I mean, it makes our jobs harder, right? But you have to live with that, and you have to become even more creative and hopefully smarter to deal with that.
Okay, this one: all data collected is probabilistic. So what you're saying might or might not be true, or might be true to some extent. The way we treat it depends on the time at which we collected it, the source it's coming from, and the data type. So the probability of correctness of your name from Aadhaar is much higher than that of your name from voter ID. Voter ID spelling mistakes, oh, and we all know that, there are so many nods in the audience, right? Okay, and lots and lots of examples like that.
It's also temporal. So when we had to implement multiple loan functionality, we had a very big debate for a good week on what belongs to the loan, what belongs to the user, what could have changed, what could not have changed, because we were trying to be nice and not re-ask things. But come on, you could have gotten married. You know, even if it's just three months, you could have gotten married. What do I do with that, right? Date of birth should not change, ideally. Name, again, could have changed, at least for one half of the population, right? How many of you thought about that? I had to bring that case in. Okay, so it's very interesting, you know, and those boundaries become more and more blurry, especially when it comes to documents and so on and so forth.
And lastly, there are ever-evolving data sets, right? What do I mean by that? For the first time ever, we're treating a loan application like a product design problem. What does that mean? For every new persona I see, I react, right? Oh, an Uber driver came in.
Okay, well, this one went through this application, the base case one, but now I have three more ideas for what I should ask the next ones. I add it. A student came in, an accountant came in, someone else came in, a delivery boy came in, a self-employed person came in, so on and so forth. So because my application keeps evolving, it actually makes my life on the data analysis side even harder. And both are a necessity. I need to continue to learn what to ask and how much, and obviously we go overboard sometimes, and then we cut it back down, and then we ask more. So the application keeps changing, the personas keep changing, evolving, and then of course we still have to analyze it at the end of the day, right? So it's just, yeah, fun, fun times.
All right, very quickly, the importance of the feedback loop. So mostly, you know, this is how ML usually works, from what I've seen: you ingest, you train, you predict. This is what I would love for all of us to do, at least. We're trying very hard to do it, mostly because of our challenges, right? So we have to act. So at first we actually have manual feedback going in. And there's feedback from multiple sources. I'm gonna go into that.
Automatic feedback is, you know, a delay in any payment, 1, 2, 10 days, whatever; one day's delay and it has not come back. Failure to pick up the automated reminder call. Failure to pay after seeing the notification and opening the app, versus not seeing the notification and not opening the app, versus seeing the notification but not opening the app, versus opening the app at least three or four times and still not paying, still not picking up calls, right? Very, very different.
Manual feedback, of course, comes from the calls, with their nuances, for which we believe in creating taxonomies, so that we continue to get that information back into the system. Because as of right now, the reason why decisions post manual intervention are different from the automatic ones is that the manual person has access to more data.
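[Editor's sketch, not from the talk.] The automatic signals listed above (days of delay, reminder call pickup, notification seen, app opens) could be turned into structured feedback records along these lines. This is a minimal illustration under stated assumptions: the names `PaymentSignal`, `Engagement`, `classify_engagement` and `feedback_record`, and the "three or more opens" threshold, are hypothetical and only loosely follow the speaker's "three or four times" remark; the talk does not describe the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Engagement(Enum):
    """Hypothetical taxonomy of how a borrower reacted to a payment reminder."""
    NOT_SEEN = "notification not seen, app not opened"
    SEEN_NOT_OPENED = "notification seen, app not opened"
    OPENED = "app opened once or twice"
    OPENED_REPEATEDLY = "app opened three or more times"


@dataclass(frozen=True)
class PaymentSignal:
    """Raw automatic signals for one due payment (all field names are assumptions)."""
    days_late: int
    notification_seen: bool
    app_opens: int
    picked_up_reminder_call: bool


def classify_engagement(signal: PaymentSignal) -> Engagement:
    # Check the strongest evidence of engagement first.
    if signal.app_opens >= 3:
        return Engagement.OPENED_REPEATEDLY
    if signal.app_opens >= 1:
        return Engagement.OPENED
    if signal.notification_seen:
        return Engagement.SEEN_NOT_OPENED
    return Engagement.NOT_SEEN


def feedback_record(signal: PaymentSignal) -> dict:
    """Flatten the signals into one labeled record that a model could later train on."""
    return {
        "late": signal.days_late > 0,
        "days_late": signal.days_late,
        "engagement": classify_engagement(signal).name,
        "missed_reminder_call": not signal.picked_up_reminder_call,
    }


# Example: two days late, saw the notification, never opened the app, ignored the call.
example = feedback_record(
    PaymentSignal(days_late=2, notification_seen=True, app_opens=0,
                  picked_up_reminder_call=False)
)
```

Keeping the engagement categories as an explicit enum mirrors the point made in the talk that these cases are "very, very different" and should not be collapsed into a single late-or-not flag.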
The analysis is about the same, right? Just access to more data. So I won't go into too much detail, but basically this is data and these are all the cool Lego blocks we built. The things which can get a feedback loop are the verification, or any past patterns or defaults, or the manual intervention on the present case, either from a data or a decision perspective, or on the past cases, again from a data or a decision perspective, and so on, right?
Very quickly, I won't go into this in too much detail, but it bears saying: obviously you don't want to be a Big Brother kind of thing. So, you know, you always have to take consent. This is lending though, so you kind of feel good about it, and they know what they give you access to, so you make it a choice, right? The more you give me, the higher your approval rate can possibly be, and you really see what people end up choosing, right?
I think here's the other one. So I know Indian fintech companies end up using this a lot. In fact, ZIP code has traditionally been the biggest definer in Indian traditional finance of, you know, this is blacklisted, we will not give a loan here. There are all these lists that you can buy and stuff. In the US and UK, places like that, these are actually not even allowed, right? So I think India does have a long way to go in terms of not even allowing things like age and gender. I mean, we all know how much discrimination there is on age and ZIP code, right? At least over here, even in the traditional sense. So I think it's interesting. Hopefully one of the things I can do, and I'll tell everybody in India this, is to start telling us not to treat everybody in the same ZIP code the same way, just so we can actually ensure their growth as opposed to ensuring they do not grow, right?
Are we excited about this now? Do we kind of get it? Why fintech, why India, why now? We went through this. Why now? Because everything is exponentially increasing in the next five years.
So it's gonna be a very exciting decade. Thank you. Any questions? I know I didn't leave much time for it, but.
I heard this very good discussion. But I just want to know: how are you implementing continuous risk assessment? If you look at the recent past in the Indian scenario, at the time of loan approval all the scores were good and the loan was approved, and later it was defaulted on. Yes. Right, and the action was too late, too little, right? So are you implementing, or thinking of, continuous risk assessment? Yes. Yes. During the whole duration? Yes. So could you just elaborate?
Well, from two perspectives at the very least, right? One is that we keep learning from the defaults and make sure that those cases are tuned down. Second is on the portfolio side: we keep rebalancing it just so we're always aware of the risk exposure. And again, it will directly affect, in real time, any future applications. So sometimes there are people who are creditworthy, but we know we've had too much portfolio exposure there.
So say, for example, I took a loan, I'm a salaried guy. Yeah. And suppose two years down the line, I lose my job. Yeah. For example. Yeah. So are you tracking that? Yes. Yeah. And through what medium? How are you tracking? Can you just give some insight?
Well, I mean, I would say we're only tracking that till the end of the loan. So if your loan is for 10 months, I track it for 10 months. If something happens two years from now, I don't really care so much. Oh, sorry. Was there another question here, Samu?
Hi. So I see that a lot of the work that you guys have done is in the verification space. But particularly on the risk assessment side, we hear a lot about social scoring. So I just wanted to hear your thoughts. Have you tried anything there? And does it really work?
Great question. So I worked at Facebook on Graph Search.
And all I will say is, the reason you see us focusing on verification is that we believe that as long as the capacity to pay is there (we've checked that from your cash flows), you're not a fraud case, and we've checked intent through other soft parameters and behaviors, it'll mostly come back. And so on the social scoring side, while we have at least 150-ish variables, I mean, it's very easy to increase the number of variables, the correlations are not going to come in. So for a lot of the other things, the only thing I can advise is that data cannot be backfilled later, which means take as much of it as possible. And then perhaps two years from now, we see something, but right now, we just take it and sit with it. We keep looking at it every month: is there something interesting occurring, just from an exploratory perspective? But nothing has been put into this thing yet.
Although, one interesting thing from WeBank. So I went to China recently, and the WeChat folks have so much information, of course, on social networks that they were trying to say: even though people think they can fraudulently create social profiles, if you actually analyze the types of graphs, you start seeing a difference between the ones you can create fraudulently and the ones you just cannot create. Real dynamics are not that easy to replicate completely.
Hello. Hi, here behind. So this is a really interesting talk. The amount of profiling that you guys are doing is insane, right? I mean, it's just crazy, the number of variables that you're taking to decide all these things. So I want to know, how are you measuring success? Compared to traditional means of giving loans, with this crazy amount of stalking that you're doing, what success are you seeing?
So are you discounting? No, no, if my traditional system said that this guy is good. Makes sense. So that's a great question, awesome question. I think that is the benchmark.
Where we find it interesting is that 80% of even our portfolio has no credit score. So it's very hard to benchmark against traditional, because they wouldn't have gotten a loan in the traditional sense, right? So it's sort of like: pray to God, a lot. Track, like a hawk, their first payment, their second payment, their third payment. By the fourth payment, even if the tenure is about 10 or 11 months, you are resting easy. I think that's what there is.
On the credit score side, that's where benchmarking actually can be done. We do have a lot of people in the 610, 650, 680 credit score range, who would find it very, very hard to get a loan from a traditional lender. And again, the reason we've given it to those particular people, not every person with a 610 but those particular people, is that we understood the reason why the traditional credit bureau had given them a low score. And we believe that reason did not affect, or completely tell us about, whether they could repay or not. So we believe they were being unfairly docked, which is why we gave it. And then, again, you just track. You hold your breath and you track.
All right. Thank you so much. Thank you so much. Yeah.