A score can be built for a specific purpose, one that is particular to you, or it can be generic across the whole spectrum. Typically a credit score is generic: anything you get from FICO or Trusting Social, or any time you buy a credit score from a vendor, is meant to be general purpose and to work across all of that vendor's customers. Internally you'll typically build a score on top of that, plus other things, and use it for your own lending. So if I'm running my own startup, doing my own lending, I would probably purchase a score from you and do something on top of that. You would probably purchase something from CIBIL and then look at a few other things that are more special purpose for you.

So irrespective of whatever I do, the credit score becomes an input to my own score?

Yes, typically you'll want to do a little bit more than just use the score without any modification.

Okay.

Here's an example. A consumer unsecured loan is when you lend money and hope the customer pays it back; if they don't, you're out of luck. Then there are secured loans, for instance a home mortgage. People put a lot more effort into paying those off than consumer unsecured loans, because if they don't, you can eventually repossess the house and they have to find a new place to live. In the U.S., medical loans are some of the first ones that are written off, so people will typically try to...

I want to put this in the context of Australia, because in Australia student loans are also a big policy issue. In Australia, the government runs student loans, and the money comes from government funding.
Basically, you don't need to repay the loan if your income after graduation is less than, let's say, $50,000 a year. Student loans in that case are a bit different because they are managed by the government. The ATO tracks your income anyway, so they know how much you earn per year, and they make you pay it back through the tax system.

The reason I asked about student loans is that there's no such concept of guaranteed student loans in India, right? There are startups that try to sell loans to students, and they get into all kinds of trouble with defaults. One of the things they keep asking is: these people have no history, no jobs, no assets, nothing. How do you even give them a loan? Would alternate models work, or is it just, I'm not going to give you a loan?

Well, you still have some predictor of what earnings are going to be.

Okay.

So there are two questions: will the person earn enough to pay it back, and are they likely to do so? Keep in mind that any loan you issue is across a pool. Some of them are going to go delinquent, and you just have to figure out the interest rate to charge the remainder so that you make up whatever you lost. What you could do here is, for instance: a bunch of IIT grads are probably going to make pretty good money; Pune University, probably a bit less.

Okay.

Similarly, there are regional differences in repayment rates. Some cities and some states have much lower repayment rates than others. So if you happen to be in one of those areas, you would probably charge a higher interest rate or just not do business there. And you'd probably observe patterns across colleges as well once you actually have some historical data.

Do you see these kinds of problems in Vietnam?
The reason student lending was important for me is that, A, there is not much prior data, right? And there is a distinct unbanked community in some of the developing countries. So do you see more use of alternate data sources for them compared to others?

In Vietnam, our business does not focus on any particular customer group or segment, and we haven't looked specifically at student loans in that context.

Then let's move to the second part. Do you have any other questions on credit scores before we move to the alternate data sources?

I have a question, yeah.

Please, fast.

How standard is the 300 to 900 range? And I've always been curious: is the credit score sort of linear, or is it something like normally distributed? Would a large percentage fall in the middle? I hope you got my question.

Yeah, yeah. The only place I've seen data on this is FICO, and even that is typically only for specific use cases. But for certain use cases with FICO, it will be not quite normal, but a bit fatter version of normal. As for the 300 to 900, you could easily make it 0 to 1. They just think 300 to 900 is easier for consumers to understand. That's just marketing. And it's typically a smeared-out normal. As far as the applicants go, where that normal is centered is to a great extent based on your marketing, on who you're marketing to. Put it this way: if your marketing is targeting a bunch of poor people who lose their jobs a lot and just don't feel like paying the bill sometimes, it'll be a normal closer to 300. And if you are targeting a bunch of rich yuppies in Greenwich, Connecticut, it'll be pretty close to 800.

Do you have your own predictors for what you have? Do you find the CIBIL score still has predictive power?
Even when you use your own signals?

We have actually found traditional credit scores don't add very much to what we already have. In the past it added a little bit, but it wasn't worth the extra engineering effort to keep it. At this point, we don't use it at all.

Whatever credit score you come up with is limited to whatever data sources you have.

Yes.

So there are a lot more data sources outside your control.

Correct.

How do you work on this?

Any data sources we don't have in our system, we obviously can't use. But the thing about our data that makes it well suited for our purpose is that it's typically based on things like transaction history at merchants where Simpl is available. We've experimented with alternate data sources that are very disparate from this. With a lot of them, we get a lot of data on people who never even use Zomato and therefore never see the Simpl button, who never use BookMyShow, never use Dunzo. If you're not on Zomato, Dunzo, BigBasket, or one of the other merchants Simpl is on, it doesn't matter how great or how bad we think you are: you never see the button and you never have the opportunity to use it.

Not all of this data is credit data. BigBasket, BookMyShow, whatever data you use, is that credit data?

Yeah, multiple means is enough.

Credit data, so you still rely on them to come up in the scope of credibility?

We actually use very little credit data. We've used almost entirely other things. Some of it is financial in nature but not credit. But the point is that our data is also tuned to our use case. Every time we experiment with other data sources, where we just go out and get data on people separately, it doesn't help us unless we're getting data on people we know are also customers of merchants that Simpl is on.
We don't even know if it would work or not, because those people don't wind up using Simpl and therefore we don't know.

Let's say you don't have anyone who has a payment history with Simpl. Then we have to accept the money we lose on some of the bad customers at the start as a cost of getting the data. We have to invest in that to obtain the data. And it's the same situation with Trusting Social as well.

So the cold start problem can't be solved without spending a lot of money early on?

It doesn't have to be a lot necessarily, but some money.

Some, yes. But compared to the whole running cost, that is still acceptable.

You should not push them too hard on their trade secrets, but generics are okay.

It's not that we are not willing to share with the community. But if we say something, then people who want to do that kind of thing, they will...

There are people out there who really want to steal our money, so we have to be a little quiet about these things.

Yeah, okay, fair enough. So let's come back to the hard one, which is: do you employ collection agents?

We have our own people who call up and say, please pay your bill. And you get some spam messages, but we don't send anyone to your house to take your furniture if you don't pay.

Okay.

Although I think some of the merchants we're going live with involve rentals, so they may wind up taking your furniture if you don't pay.

Okay.

Or your TV or whatever it is they rent.

Okay.

I don't know exactly how they work if you don't pay the bill there, but it's all in the lives.

Fair enough. But you said Trusting Social used to do credit scores, right? Are you also trying to go into the model where Simpl is going, where you're putting your own money into lending?

So, as far as I can tell, there's no company that's convincingly use machine 90% of the story. I mean, there are quite a few companies that have done that.
Lending Club, Klarna, Afterpay, WePay: those are some billion-dollar companies that do this. I'm not 100% sure the WePay lending line is bigger than a billion, but it's close. There are several more in Latin America whose names I don't remember off the top of my head. So it certainly is out there. What has also happened is that a lot of consumer unsecured lending via alternative data has started going through banks. Goldman Sachs discovered there was a ton of money in this, and now banks do it. That's what Goldman Sachs's Marcus is about. Goldman Sachs literally never ran consumer banking before. Then they discovered how much money there is to be made by doing this, and now suddenly, under a different brand, Marcus, they do it. So I would say FICO is still there because FICO works, no doubt about it. At the same time, the alternate stuff is covering use cases that FICO does not, and as far as I can tell, it is working.

What are examples of those niche cases where people rely on their own model, and it works just because they have specific signals, like in your case, for your use case? What I'm struggling with is that you still have the FICO paradigm. As you said, broadly it works, and there are people relying on it for general, generic credit needs. Why haven't we seen a more advanced way of credit scoring that has become its own paradigm, and not just for niche use cases like Lending Club, et cetera?

I'm not sure exactly what you mean by niche. I mean, these are double the difference in the market now.

By niche I mean specific use cases that are not generic.

So I think your question is why we can't have a generic alternative credit score that works on all kinds of problems, right? Is that your question? I think this goes back to the root of machine learning.
It's unlike a person. If you hire a person to assess loans and approve them or not, that person can work on almost any loan you bring, right? A machine learning model works differently. It needs training data, and normally the model only works on new data that is quite similar to its training data. So if your training data is, let's say, on one kind of loan, it may not work on cash loans, for example. That's why we have to build particular models for different kinds of problems.

I would also suggest: imagine FICO had not come first. There would still be a need for a thing like that. Suppose there are machine learning models currently used for alternative cases that are just as good as FICO. If FICO came first, probably no one is going to switch unless the alternative has some dramatic advantage. So there's a sort of first-mover advantage. If FICO is there and you're as good as FICO, I'm just going to stick with FICO, except in those cases where FICO doesn't work. So that plays a role as well. And, like I said, FICO is actually pretty good. It's hard to beat when you have that level of data; it's much easier to beat when the data is missing. That might also be why the alternative stuff is only used when FICO is unavailable. But also, to squeeze out basis points, many banks are in fact incorporating alternate data on top of FICO and traditional credit histories.

Another thing that is a little bit tricky is that in the U.S. there are significant regulatory risks to doing this. Specifically, there are laws related to fairness in lending. I don't know how many people here are familiar with America's racist history.
The thing is, if you were to plot delinquency rate versus FICO score, and you do this for each individual race, so you have Asian people, black people, white people, et cetera, you discover significant differences between these curves. That is signal you could use. Now, a machine learning model might accidentally learn this signal, which is actually illegal to use. And in fact, they do. My first foray into lending was a project where I was just trying to teach these guys how to do numerics in Scala. But then I noticed, wait, I can make this model a whole bunch better. And I'm like, look, I made the accuracy go up by like 8%. Look how much more money you could make. And they're like, yeah, the regulators will never let us do this.

What, you were using it at this point?

No, I was using some location stuff and a couple of other things that happened to be correlated with it. It was basically a machine learning model that just happened to pick up proportions of ethnicity in certain regions and a couple of other clues. Indirectly, that was still an issue.

That's interesting. So you were proxying ethnicity.

I mean, I had a very nonparametric model. I didn't know what it was predicting. I just noticed, look, there are these patterns, and this thing seems to detect them. It's like you train a neural network and suddenly it picks out ethnicity from locations and which stores people shop in. Certainly not with anything like 100% accuracy, but 70% is enough to squeeze out more money. So they told me the regulators would just not allow this at all. It's a no-go. That also prevents some of this from happening, at least in the US. I don't know what happens in Europe.

So could you say more about, and without, sorry, I was just one, I was just, I don't know.
So if you only lend to people with good FICO scores, you lose less money, but they pay you less. To get that revenue, you need to lend to people with lower FICO scores as well. So how do you optimize interest? How do you really optimize that?

Ultimately, if you're running a good loan portfolio, you shouldn't be making more money off one cohort than another. Let's say I'm lending to one cohort with a 5% delinquency rate. I have to charge about 5.3% interest just to break even. Factoring in the cost of running an operation and making a profit, I'm probably charging 7 or 8%, and that gives me about a 3% profit margin, minus the actual expenses of hiring people in collections and so on. Now I want to lend to a group with a 10% delinquency rate. I will probably have to charge 12 or 13% interest, and I'm still making that 3%.

Based on the...

Yes. So ultimately that holds unless a segment is underserved and you can charge excess interest rates, which is what's currently happening in consumer unsecured in the U.S. and Europe. But as more people enter the market, they offer lower interest rates to compete, and it'll get back to the same as everything else. Anyone else?

Yeah.

There are actually fewer academic studies on this than you would expect, and they're mostly interested in something else; the validation is incidental. But at the same time, all the lenders that use this and don't go under are kind of the skin-in-the-game test of whether it actually works. If I'm wrong, I just lost money. This is one of the things I really like about working at Simpl.
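The break-even arithmetic in the pricing answer above can be sketched in a few lines. This is a single-period simplification, not something the speakers described as their method: it assumes a delinquent loan returns nothing (no recovery) and ignores operating costs and the time value of money. The function name `required_rate` and the `target_margin` parameter are illustrative.

```python
def required_rate(delinquency_rate: float, target_margin: float = 0.0) -> float:
    """Interest rate r such that (1 - d) * (1 + r) = 1 + m.

    Assumes (hypothetically) one repayment period and zero recovery
    on delinquent loans; d is the delinquency rate, m the target margin.
    """
    if not 0.0 <= delinquency_rate < 1.0:
        raise ValueError("delinquency_rate must be in [0, 1)")
    return (1.0 + target_margin) / (1.0 - delinquency_rate) - 1.0


for d in (0.05, 0.10):
    print(f"delinquency {d:.0%}: break-even {required_rate(d):.1%}, "
          f"with 3% margin {required_rate(d, 0.03):.1%}")
```

Under these assumptions a 5% delinquency rate gives a break-even rate of about 5.3%, which matches the figure quoted. The rates needed for a 3% margin come out somewhat higher than the rough numbers quoted in the discussion, which is unsurprising for such a crude model.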
I can ship code, and if it doesn't work, damn, I just lost a bunch of money, and if it works, the company made money. Either way, the proof is in the pudding.

Yeah, so most validation data sets will include this.

So I don't know exactly what you mean by a strict A/B test. What is a strict A/B test here?

Oh yeah, I'm doing that right now. I do that all the time, just to see.

You're using transaction history, let's say a large amount of data from the use of this system. Now, a user's recent history might be more positive, but because your model is using the old history, and perhaps not weighting it very differently, you might not be extending credit to some users who are now in a better situation. They might have a lower probability of getting it, even though they are probably earning better now. Is that a situation you might be in?

Now we're kind of getting into things we can't say that much about.

But you adjust the score based on, when you evaluate the score, can I say you do that?

Sorry, what do you think?

That's a very high-level version of what he asked. If you compute a score for a person at a point in time, do you re-evaluate it based on their pattern?

Oh yeah, we re-evaluate almost in real time.

Just one more question before we move to financial inclusion. In terms of affordability and credit assessment, do you see those as different items, or do you have affordability as part of credit risk assessment? I just wanted to get your thoughts on affordability for someone who has a very high rate on his loan, or do you handle that through the loan amount?

At Simpl, we're not lending enough money for this to really matter. We're mostly your movie tickets and your sponge. So it's probably affordable to you. Do you have anything to add from what you're doing?
Even though we are not in micro-lending at the moment the way Simpl is, our product is still small compared to, let's say, a car loan or a home mortgage. So affordability is not a very big deal here, I think.

But I know that people who do things like car loans and home mortgages very much look at debt to income, stability of income, that kind of thing. If you earn a lakh a month and you want to buy a five-crore house, no matter how good your credit score is, they're just going to say no.

Okay. So let's move to the second part, which is financial inclusion. You have done a lot of work on the alternate stuff because, as you said, there is a large unbanked community in Southeast Asia. Can you explain a bit more about that?

As far as I remember, there are about 1.7 billion people who don't have any banking history or credit history. For those people, a FICO score doesn't work, right? So what we do is use alternative sources of data. Instead of those people being rejected by the bank, or being given a loan at a very, very high interest rate, we distinguish the good customers from the bad ones, and we can help the bank give a better loan to those customers. That's how an alternative credit score works. We are not trying to replace FICO, as Chris said. FICO still works very well in certain applications, let's say car loans or home loans. But for smaller loans, a few thousand dollars for example, our aim is to help certain groups of people who don't have any banking history.

Okay, so would it be more accurate to say that the target segment of the population that does have FICO scores and the one you are trying to serve are complementary, or exclusive?

I think it's complementary, because what FICO does is give you a score anyway.
If you don't have any banking history, they still give you a score, but it's normally very, very low, and the bank will not approve your loan, or will give you a loan at a very high interest rate, because they don't know who is a good customer and who is a bad one in that group.

Okay. We have a lot of usury laws in India, which prohibit lending beyond a certain percentage. Do you have such laws in Vietnam, and if so, could you share?

I'm not sure about the western countries, but I think most Asian countries face the same problem. To put it in the context of Vietnam: there are certain groups of people who are running short of money all the time, and they need some cash flow to keep their business going. The banks are not going to give them a loan, because the banks do not have enough information to assess their creditworthiness. So they have to go to what I'd call a kind of black market, where they borrow from people who are willing to lend without any property or other collateral to secure the loan. But the interest rate is very high. Let's say they charge you 3% or 5% per day, something like that. And if you don't pay that interest, it becomes part of the principal the next day. So the problem is very serious: people get trapped in debt and can't pay back the loan. The Vietnamese government is trying to address this kind of issue, so they want to push the banks to lend to those people, but they don't yet have a clear policy on how that will be implemented.

So what alternative sources of data do you use for these people?
I think this is getting into a realm where I can't reveal much, but at Trusting Social we use telco data and other sources of data like social media, anything we can get about the customer online. We also have our own platform, our own app. If you want a loan from our company, you install the app on your phone, we get your consent to collect a few items of data from your phone usage, and we use that to predict your score.

How reliable are the predictors? Better than zero, or?

Yeah. In terms of performance, in the markets we are working in so far, the banks are giving us quite good feedback, at least better than what they get if they don't have our score.

So you have a feedback loop about loan performance with your lending institutions?

Yes, we have to run back tests dynamically with the banks to assess the performance of our credit scores.

So when we talked about the machine learning cold start problem, you solve that by getting data initially and being prepared to write off a certain amount of losses. Once you get past that problem and you start using all the data sources, then for the model to perform better you need feedback from the underwriting institutions about how the loans have performed. That's how you get better?

Yeah, it's kind of a loop. First we do the cold start and obtain some training data. Then, based on our prediction model, loans are given to certain customers, we obtain feedback in terms of loan performance, and we use that as extra data to improve our model. In the current business model, because we partner with the FI, we just act as a provider of the score, and the financial institution decides whether to approve the loan or not.
But on the new platform that we just launched in Indonesia, we will decide whether to approve the loan, because that is our own lending platform. Any other questions? Keep asking.

Let's say we have data sources one to ten. The problem is that for some customers we have sources one to three, for some we have five, six, seven, and for some we have two, eight, nine. How do you do combined decisioning, or how do you build a model, given that you can't build a model for every combination? How do you use all the different alternate data?

I think we now have some models that are quite robust in dealing with missing data. In that kind of situation, I think we can still use all ten data sources, and for customers who are missing certain sources, we accept that as a missing value. We have to cover that case in our training data.

You can also, and this is just a general machine learning tip: let's say you build a model and it is dominated by one factor when that factor is available. What you can do is delete that factor, retrain the model, get something else with lower accuracy, and then build an ensemble of the two models. This is typically a way to handle it. You use one model when the factor is available and the other when it isn't. You still have to evaluate the thing as a whole: find its ROC AUC, its precision and recall, whatever. The point is that when you build this combined model, you still have to evaluate it the same way you would evaluate a single model. But that is typically the way you would handle situations like this. Another thing is that there are imputers of various sorts in your favorite machine learning library: you know, replace the NaN with the average of the values. You can always do that kind of thing as well, depending on how important it is.
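The tip above (drop the dominant factor, retrain, then combine the two models) and the mean-imputation alternative can be sketched as follows. This is a minimal illustration on synthetic data, not either company's implementation: the tiny logistic regression, the feature names `dominant` and `secondary`, and the routing function `predict_combined` are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, epochs=300, lr=0.5):
    """Tiny logistic regression fitted by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Synthetic data: "dominant" stands in for a strong source (hypothetical),
# "secondary" for a weaker one. Label 1 means the customer went delinquent.
random.seed(0)
rows, labels = [], []
for _ in range(300):
    dominant, secondary = random.gauss(0, 1), random.gauss(0, 1)
    latent = -2.0 * dominant - 0.5 * secondary + random.gauss(0, 0.5)
    rows.append([dominant, secondary])
    labels.append(1 if latent > 0 else 0)

full_model = fit_logistic(rows, labels)                        # both factors
fallback_model = fit_logistic([[r[1]] for r in rows], labels)  # dominant factor deleted


def predict_combined(dominant, secondary):
    """Use the full model when the dominant factor is present, else the fallback."""
    if dominant is not None:
        return predict(full_model, [dominant, secondary])
    return predict(fallback_model, [secondary])


print(predict_combined(1.5, 0.0))    # scored by the full model
print(predict_combined(None, 0.0))   # scored by the fallback model
print(impute_mean([1.0, None, 3.0]))
```

As the speaker notes, the combined predictor should then be evaluated as a single model (ROC AUC, precision, recall) on held-out customers that include both the complete and the missing-data cases; that step is left out here to keep the sketch short.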
There are a number of things you can predict. FICO is basically attempting to predict being delinquent on at least one loan after 90 days, but there are other models. That's what I was describing earlier: depending on the specific lending you're doing, people will pay off their house first and their medical bills last, because they don't want to lose their house, whereas the hospital can't repossess their repaired heart. Typically these things get input into the model, and the adjustments you make to, let's say, FICO are often based on exactly things like this. Additionally, you can attempt to predict, let's say, recovery after 180 days, and if that is non-zero you multiply it by the time value of your money. All these games go into actually pricing a loan, and people who are making much bigger and longer-term loans than I am will be much better at it than I am.

No, it depends on the kind of loan. In the U.S., if you're talking about mortgage lending, there are other risks. There's interest rate risk, which is to say that you have, let's say, a 30-year loan at a certain interest rate, so it has a certain value, but if interest rates go up, that reduces the value of your loan relative to the treasury bills you might buy today. Another thing: the U.S. has a really weird structure for home loans. They come with an embedded option that lets you pay the loan off all at once if you want. What can happen is that if interest rates drop significantly, the borrower takes out a new loan to pay off the old one, and suddenly they're paying a lower interest rate and you get nothing. That's called prepayment risk, and it is another major factor in home lending in the U.S.
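A rough numerical sketch of the two mortgage risks just described. The numbers are illustrative, and the model is deliberately simplified: it assumes a flat market rate, values the loan at origination, and assumes the borrower refinances immediately and at no cost whenever rates fall. The function names are hypothetical.

```python
def monthly_payment(principal, annual_rate, years):
    """Level monthly payment on a fully amortizing fixed-rate loan."""
    r, n = annual_rate / 12, years * 12
    return principal * r / (1 - (1 + r) ** -n)


def present_value(payment, annual_discount_rate, years):
    """Value today of years * 12 monthly payments at a given market rate."""
    r, n = annual_discount_rate / 12, years * 12
    return payment * (1 - (1 + r) ** -n) / r


principal, contract_rate, years = 100_000, 0.06, 30  # illustrative numbers
payment = monthly_payment(principal, contract_rate, years)

for market_rate in (0.04, 0.06, 0.08):
    value_if_held = present_value(payment, market_rate, years)
    # With a free prepayment option, the borrower refinances when rates fall,
    # so at origination the lender receives at most the principal back.
    value_with_prepayment = min(value_if_held, principal)
    print(f"market {market_rate:.0%}: held to term {value_if_held:,.0f}, "
          f"with prepayment option {value_with_prepayment:,.0f}")
```

When the market rate rises above the contract rate, the loan is worth less than its principal (interest rate risk). When the market rate falls, the loan would be worth more, but the prepayment option caps what the lender actually collects (prepayment risk).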
I don't believe it works quite the same way here.

Prepayment works here.

Prepayment works? Yeah, so you get the embedded option. But durations are also typically not 30 years, right?

Yeah, 30 is unheard of, but 20 or 25 we usually give. There is also a charge: if you prepay within a certain period, there's a penalty. That's how they normally are.

So the penalty is probably carefully calculated, and it's basically what you would lose. In the U.S. you don't actually have those penalties. I thought there were also balloons here, right?

Yeah. So the other question I have for you is: what kind of inputs do you use? Is it a single factor that has predictive value, like location, or do you need a lot of factors to come together before you can say this is good enough?

Again, I can't reveal much, but from our data we can extract up to 10,000 features, and we run a machine learning process to reduce them to about 100 or 200 factors that are most useful. If you ask whether we can say which factor will affect the outcome of the credit scoring, I don't think I can give you an answer, because a machine learning model sometimes works as a black box. There are many factors inside that interact with each other, so we can't give you a clear answer on which factor leads to the result.

Okay. The reason I asked is that when you start off, you start with heuristics and keep improving upon them, and at some point your heuristics either still work or they don't.

In our situation, a prediction with very good performance is sometimes not a good choice, because in that case you can run into overfitting to your training data. So we try to balance prediction performance against performance on the real loans we are making with the bank.

Any other questions on alternative
social models?

We run into situations where, even though you don't see the test data while training, you tune the model and get better performance on the test data, and you unconsciously tune the model so that it overfits to your test data, even though you never feed the test data into the training process. That's why I'm saying we can run into overfitting even to the test data. We have to figure out how to balance that. I can't reveal much, but we have to run back tests very frequently.

You don't know in advance what your features are; you use some approach to find them. But as you get more data, if you rebuild the model completely, your features will change. So what approach do you take? Are you okay with shifting your parameters, or do you do it incrementally? And how do you rebuild the entire model?

I think we have to accept that if we train a model and then use it for a long time, it's not going to keep working, because our data changes, and the way customers behave and the way the data evolves will change too. So we have to change the model: train it, fit it again, and do that regularly, probably every six months. We have to balance how much you cover against the performance of your model.

We train much more frequently. The Simpl cycle is that you pay the bill twice a month, so we train about that frequently. One other thing I would mention is that, at least for us, there is an adversarial nature to this as well. First of all, there's the ordinary underwriting we do to try to determine whether you personally, this real person sitting in front of me, are likely to pay your bill, whether you're a good credit risk. And separately there is the fact that you're not actually a person sitting in front of me; you're a person
out there in the world sending signals to my server somewhere. You might be shocked to discover there are people out there who will just attempt to do this over and over again: take out a thousand bucks, take out another thousand bucks, try to get a bunch of free food, that kind of thing. So another aspect of all of this is that, whatever we're building, we also have to be careful that some human won't be able to come up with an adversarial attack on it, and then, oh, where did that lakh go? For the what? So it's some combination of ROC AUC... we also just focus on the num... I'll take a lower ROC AUC if I can also get accurate calibration. And also, since it directly affects our growth, just finding how many people are actually in the good set. So I'll lower my ROC AUC if I just get more people that are definitely less than 10 percent or 20 percent risk. Yes, it's all in house. What is the size of the data I work with? I don't think we've publicized that. So we're in a weird regulatory position, but I don't know the full details of it. We are not an NBFC. We are technically lending off another NBFC's book, but I believe it's not exactly a loan. I'm not a lawyer and I don't know those specific legal details, so sorry. Just India right now. The same questions for you. So Trusting Social started our first service in Vietnam, and we extended to Indonesia, and now we are expanding to India, and there are a few other countries that we are heading to. As I said, our whole point is using alternative sources of data to do credit scoring. The way we do it is we use social interaction data as the alternative source of data, and when we use that to do the credit score, that is something like the trustworthiness of the person, right? So that's why we named the company Trusting Social. But by social you mean Facebook, WhatsApp, that kind of thing, or any other data? Any other data that is considered social.
Let's say your friends, or anything like that, can be called social, right? It's not only social media. Sorry, can you repeat the question? So at the moment we consider each loan application separately, so we assess the credit score when they make the application. Because we do not provide a lending platform at the moment, we are not in the business of increasing the limit; the bank or the financial institution will do that. Do you have such policies in mind? I think that applies to Simpl. Okay. We do give credit bumps, typically if you pay your bill on time regularly and you also look like you don't have enough credit. It's a 15-day cycle, so if you run out of credit a lot on day five, probably a bump is in your future, assuming our model also predicts you're likely to pay the bill after you get the bump. So it's model based? It's model based in predicting whether you are likely to pay the bill. I mean, I can call it a model, but there are two questions. First of all, if I give you a bump, are you going to pay the bill? And the second bit is, do you even need it? So if you have a 5,000 rupee limit and you're spending 600 bucks a month, why would I give you a bump? That's just risk, and you're not going to actually transact, so why bother? So that is a really simple model. It's basically just, are you really spending anything close to your limit? And if you are, sure, we'll give you a small bump, and if you spend close to the new limit you might get another one. What documentation do you expect to extend a credit line? Nothing. Okay, you just sign up. We'll ask you for your email address once you install the app, and that kind of thing. And then what is the typical credit line? Typical credit lines can be a few thousand bucks. Okay, so it's not differentiated based on credibility? Is it the same for everybody? No.
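To make the bump rule described above concrete, here is a minimal Python sketch. It is not Simpl's actual code: the `CycleStats` fields, the `should_bump` function, and every threshold (80% utilization, three on-time cycles, a 0.9 repayment probability) are assumptions chosen for illustration. Only the two questions it encodes come from the discussion: will you pay the bill after the bump, and do you even need it?

```python
from dataclasses import dataclass


@dataclass
class CycleStats:
    """Hypothetical per-customer summary for one billing cycle."""
    credit_limit: float        # current limit, in rupees
    spend_this_cycle: float    # amount spent during the cycle, in rupees
    on_time_streak: int        # consecutive cycles paid on time
    p_repay_after_bump: float  # assumed output of a repayment model, in [0, 1]


def should_bump(stats: CycleStats,
                min_utilization: float = 0.8,
                min_streak: int = 3,
                min_p_repay: float = 0.9) -> bool:
    """Return True if the customer qualifies for a small limit increase.

    All three thresholds are illustrative assumptions, not real policy.
    """
    if stats.credit_limit <= 0:
        return False
    # "Do you even need it?": is spending anywhere close to the limit?
    utilization = stats.spend_this_cycle / stats.credit_limit
    needs_it = utilization >= min_utilization
    # "Are you going to pay the bill?": payment history plus model prediction.
    pays_regularly = stats.on_time_streak >= min_streak
    likely_to_repay = stats.p_repay_after_bump >= min_p_repay
    return needs_it and pays_regularly and likely_to_repay


# The example from the discussion: a 5,000 rupee limit with 600 spent.
light_user = CycleStats(5000, 600, 6, 0.97)
# A customer who keeps hitting the limit and pays on time.
heavy_user = CycleStats(5000, 4800, 6, 0.97)

print(should_bump(light_user))  # False
print(should_bump(heavy_user))  # True
```

The light user is refused even with a perfect payment record, because a higher limit there adds risk without adding transactions. Re-running the same check against the new limit gives the "you might get another one" behaviour.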
It'll basically... we're sort of a longer-term relationship, so we'll start you off probably not very high, and then as you pay your bill and as you start to hit your limit or come close to it, we'll start giving you bumps. Honestly, I'm the wrong person to answer about the specific regulations. I know we have lawyers, and every so often they say you can't actually do that, and I don't do it. So I think I can't actually name any specific model that we're using. We try a lot of models, and we have to figure out which model works in which situation, and we can't name that specifically. If it's a black box, do you know what's happening? It's just giving us a... I think in our case interpretability is not so important. As long as our model works, we accept that. Yeah, I would tend to concur. ROC AUC is something I can turn into money. If the ROC AUC is not high enough, we lose money. Interpretability helps me debug the code, but on the other hand the interpretable models are rarely going to perform as well as the complex black boxes. So we don't provide exact reasons, so we probably can't do business in Europe. If not, for example, let's say going forward you want to lend only for, let's say, food... So to answer this question, I think I can refer to an interview with Geoff Hinton, who can be considered the father of deep learning. In a recent interview he said that if you can explain a model, then you won't actually need it, because it's so simple. If it's that simple, then you don't need a machine learning model. I'll also describe a little bit about how these things are often built in practice. So there are lots of pieces that are interpretable. For instance, there's sort of a whole library of things we've seen fraudsters do, and here's some code to detect when someone is doing that. So most of these are sort of interpretable, because we know what the guy's trying to do and we know
how to spot it, and it's relatively straightforward procedural code that basically looks for that specific pattern and fits a few things, perhaps. Okay. Then these things feed into another model that is much less interpretable. So there are very often interpretable pieces of a model that then feed into a big black box that mixes everything together. So the interpretable bit might say, I think there's a 40% chance that this guy is doing this fraudish thing. And then there might be another piece, also fairly interpretable, that says, I think there's also a 10% chance he's doing this other thing, but I'm not really sure. And then the big meta model also says, and also he's from a suspicious area, and I wouldn't block him on the basis of any of these things by itself, but putting it all together it tips him over the limit. So this is sort of how a lot of things are engineered in practice. And then typically most large models will have various sub-models that each deal with one piece of the data. Like, we have one thing that just deals with email addresses. Some email addresses are more suspicious than others, and that's kind of interpretable. Still a bit of a black box, but at least we know what goes in and what comes out. This model is about this part of the data, this other model is about this other part of the data, and then they all feed into a much bigger thing at the end of the day. So if the government enforces some regulation that requires you to explain the reason why you reject a loan, for example, then I think sometimes we have to make up the reason, and you know, it's not a real reason that we understand, but we have to make one up to answer that. Specific examples of made-up reasons? Okay, right, so I have an example from 2009, when I basically had a dispute with Citibank over a credit card score, and one of the hard problems that they had to explain to the government
as part of the dispute was that they had basically linked some other person's identity with my credit score. Okay. So the point is, in the court I argued around it. They basically said it's a severe technical error, and it went on for about half an hour, and at the end the judge said, you know what, I don't think you guys understand what you're doing, right? And so I'm going to award the case to him in five minutes if you are not able to explain it. Right, so it's in India; it's my personal case, which I argued before the judge. Okay. That was in 2009, and the side effect of that thing was that for three years I couldn't open any credit card or bank account, because until that time my credit score was locked as due, and stuff like that. So, right, in some form, but I got the money back. I got the money back, and I probably also made them pay half of my home loan for the mental distress, and all stuff like that. So it got resolved. So the key thing about interpretability is that it's really not important for the FinTech companies to do it unless and until there's a law which says thou shalt do it. Okay, and I think that is the part that is important. So that's why he's saying he can't do business in GDPR EU areas. Yeah, or put it this way: if we did business in a GDPR area, we would just have to charge much higher interest rates to deal with lower accuracy. Right, so that's the price of interpretability, and that's why you're seeing this, and there's not much you can do about it. It's either this or higher interest rates. Yes. Or similarly, the same trade-off is there for privacy. If we take no data about you, we have to charge you the average of you and all the fraudsters that are also saying "I want my privacy", and that might be a lot. I mean, maybe he faked it, I don't know. I think that is too specific, and it's too rare to be considered. I think it depends on the law, but probably 12 months or 24 months, something like that. But if you ask me about those questions, I think I am not the right
person to answer, because I don't know much about the business, how the business is going. So that's why you need that? No, I think the identity is very important, or not important, as long as they meet the goals, which is, I think you start by saying this is the maximum amount I will lose, or the maximum I will gain, or something around that. Yeah, we typically focus on how much money we are likely to lose, and as long as you lose much less than that, I think you don't really care. Put it this way: the business guys always ask these questions and ask if we could give something. They don't use the word interpretability specifically, that's a technical term, but they want to know more about what's happening. I'm like, look, I can either give you less accuracy and we'll lose more money, or you can just sort of accept it, or you can read this paper on gradient boosting or neural networks or whatever. Ultimately, at that point, that's the trade-off, and you explain that to them, and they never read the paper on autoencoders. I mean, it's simple; I'm mostly also the one running that knob. So I think our model works at a bit of a higher level. We're not going into that specifically. We are not talking about 50 or 100 different models, right? We're talking about five or ten different models for different products, but we're not building models for, let's say, predicting the repayment after 12 months or 24 months. We don't go into that specifically. But I can tell you that people who do issue longer-duration loans do worry about these things, but mainly I just know that because my wife builds graphical tools for them to actually understand this. So for instance there is a thing called credit score drift. Your credit score is predicting your delinquency over the next 90 days on some loan, but also your credit score might just go down or up over your lifetime, and in a larger pool of loans it'll go both ways for different
people. So there's typically sort of a random walk model to describe how that happens. There are random walks, and you also look at what happened in the past and you project. And typically what you'll do is come up with a nightmare scenario, a good scenario, standard drift, standard drift plus more, and then you evaluate what comes out differently in each of these cases. That's basically the part about your credit score. But my question is, there are two ways you can pose this. You can ask, okay, if I give this guy some money, will he pay it back within a certain duration? You'll have something in mind, like say 180 days or 6 months or 12 months, or you just ask whether he has a probability of paying it back over a large enough duration. The other way can be that you fix the loan amount and try to predict at what time the user will be able to pay it back, given certain historical data. Which one to use? Is that something which is... Well, those are the same thing. It just depends, when you draw the graph, on whether you're seeing where the horizontal line slices it or where the vertical line slices it. This would be easier with a whiteboard. Okay. But they're basically the same thing; it's just a matter of whether you're looking at a horizontal line or a vertical line in the same graph, and if we had a whiteboard it would be easier to explain that. Okay, next up. So here, if you want, you can just take it, but keep asking the questions. I'm wondering if they share any transaction data. So typically what they'll share with us will be more like their gold, their platinum, their best customers, because what Simpl primarily is, is this convenience product: you push a button and your lunch is on the way; you push a button and then you have your movie ticket. No OTP, no filling a wallet, nothing like that. So
typically what they want is for the customers who use their products a lot, who will make five or six transactions in 15 days, to be on Simpl, to remove the friction and maybe even make it go from six transactions to seven. So they'll basically say, this guy doesn't have a kitchen, he just orders his lunch all the time, and then they'll just pass that to us. Yeah. If you go to a movie once every six months and use BookMyShow, and that's all the e-commerce you do, Simpl is not a useful product for you. Just don't even bother. Whereas, for instance, we're on some cafeteria merchants. So it's a corporate cafeteria, and you might go there for lunch and chai and sometimes dinner. That's a great Simpl use case. You don't have to deal with reloading your wallet when you just want your lunch to come. You just press the button. Okay, you take your tray. Do you do the alternate stuff even for companies and corporations, or is it only for individuals? Sorry? The alternate credit score models: do you do them for companies or private corporations, or only for individuals? So at the moment I think we focus on individuals. Okay. I can tell you a big player actually doing it for companies. In the SMB area it's Stripe, because if you think about it, Stripe knows a great deal about your cash flow, and they also have a great marketing platform. So a thing Stripe does, it's part of their business now, is small business lending. Stripe knows your cash flow; they have a rough idea of what's going on. So when they decide you are eligible, you log into Stripe and they're like, do you need it? You have cash flow issues? Click here for a business equity loan, subject to these terms, collectible against your Stripe payments. So I also read about Ant Financial in China. The way they do it is they use data from their platforms, Taobao for example, and besides Taobao there are the other e-commerce platforms of Alibaba. So
it includes both small businesses and individuals. So in that case they know how well your business is going: how many transactions you have done in the past one or two years, who your customers are, and things like that. And if they know that you are doing well with your business, they can approve your small loan, like a few thousand US dollars. For cash flow, you also need to know about the expense side of things for a company. So Stripe doesn't know everything, but they also know some of it, because if you're paying from the account that Stripe knows about to another Stripe account... They don't know everything, okay? At the very least they know a chunk of your revenues. And I'm sure they buy data as well. Yeah. Okay, the thing that I know exists, I don't know exactly how they do it, but I do know they have this extra data point that no one else has. At the moment in our product we haven't used it, but that is one of the directions that we are working on. I'll also suggest that, generally speaking, the people you learn the most from are the marginal ones. So if you have a score, let's say one to ten, where higher is worse, and you find a good risk cutoff is three, probably you want to approve a nice little holdout set at four and a much smaller holdout set at five. Probably going all the way to nine is just a waste. And then what'll happen is, you do this for a little bit, you discover it's actually pretty safe to go to four, you repeat it at five and six, and you do this until it blows up. Okay, six thirty. We have another five minutes or so; you can keep asking questions. I would like to avoid fraudsters, yes. I'm trying to prevent them from getting... beyond that, I can't say much. All I can describe is the mindset. With a typical machine learning problem, if you're trying to differentiate cats from dogs, you really just
focus on accuracy. If you are trying to deal with an intelligent adversary, you also have to think: in this thing I built, if I know how it works, can I hack it? So it's a mindset you have to get into, and sometimes you'll discover, yeah, this seems to work great, but someone could scam me this way, so I'm just not going to switch it on, or I'll switch it on but in a limited form, and the minute someone figures it out, it's off. But it's a mindset you have to get into: half computer security, half machine learning. Do you employ adversaries? Yeah, we try to hack ourselves. We have a bug bounty, so if you figure out how to scam Simpl, we'll pay you. You also get an automatic job interview, and we'll try to recruit you if you can figure out how to do it. Yeah, there's a kid out in Jaipur who got some really nice bug bounties for sitting at home playing with us. And one thing I want to add about this is the way we consider machine learning. Instead of considering machine learning as a fully automated system, we'd better combine whatever machine learning does well with whatever humans do well. So we have to find a balance between the two. And also, you have to be really creative when you're trying to think this stuff up. Like, here's an attack someone might make: they order groceries. If they've figured out sort of a way into the system, they order more groceries than any human could consume, and they're turning around and selling them at a discount. This is a thing we actually discovered someone doing. But if they can do it over and over again, it's just amazing how many groceries are going to this one guy's house. So these kinds of things happen, and yeah, if you're just like, oh yeah, I made the numbers go up, and you haven't even thought about exactly how someone could scam this, you're going to get scammed. And the other thing is, the numbers are going to look great, great, great, and then: where did all that go? Yeah,
that's the black swan problem. It is not such a black swan; it's just one asshole in... Yeah, that's the black swan. We are actually sort of building... So we track it internally; we keep them out. We're not actually part of CIBIL, so it doesn't get reported to CIBIL, but there are also some alternative reporting systems that are basically getting started, so other alternative lenders may also not like you in the future. What about you, how do you track and manage fraud? If we are talking about the current business model, then that is basically part of the responsibility of the bank. Of course we take part in that as well, but at the moment we have to work with the bank on how to figure out that fraud. Do you actually help the bank in figuring out the fraudsters? Yeah, of course. As we said, we want to get rid of the fraudsters, right? So before we provide the credit score to the bank, if our fraud detection model predicts that these guys are fraudsters, we just reject them. Okay. Yeah, the fact that you don't have to offer an explanation makes it easier. Yes, I think that's why it's easier for us to work in Asian countries. I think we have to find a good way to work in Europe or the US. Okay. Additionally, a lot of fraud signals are not going to be anything close to perfect. There are a lot of things that are a very strong prediction: yeah, this guy has a 50-50 chance of being a fraudster. Now obviously 50-50 is not a credit risk you can take, particularly when the guy's going to just keep doing it, and he'll make your entire portfolio go to 50-50. But at the same time, that's certainly not going to hold up in a court of law, that this guy has a 50 percent chance of being a fraudster. Okay. And if you don't block those people, the whole network shuts down. Yeah. Okay, any other questions? Last questions? Two questions? All right, we're almost out of time. Thank you, everyone. Right, I hope you
caught most of it in 90 minutes. Yeah, thank you. Thank you. You can meet and talk to them; they're still available for whatever time it is. Yeah. But yeah... oh god, this is definitely not the one that I want to put