 Hello and welcome to this session. This is Professor Farhad and this session will look at Benford's Law and Fraud Detection Illustrated and Explained in an Excel Sheet Computation. This topic could be covered in data analytics or data analytics and accounting and an auditing course or or managerial accounting where managers are looking for anomalies in the data. As always I would like to remind you to connect with me on LinkedIn if you haven't done so. YouTube is where you would need to subscribe. I have 1,700 plus accounting, auditing, finance and tax lectures. This is a list of all the courses that I cover including many CPA questions. If you like my lectures, please like them. It doesn't cost you anything. Share them. Subscribe to the channel. If it's helping you, it means it might help other people share the wealth. Connect with me on Instagram and on my website farhadlectures.com. You will find additional resources if you are trying to supplement your accounting education or study for your CPA exam. I can help you add 10 to 15 points if you use my material properly, CMA exam, enrolled agent or any other accounting course that you are taking. So the idea of this Benford's Law came from an electrical engineer that was working at G at that time, Frank Benford. And what he noticed that when he was using the lock tables basically before calculators, if you wanted to perform computation, you had to use the slot table. What he noticed, he noticed that the first few pages of those slot tables were used more. Basically, they were more used up because they were more touched by people. So he noticed why that anomaly that the first few pages are used more than page 8, page 9, so on and so forth. So he went on and he studied this phenomenon, I guess. What he found out that enlarged randomly produced sets of natural numbers, there is an expectation of the first leading digit. What does that mean? According to this law, more numbers in the population of numbers start with one than any other digit followed by two and three and so on. So what he found out is that when you have a set of data that's randomly produced, what's going to happen is you're going to have most of the numbers. It's going to start with the digit one, then two, then three. And we're going to see the distribution shortly on the next slide. And the test as hypothesis or the test as idea, he used it on 20,000 observation that include demographics from New York, New York, scientific finding, as well as numbers appearing in the Reader's Diget magazine to confirm what he was observing. And basically what he observed is that the first digit one appears third, expected to appear 30% in the numbers, the number two, 17.61, number three. I mean, there are the mathematical computation for this. We're not going to use it. I'm not a mathematician. I can't profess. I don't even totally understand it. I'm just going to accept it and work with it. So this is what we're going to be working with. And when do we use this Benford's law and accounting or fraud deduction or internal control, it's used on sets of numbers that are results from mathematical combination, like account receivable, account spable where you take the quantity purchase or the quantity sold times the price, because you're not really looking at what numbers you're looking at the combination, like it's being produced as a result of two numbers. Transaction level data, disbursement, same thing, sales, expenses, accounts that appear to conform. Like for example, accounting numbers, there should have a trend in accounting numbers or with the mean of the sets of numbers greater than the median, and it's skewed to the positive, which is most accounting numbers. The mean, you'll have a lot of transaction around the price, and you'll have some transaction that are more than the average. So it's skewed to the positive. When it's not likely used, so when don't you use this Benford's law where it doesn't give you any appropriate results, is when the data or when the numbers are assigned by someone, like check numbers. The check numbers, they are not randomly produced. They are assigned one, two, three, four. There's no meaning to them. Like you don't want to also analyze zip code. Zip code has numbers, but it doesn't mean that 30% of them are going to start with the number one. Number that are influenced basically by human thought, like prices at a psychological threshold, somebody decided that price to be $1.99 or ATM transaction where you have a minimum of $20 or a combination of $20 or any account with the built-in min and max. Like if you're looking at the height of the individuals, let's assume you are looking at centimeters. Most heights are either up to two, most people are between 100 to 200 centimeters. Well, you have people, very few people at 300, but notice you cannot use Benford's law because there's not a lot of trees, not a lot of force, so on and so forth. So Benford's law will not be used in those situations. So to illustrate the concept, let's take a look at an actual accounting data and run the Benford's law to illustrate our concept. To illustrate this concept, we're going to be using this sets of data with order number and sales, and here's the order numbers, and we have 200 order numbers. We could have 2000, 3000, it doesn't matter, and here are the amount for each order number. Now the first thing I want to extract from this data is the first digit, and each number, what's the first digit? Well, to do so, well, there is this left function in Excel, so you click on function, left, you click on OK, and basically what you're asking, what you're asking Excel to do is to look on the cell to the left and determine what's the first character in that number, in that number. So 129.33, it's 1. Therefore, I'm going to click OK, and notice now it pulled 1. Now all what I have to do is take this number, take this and scroll all the way down to my data, and now what I did is I extracted the first digit from all the transaction. So that's the first step. Excellent. Now what I want to know is I want to know how many times the number 1 appeared in my sales transaction, how many times the number 2 appeared as a first digit in the transaction, so on and so forth. So I'm going to have a column here called first digit, and I'm going to have to, and the first digit is 1, 2, all the way till 9, so let me go all the way till 9, because, you know, it's all the way till 9 obviously, and now I'm going to have to do the count. I have to count how many 1s I have, how many 2s I have, how many 3s, so on and so forth. So let me go ahead and do the count. So I'm going to have a column called count, and there's a function that can count this for me. I'm going to click on count, if, click on OK, and it's going to ask me the range. What are you counting? I'm going to be counting this column here, all the numbers in this column, notice it selected them 1, 2, 3, 4, all of them, okay? I'm selecting here column C, and that criteria is just simply put any, any, the criteria is the digits here. So I'm going to name them either 1, 2, based on these digits. So E5, I'm going to click on OK. Now it gave me 46 1s, I'm going to scroll down, and it's going to pull everything. I have 200 transaction, you always want to sum to make sure you selected everything. Let's see if I have the 200 transaction. Let me see, let's sum it, let's sum everything, sum the count, let's not pulling the sum, let me just do it this way, sum these, and I have 200 transaction in need. So notice I was able to collect, for example, the transaction that started with the number 1 is 46, 46 transactions, so on and so forth. That's excellent. Now I need to know how many times that the number 1 appeared as a percent of the total. So now I'm going to compute the percentage of how many times the number 1 appeared, percentage, basically I'm going to take 46 divided by 200. I want to make sure 200 is fixed, or relative to everything else I'm going to be pulling. So put the dollar sign, sorry, so 46 divided by the total, and around F put the dollar sign to make sure F14 is fixed, 20.23, I'm going to pull this, and make sure I add them up, and they should add up to 1 or 100, and they add up to 1. So now I computed the percentage, that's excellent. Well, now I need to find out how do they stand up versus the Benford's law. So I'm looking at Benford's law, Benford's law's number, how do they match? Now I can just go to the PowerPoint slides, and copy the numbers, or there is a log function in Excel, and again you don't have to worry about this, just know that you could either do it this way, or go through the log functions. I'm just going to show you for the sake of illustration how to pull the log numbers, log of 10, which is the Benford's number, to this. So basically you put log of 10, and let me just let me just do it manually, you open, and it's 1 divided by 1 plus 1. Okay, so it's 30, 30.1, approximately 30.1, or 30, 30 percent, that's fine. Now I'm going to take this, and go all the way down. Okay, so those are Benford's number, those are Benford's number. Now what I need to do, I need to compare, I need to see the difference. So what is the difference? I can just obviously make these percentages, or make them easier, either or, it doesn't matter, three digits. So I'm going to find the difference between them. So I'm going to take this number minus this number, and I'm going to scroll all the way down to find the difference. So now let's look at the difference in a graph, because you want the difference to be as close to zero, if it's going to be complying with the Benford's law. So let's go ahead and graph the percentage and Benford numbers, just kind of to take a look at it from a graph perspective. So highlight those two columns, go to insert, and let's insert a insert combo chart, and let's select this one, yeah this one looks good, except that the digit one, we have less of the digit one, we're expecting 30, around 30, we have around 24, for digit two, whoops, the digit two, we had more digit two than expected, and the rest are in compliance, not in compliance, well as expected, except also six, six, we have more sales in the digit six than when we expect, so two and six. Now what we can do at this point, if this is fraud investigation, we can start from there. For example, what I can do is I can pull all the transactions that are with the digit two and six, so to do so I'm going to do and insert the pivot table, so this is the digit, and what I'm going to do, I'm going to insert the pivot table for sales and digits, I'm going to go to insert pivot table, and if you don't know how to do a pivot table, please view my pivot table example, so I'm selecting the range, those B and C column, create this in a new worksheet, and I want the sales and the digit, so here they're giving me the sales number, which is, you know, it's the sales number, I like to create percentages, and notice as a percentages, so does the sales that start with the digit two represent 13.63, way more than number one, which is because it's 200 dollars, but also six represents 16.62. Now what can I do with this? I can click on this, and this is all the sales that start with that digit. Now if I have, obviously the companies will have more data, now what you can do, you might have the name of the sales individual, maybe the salespeople, and if you have this, if you have the sales number, if you have the sales name, the salespersons, maybe if there's this same person appearing at $600, or the majority of the time it's appearing at $600, now you want to take a look at this, maybe this individual is inflating his or her sales to get more commission, that could be a possibility, I'm saying it, I'm not saying it is, but this is how you would start, or the same thing, if you went to the, if you wanted the 200 sales, and you find out that it's disproportionately represented by one individual, then you might say, why this individual is selling mostly at 200, are they inflating their sales? That could be the case, we don't know, but at least you have a starting point that you have, according to Benford's law, you have too many sales with the digit 2 and six, that's all that we can say for now, nothing more than that based on this data. Now we could also do a chi-square test to see what's the significance of the digit 1, so on and so forth, but we're not going to go into this any further. As always, I would like to remind you to connect with me, and if you are an accounting student or a CPA candidate, or you like more about Excel, please visit my website, farhatlectures.com, study hard, stay safe, especially during those coronavirus days. Good luck.