Hello and welcome to this session, in which we will look a little further into step one of the plan for audit data analytics (ADA). In the prior session, we looked at the five steps in planning, performing, and evaluating the results of an ADA: one, plan the ADA; two, access and prepare the data; three, consider the relevance and reliability of the data; four, perform the ADA; and five, evaluate the results. If you remember, I said in the prior session that we would talk a little more about planning the ADA, because in planning we have to know the purpose or objective of the ADA. There are four of them: risk assessment, tests of controls, substantive testing, and evaluating the conclusion. In this session I'm going to discuss risk assessment specifically, which is basically where you learn about the entity and its environment. So I'm staying within step one, and within step one I'm covering only one point.
It's important to understand that this topic is covered on the CPA exam as well as in your auditing course. A typical auditing course may spend anywhere from 15 minutes to one hour on this whole topic. I'm going to spend a little more than an hour, because I think you need to understand it in more detail. Don't shortchange yourself: you are probably not comfortable with this topic because you don't work with it on a day-to-day basis. So for you to understand it, I have to explain it in some detail, give you examples, and go slowly rather than just review it with you. Your CPA review course can review it with you, and that's why I invite you to take a look at my website, farhatlectures.com. Whether you are a student or a CPA candidate, I can help you understand this topic better.
Your risk is one month of subscription; your potential gain is understanding this topic and doing well on the exam. This topic is relatively new, and if it's new and relevant, it's testable, so you cannot take any chances with it, especially since data is an important topic whether it's on BEC or auditing. If nothing else, take a look at my website to find out how well or how poorly your university is doing on the CPA exam. This is a list of the other courses I cover, including auditing, intermediate accounting, governmental, advanced taxation, and so on. My CPA supplemental resources are aligned with your CPA review course, whether it's Becker, Roger, Gleim, or Wiley, and I'll give you access to the AICPA previously released questions with detailed solutions. If you haven't connected with me on LinkedIn, please do so. Take a look at my LinkedIn recommendations, like this recording, share it with others, and connect with me on Instagram, Facebook, Twitter, and Reddit.
So let's take a look at this first purpose, risk assessment. When we plan the ADA, we have to determine for what purpose we are performing the audit data analytics, and one of those purposes is risk assessment. So what is risk assessment? Let's review; you should already know what it is. It means obtaining an understanding of the entity and its environment, and that includes the entity's internal control. The purpose is to identify and assess the risks of material misstatement, whether those risks are due to error or fraud. And we have to assess them at the financial statement level as well as the assertion level.
Simply put, risk assessment is taking a look at the whole company, understanding its internal control, and assessing whether there is any risk of material misstatement, whether due to error or fraud, at the financial statement level as well as the assertion level. Now, how does ADA fit into this? ADA is an effective way to do it. And it's not only effective; it also helps us identify specific transactions that are unusual and account balances that could be misstated, and identify and assess possible risks of fraud. So now we're basically combining ADA and risk assessment.
The following techniques can be used for risk assessment using technology: one, clustering; two, matching; three, statistical analysis; four, visualization. Again, every time Professor Farhat has a list, it means what? I'm going to go over each item on the list separately and explain what's involved. The software that can be used includes Microsoft Excel, ACL (Audit Command Language), IDEA, R, Tableau, and Python, and there are many other tools that can do this. I'm not going to cover the software itself, but as an accounting student you should be familiar with all of them, or at least know what they are. Python and R require programming; Tableau and ACL should not; and obviously we should all be familiar with Microsoft Excel. If you're not, take a course in Microsoft Excel.
So let's take a look at the first technique that helps us identify notable items, which is basically assessing the risk: clustering. Clustering is a data science term; it comes from the data science field.
Basically, we group transactions or balances based on a particular characteristic, or on multiple characteristics, and try to identify some sort of pattern. For example, we group similar customers together. What are we going to see? If we group them together, we might identify customers that are taking longer than usual to pay. That might be one customer within a group, or it could be a whole group of customers. Or we could look at customer logins. What time do they log in? That can tell us about spending habits: for example, customers log in between 8 p.m. and 10 p.m., and that's when most of our sales occur. Through clustering, we learn more about our customers. Another example is inventory that is not selling within the expected timeframe. Why not? We need to know what that inventory is and group it together. Or customers with an unusually high profit margin. Why do they have a high profit margin? Are we making an error? Maybe we are not recording all the related expenses for those customers, or we might be overstating the revenue from them. The high-profit-margin customers are a cluster, a group. The group might be unusual by itself, or we might notice something unusual within the group.
And this is where the term notable items comes in. We saw this term in the previous session, and I told you we would come back and discuss it further later. Later is now. Notable items are what we are looking for as we perform these ADAs during risk assessment. Clustering also has a graphical aspect to it, and this picture is basically what we mean by clustering.
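To make the idea concrete, here is a minimal sketch in Python of grouping customers by how long they take to pay. The lecture does not name a specific algorithm; k-means is used here only as one common clustering method, and the customer names, the days-to-pay figures, and the `kmeans_1d` helper are all made up for illustration.

```python
from statistics import mean

def kmeans_1d(values, k, iterations=20):
    """Tiny 1-D k-means (k >= 2). Returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members (leave it alone if empty).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels, centers

# Hypothetical data: average days each customer takes to pay.
days_to_pay = {"A": 28, "B": 31, "C": 30, "D": 33, "E": 29, "F": 95, "G": 88}

labels, centers = kmeans_1d(list(days_to_pay.values()), k=2)
slow_cluster = centers.index(max(centers))  # cluster with the longest average time to pay
slow_payers = [name for name, lab in zip(days_to_pay, labels) if lab == slow_cluster]
print(slow_payers)
```

With this made-up data, customers F and G end up in their own cluster. That cluster is the kind of group an auditor would treat as a candidate notable item and ask why it exists.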
In the picture, this is customer group one, customer group two, and customer group three, whatever those groups are. Maybe they are grouped by how much they buy: for example, these customers purchase on average between 500 and 1,000 per month, these between 1,000 and 2,000, and so on. That's clustering.
The next tool we have for risk assessment is matching. What is matching? We match the characteristics of two populations to see if anything overlaps. The auditor uses audit data analytics to search for a key characteristic that may exist in two or more databases. So we have one database here and another database here, and we want to see if something in this database matches something in that one. Graphically, it looks like this: two databases with something in common between them. A match can be perfectly usual and not a problem at all. What we're looking for is unexpected, unusual matching.
What do we mean by unusual matching? For example, we can match vendor addresses to employee addresses. What does that mean? In our database we have payroll, and payroll has each employee's name and address. On our vendor list we have each vendor's name and address. We want to see if any vendor address matches an employee address. Why? Because normally our employees should not also be our suppliers. They could be; I'm not saying it can never happen. An employee might own another company, and your company might purchase from it. That's not necessarily a problem, but we want to know. Here we're looking for the unusual; we're looking for fraud. We can also match employee bank account numbers to vendor bank account numbers.
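The vendor-to-employee address match described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names and addresses are invented, and the `normalize` helper is a hypothetical cleanup step so that trivial differences in punctuation or capitalization do not hide a match.

```python
def normalize(address):
    """Lowercase, drop punctuation, and collapse repeated spaces."""
    cleaned = "".join(ch for ch in address.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

# Hypothetical payroll and vendor master files.
employees = [
    {"name": "Dana Reed", "address": "12 Oak St., Springfield"},
    {"name": "Luis Ortega", "address": "400 Pine Ave, Riverton"},
]
vendors = [
    {"name": "Acme Supply", "address": "77 Industrial Rd, Riverton"},
    {"name": "DR Consulting LLC", "address": "12 oak st  springfield"},
]

# Index employees by normalized address (if two employees share an address,
# this simple version keeps only the last one).
employee_by_address = {normalize(e["address"]): e["name"] for e in employees}

# A vendor whose address also appears in payroll is a notable item to follow up.
matches = [
    (v["name"], employee_by_address[normalize(v["address"])])
    for v in vendors
    if normalize(v["address"]) in employee_by_address
]
print(matches)
```

Here the only match is the vendor DR Consulting LLC with the employee Dana Reed. A match is not proof of fraud; it only tells the auditor where to ask questions. The same kind of join works on any key the two files share.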
Take the bank account match. We know each employee's bank account number from payroll, and we want to match all of our payroll bank account numbers against all of our vendor bank account numbers. An employee might use a P.O. box for their business but keep the same bank account number for personal and business use, so we do that match. We can also match employee Social Security numbers to our 1099 recipients. Again, they may have a different address and a different bank account number, but if they are getting a 1099 from us, we can match on the Social Security number. If there's a match, we know that one of our employees is also a vendor. Once again, it may not be unusual, but we need to know. We can also search for the same last name within payroll. Why? Think about it: a company might have thousands of employees, and some of them will legitimately share a last name, but the risk is that someone put their spouse on the payroll. We don't know, so why not search and find the duplicates if we can? Strictly speaking, this is searching for duplicates rather than matching, but it is part of the same matching process. Another example: a government agency has access to the unemployment benefit records and to the government payroll, and it can match Social Security numbers to see if any government employee is also collecting unemployment benefits.
So matching is another ADA tool for assessing risk, and with matching we are mostly looking for fraud. That's not its only use, and you can apply it in many different ways, but you need to know what matching is.
The third tool we can use in risk assessment with ADA is statistical analysis. There are many statistical analyses we can use and many software packages that run them. Specifically, we're going to look at two.
One is descriptive statistics, and we should all be familiar with descriptive statistics; Excel can run them for us. By the way, statistical analysis is also covered in BEC, so you will need this information for the BEC exam too. With descriptive statistics, we look for balances or transactions that are, for example, more than three standard deviations from the mean. Those are the outliers, and the question is: why are they outliers?
The second is regression. There are many types of regression; usually you're responsible for simple regression. I have recordings about regression in my BEC unit if you want to be familiar with regression overall. Basically, regression is a prediction equation that expresses an item of interest, commonly known as Y or the dependent variable, in terms of other data fields, X, the independent variables. The goal here is to identify notable items, something out of the norm. A good example is the relationship between sales and advertising: the more we advertise, the more sales we're supposed to have, so there should be some sort of predictive relationship. Another is the relationship between sales and web traffic. If you have an online store, you might see that people who are logged in between 8 and 10 p.m. account for most of your sales, and that on average every 500 visits between 8 and 10 p.m. bring in $10,000 of sales. Then we can run a predictive analysis on what's going on between 8 and 10 p.m. Or, to keep it less specific, look at the whole 24 hours: on average, for every 500 visits, we have $10,000 in sales.
Once we have this relationship, we can predict: if we have 1,500 web visits, we should have $30,000 of sales. What we're doing is drawing a relationship between two variables. We say the relationship between website visits and sales is 500 to 10,000: for every 500 web visits, you have $10,000 in sales. So if you want to predict sales, you can look at the web traffic and estimate them. You will have to make some adjustments, as we'll see in a moment.
So, the steps in regression. Step one is to develop the prediction equation based on historical data. This is my historical data: on average, for every 500 web visits, I should have approximately $10,000 in sales. Step two, I go back and examine business conditions and my knowledge of the client and the industry, and maybe I modify step one. What does that mean? During COVID, I would expect more sales per visit, because people are buying more online, so those 500 visitors are going to buy more. If we're going through a recession, I expect lower sales per visit. Step three, you validate the prediction equation from a statistical point of view. Simply put, you make sure you have the data and that the data is comparable from a statistical perspective; otherwise, you go back and modify your prediction. Step four, run the regression on the actual data being audited this period, then compare the predicted value, which is around $10,000 in sales for every 500 visits, to what you actually found, and see whether the relationship still holds. If there are any unacceptable variations, you treat them as notable items. You either investigate further or conclude that they are false positives.
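The regression steps above can be sketched in Python as follows. Everything here is hypothetical: the historical and current-period figures are invented to sit near the lecture's ratio of about $10,000 per 500 visits, and the tolerance of three standard deviations of the historical residuals is an assumption chosen for illustration, not a rule from the lecture. Steps two and three are judgment calls and are only noted in a comment.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Step 1: build the prediction equation from (hypothetical) historical data.
hist_visits = [500, 750, 1000, 1250, 1500]
hist_sales = [10100, 14900, 20100, 24900, 30000]
slope, intercept = fit_line(hist_visits, hist_sales)

# Steps 2 and 3 (adjusting for business conditions, validating the equation)
# are auditor judgment and are not modeled here.

# Illustrative tolerance (an assumption): 3 standard deviations of the
# historical residuals.
residuals = [y - (intercept + slope * x) for x, y in zip(hist_visits, hist_sales)]
tolerance = 3 * stdev(residuals)

# Step 4: compare this period's actual sales with the predicted sales.
current = [("week 1", 500, 10150), ("week 2", 1500, 29800), ("week 3", 500, 12000)]
notable = [
    (label, round(actual - (intercept + slope * visits)))
    for label, visits, actual in current
    if abs(actual - (intercept + slope * visits)) > tolerance
]
print(notable)
```

With these made-up numbers, only week 3 is flagged: 500 visits produced $12,000 of sales, well above the prediction. Whether that is a real problem or a false positive is something the code cannot decide; the auditor still has to find the explanation.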
Why would you consider something a false positive? Because there's an explanation for it. For example, one wealthy customer came in and bought $10,000 worth of merchandise all by themselves. It just happens to be one person, out of nowhere. That's a false positive. As long as you can verify that that's what happened, you can say it's a false positive and no further investigation is needed. But if you see that sales per 500 visits are showing as $12,000 on average, and that doesn't make sense given current business conditions because everybody's online sales are going down, then you have to investigate further. That is statistical analysis.
The fourth tool is visualization. Just like statistical analysis, this topic is covered in BEC. I have a whole session of approximately 25 minutes that covers visualization in detail, so here I'm just going to review it. I hate to say review, but this is a review within a review. A visualization is a representation of a data set or key information as a chart, a bar graph, or some other image. You're looking for unusual characteristics, unusual transactions, and unusual balances, and now you're looking for them visually. The purpose of visualization is to communicate what you find effectively and efficiently. Visualizations are very easy for people to see and understand; they are produced to reveal information to people. A visualization should have some good characteristics, and these are some of them. It should make comparison between data elements an easy task: if you're looking at this quarter versus last quarter, or product A versus product B, it should be easy.
It should be widely understood, in the sense that a wide range of people can understand the picture. Remove the jargon and communicate the message in a way that reaches lower-level staff, upper-level management, customers, and so on. It should communicate the information efficiently. A picture is worth a thousand words, and that's what we mean here, but bear in mind that the picture cannot be misleading. It must convey the intended message. In my visualization chapter, I talk about how visualization can be used to mislead: it might look good and still not convey the proper message. And it has to have an impact: a good visualization is one you can remember. People are going to have a lot of data and a lot of images thrown at them, so the image you give them should be easy to remember. If it's easy to remember, it's effective.
So those are the four tools for risk assessment. As I mentioned at the beginning of this recording, I explain the material differently. Becker, Gleim, or Wiley, for example, may spend less time on a topic like this. They don't go into detail on risk assessment; they may cover it in literally a few minutes. It took me more than a few minutes, and the reason is that if you understand it better, you will remember it better, and if you remember it better, guess what? You will do better on the CPA exam. That's the purpose of my course: to help you understand it better. It takes a little more time, but invest the time. You only have to pass the exam once. Don't shortchange yourself. Take a look at my website and consider subscribing. Stay safe. The CPA is worth it, and good luck.