 All right, everybody. Hi, this is Monica. So it's Monica struggling with applications as usual. Welcome today to our live stream, Excel Pivot Tables for Quick Analysis. And I'm Monica Wahee. I'm supposedly a data scientist, but I can't seem to figure out Zoom. But what I'm really good at is helping people do portfolio projects in data science. Hence the name of the series of live streams is Portfolio Projects with Pizzazz. So today we're going to be talking about Excel Pivot Tables. And you know, I encourage you to put questions in the chat. But for now, welcome. Happy New Year. I'm glad you're here. And whoever's watching this, if you want, you're interested in resources for data science, learning resources, or just good advice, please follow our company page, our DPS company page, because that's where we're posting our events. And I'm going to be posting a lot of resources on there. The resources I have and the resources I find on the web that I think are helpful for people doing portfolio projects with Pizzazz. So let me continue. Welcome to today's lecture. I'm glad you're here. What I'm going to talk about today are pivot tables in Excel. I'm going to explain to you what they exactly are and why, kind of, what their use case is, what they're useful for. Most people like actually hate them. And so I'm going to explain the people who hate them, which is me, basically. I hate them. But they have a really good use. And I'm going to show you that use today. A few people totally love pivot tables. And I'm going to tell you about those kind of people and how weird they are. Well, actually, I'm going to tell you sad stories about those kind of people. It's better to be people who hate them and use them only for what I'm going to show you right now. So you're probably saying, why even use them at all? Well, this one use case is a perfect use case. And that is where you're just trying, there's two basic things you're trying to do. One of two. One is you're just trying to figure out what's wrong with your data set and why it's not going into R or SAS. You're troubleshooting and import. The other thing you can use it for is a quick and dirty analysis, which is what I'm going to demonstrate today. Like if you have small data, not a lot of columns, you want to do a quick descriptive analysis. Pivot tables are your friend, but they're not like your best friend. They're like your friend. So I'll show you how to use them to get to your descriptive analysis. And I encourage you to, oh, welcome, Shamim. I encourage you to download the slides here so you can get this link, because these are four links I'm giving you. The second link here, where I use a blog post for this quick analysis, we're going to look at that. But the other ones just have some support for, like this one up here talks about using a pivot table. Another pivot table for a quick analysis. This one's a little more involved. So I'm showing you the quicky one today. So I'm going to now explain what pivot tables actually are. A pivot table, and also you'll see pivot chart. So just to be clear, the word pivot means to move. Our business had to pivot when nobody was buying our product. That's what pivot means. But don't let it fool you. So pivot table, pivot chart are functions in Excel. And what they do is they allow you to apply summary operations to a relatively large dataset. So if you got a column, and it says what country people are from. And you want, can we count this column really quickly by country? Can we do like a select, like a group by in SQL? Or like Proc, Freak, and SAS? Or like a table in R? Can we do this really quickly in Excel? That's basically what a pivot table is good at. The pivot table is essentially a set of dynamic filters that you can change interactively using Excel. And it'll update as you do it. And we're going to have some fun today, because I'm going to show you how to do that. And since Excel can be used for everything, like it can be used for data storage, it can be used for reporting, it can be used for making pivot tables. The problem is that using such a cool function like the pivot table depends on the data. And Excel lets you do anything you want with the data. It doesn't let you make tables or put keys or anything. It keeps data integrity. Well, Excel is just kind of like, oh, cares, you can do whatever you want. And so because of that, if you put a pivot table on top of something that looks like that, you can run into a lot of challenges. And the thing with pivot tables is they're interactive. But they're not really dashboards. A dashboard is designed for you to interact with. You can't break a dashboard. But you can kind of break a pivot table. So it's kind of like the agony and the ecstasy of Excel. Okay. And I'll demonstrate it so you'll see what I'm talking about. So if you're saying, well, why not just you are a SAS or Python or SPSS? Why would you even bother with this pivot table if you can do or SQL? Well, you know, the answer is, yeah, if you're doing a real analysis, don't use Excel. Use R, use SAS. You're right. But what if you have a small data set in one tab of an Excel spreadsheet? So what often happens to me when I'm using R or SAS on big data? Let's say I'm analyzing a bunch of patient encounters. Well, those patient encounters might be from like 50 facilities. And let's say I just want to study some stuff about the 50 facilities. I make a facilities table and maybe it's got 10 columns in it. You know, maybe it's got the county of the facility or stuff about it. Why not just throw it in Excel and make a quick pivot table? That's what I'm saying. So yeah, with the patient counter data, of course, I'm using R and SAS, but sometimes that's kind of overkill. And so it's nice to just have the pivot table to do your work for you. And so like that's what I was saying. What if you want to do a quick analysis, see, oh, another, like if I was just doing this with the customer, let's say you download data from the web and you want to see if it's any good. It usually comes out as a CSV. Well, you can just do save as and then you get an XLSX. You can save it as an XLSX. And then in another tab, throw down a pivot table to just see if like a variable is filled in that kind of stuff. Now, that's not going to work if your data is like huge, you know, Excel will just break. But if you download something that's like got 500 rows, no problem, this will solve your problem. You know, it's basically if you just want to answer a quick question about the data. And so this is, so on the right side of the slide, I'm going to show you the best practices for actually using a pivot table in Excel to do an actual project like a real project. You know, maybe it's a small data set. You're just doing a portfolio project. You know, it's maybe nothing, you know, huge and fancy. It could be like that thing I was just talking about with facilities. But this is how you do it. And I'm going to demonstrate this is one, you get your data set and you put it on one tab in tabular format. So you just make it, you make it basically like a data set you could read into RSS. And you put it on one tab and nothing else goes on the tab that step two is you make a pivot table on another tab. And that's the only thing on that tab. Okay. Then you make the pivot table look at just the populated data from the first tab. And again, I'll show you how to do this. But basically, when you lay down a pivot table, and then you tell it what data to look at. And so we'll do that together. Now, once you set up the, so then you, then the pivot table sort of dashboard is available to you. Then I'll show you how to use that dashboard. And once you get the calculations in that dashboard you want, you copy and paste them out of the pivot table. So the pivot table needs to be disposable. Okay. So let me show you now the pivot table. So this is actually what the pivot table dashboard looks like. And this is what the, this is an example of when you make a sum. So if you go, oh, I was going to show you here's the blog post. If you go to blog post, it's the second one on there. Let's see here. You go down here. You can find that here, if you click on this, that's the data set I was using. It's called Black Friday Death Count. And it's basically a list of events that, a list of news events that show people fighting and getting hurt on Black Friday. Okay. And you can read the backstory if you want. So it's just public data and it's not very big, right? And so what I did was I put it in this Excel spreadsheet and you can download these from that webpage, all right? So this is the data set I actually, I copied and pasted it basically from that website. I added this order column just so I could count like how many rows. I put order because this is the original order of the data set. See, there's 48 here. And this year, data entered that from, and this is the description, this is the news article. And on that webpage, they had the deaths and they had the injuries. These three columns, I added myself, okay? So first, you'll see why I did this later when I show you this, is I added this category for period because I felt like there was some sort of time trend. So I added, you know, 2010 to 2014, or what was it, 2016 to 2018, there weren't any in the middle. I don't think this was the greatest data set. But anyway, then I added this report value, which you're probably like, it says one Monica, I'll show you the pivot table will show you why I did that. Then finally, I added classification, which is this, like I said, shooting, stabbing, hand to hand combat. And that's so I could make a portfolio project. So I could analyze this data and see if I could get something out of it. So as you, as I said on my, on the, when I was doing the slide presentation, data, pivot, and then other things, right? So here's the pivot table. Now notice, this is the way the pivot table comes to you. But I can screw up this pivot table right now. Okay. So one of the things you can see is that it says some of incident value roll labels, you know what this is, that hand to hand combat, this is my category. And then you know what this is, this is that category of the, the years, right? And I got it to look like that. Isn't that cool? Well, how did I do it? Right? Well, we're gonna, first, we're gonna uncheck things to see this uncheck, uncheck, uncheck. Look, it's destroyed. So that's the problem with the pivot table. Okay. Is it gets destroyed? So now when I click on it and see it's on, when I click on it, we're like, what do we do? Right? Obviously, or not obviously, you get to drag stuff. Now, this is already looking at the data I wanted to look at. So I'm gonna just show you how to use this one that's already set up. And then I'm gonna get rid of it and set it up again. Okay. So what happens if we put the, by the way, you drag these things, right? It looks like you don't, it looks like you check them off, but you don't, you drag them. So the first thing I'm gonna, I'm just gonna drag year to rows. Okay, see how it's like actually active like it was doing? So this is gonna do basically a sequel to select distinct. Okay. So now we know, okay, I'm done with this year. That's how you get rid of it. Let's do, let's do my, this period, drag that on a row. So you can see I like rows. See, it's just says distinct, but it doesn't say how many are in age. So if I want to knew that, I got to put that over here in values. So I'm gonna, notice how it says count of period. Okay. That's good because if we choose, we choose something else that can be like, for example, if we, if I do year and then I do year over here, some of year, well, we definitely do not want some of year. So we would go over here, value field setting. Oh, let's choose count. See that? All right. So one of the things I realized was I was having trouble counting like the reports, right? Like just how many were there. And so I have this here. So I just, you can just throw these away, right? So I had this incident value, I guess it was. And so if I made them all be one. So now when I put this over here, some of incident value is going to be the same as count of incident value, because it all says one. But that way I can count them. So that's a clue, right? Okay. So let's get, oh, what just happened? Let's get rid of these things. Okay. So now I'm going to show you how I made that sort of difficult one that was already here is I decided that I wanted the columns to be the time periods and the rows to be the type of like this type. So I wanted the rows to be type. So I put that under rows. And then I wanted the columns to be, let's see here was a period. That's what I called it over here. But the problem is there's nothing here. So I guess I put this here, and then I put, you could do some or you could do count, right? Like it shouldn't matter because I made them all be one, right? So now you see how I got that in there. I mean, there's other things you can do. You can, I could have just filtered in, for instance, only car associated, but that would have been only five or whatever. So the main point of this is you can see how easy it is to just look at data. Like if I throw this away, and I throw this away, and I throw this away, and I'm just curious, like, what does it say in injuries? I can look at the rows. We say there's some with zero, some with 20 injuries. What the heck is that? I hope a lot of them don't have 20 injuries. Okay, so this is some of injuries. If I choose this to count of injuries, let's see here. This is how, yeah, there were only two reports with 20 injuries, but a lot of reports with only one injury. So that's how, like, I mean, this is only like 48 or whatever rows I could probably eyeball it. But if it's like 500 rows, you can't really eyeball it. So this, you can do this. Now, I want to show you over here, see this nice analysis. Like this was number of reports per year. And then this was like number of reports using the period. You're probably like, well, where did these numbers come from? Well, here's what I did is I like went and I said, okay, like reports per year, I bring down this year, right? And then I'll do incident value, right? Like I can just count this. It's like it says some, but like I said, if I change it to count, it's going to say the same thing because I made that all be one. Okay, so now what I can do, so I'm going to show you this, I'm going to highlight this. And I'm going to do copy. And then I'm just going to add this, add just a sheet. And see how I went to the top here. I'm going to do paste. Why is it not doing it? I can do paste. It's not letting me paste. Paste options. Okay, I'm going to go back and we'll do copy. I think it's because I made that sheet. And then I'm going to do paste, okay, values, like this one, two, three, paste values. Okay, there it is. I harvested it. And so now you can go back like to the pivot table and you can change everything. You can change this to like period or you can do this, you know, which makes no sense. But here it is. Or like you're validating it or something. And then you just do copy because it's not going to stay there. And then you do, you go wherever you want it to be the upper left and you do paste special. And it just, or paste values and then it just puts the values in and then you can edit it. So that's how I got this to be here and then I made this in Excel. And then here's that same one. I've showed you this one that I just did. And it looked really meaningful. Like look at this. It's going down in a huge way. I mean, I don't think the denominator is changing much. And hand-to-hand combat is going down. Shooting is going up. It was bad news. So I read the blog, you know. The other thing I want to show you how to do is how to actually make the pivot table to begin with. Okay. So first of all, like this goes away when you're not on it. But guess what? I can get rid of this by just highlighting it and doing delete. There goes my pivot table. Oh my God, what am I going to do? So let's add it back. I'll click up here in the A1 and then insert pivot table from table range. Okay. So you get this table range. And then I click on this little thing and we're going to go back to our data here. And I'm just going to highlight the data that we really have on here, nothing else. Okay. That just keeps it the least confused. And then I do okay. And that's how I get this thing there. All right. So that's the main thing I wanted to show you today. Let me go back here. So the problem with pivot tables is they're not permanent. They like to pivot, right? And you can use them to do your dirty work for you. But you can't expect them to stay static, which is why you have to do that copy and then paste special. And you should always keep your data sets, every data sets in that tabular format, even your aggregated ones, because then you might even further aggregate them. So I was just talking about Microsoft Excel. And I was also talking about Zoom. And I was also talking about SAS. And these are all applications. And it can be actually really hard to analyze data from applications. In fact, part of why I was teaching you how to use pivot tables is because sometimes you get data from applications like medical records and you have to sort of prep it and you don't even know what you've got. In fact, that has been my experience because I used to run data warehouses. Research analysts like I started out to be, we're used to research data that is prospectively gathered, like BRFSS and NHANES data that's well documented. But if you analyze data from an application, like Zoom is collecting data as we speak, then it will be like really hard to do. That's what I have found. And because you just don't understand what's in the data, it's not like epidemiology data. So to answer that problem, I'm holding this workshop called application basics. And in January, our workshop, the theme is what does governance mean exactly? And so the objective of the workshop is to understand data sets from applications well enough to analyze them and produce results. Maybe not in Excel, pivot tables, but maybe, depending on how big the data set is. So we cover computer applications. We cover how are they designed, what kind of teams build them, how the data are stored, the terminology used in application development. And with this knowledge, you can break through communication barriers to get the answers you need to complete your analysis and be seen as an expert. So I'll give you the details. The title of the workshop is application basics, what does governments mean exactly? And I'm holding it on Saturday and Sunday, January 27th and 28th of this month. Each session starts at noon Eastern time and apologies to India and Pakistan. I'm so sorry in Japan. I know it's a middle of the night for you. And it runs about three hours. And it's on Zoom. And a normal price for, like I hold these workshops a lot, a normal price for a workshop like this is $250 to about $750. But you get a special cost, which is free. So, and again, I just want to remind you to follow the company page so you can get the resources. I'm posting them all on there. Hopefully you'll download the slides. You'll have all my contact information. So I guess nobody has any questions. Thank you so much for showing up today. I have to say to all of you, happy New Year. Thank you for watching this video, which is part of the Public Health Today to Science rebrand program. If you are interested in joining the program, please sign up for a 30 minute Zoom interview using the link in the description.