 Well, good day everybody. I'm so glad to see you today. Thank you for joining me for my little demonstration here on Quick Charts in Excel. I'm Monica Wahee. I'm a data scientist, and I'm a LinkedIn learning author, and I do a lot of teaching in data science. And so you're probably thinking, well, Excel, Excel is not very, you know, big deal for data scientists, right? Like data scientists use SAS. We use R, we use Python. So like Excel seems a little simple, doesn't it? Well, it is a little simple. And I don't want to oversell making charts in Excel, okay? But I do want to sell it because I think there are times when R and SAS is just too much. So if you have a small data set, not big data, you have small data. And in our field of data science, often where you have small data is actually about the big data. So if I have like a list of cities in my state, Massachusetts, where I have my customers from, that list of cities is maybe like 50, you know? So that's a small data set. So maybe my customer data set is huge, but my city data set is 50. So all right, so what we're going to do today is I'm going to show you how if you have a small data set, how probably it's overkill to you, SAS and R, you don't have to load it in there. Like these charts you can see on the slide, they're pretty good charts, right? They look pretty nice. Like this one looks kind of pretty here with these things. So if you need to do something quick, it's probably just fine to do it in Excel. And so that's what I'm going to show you today. Okay, so if you need to load data into a statistical program to make a chart because you have a big data set, do it, okay? Don't try to do it in Excel. If you're beginning with data science and you do not know how to use R or SAS or anything, go ahead and try it in Excel because that's what prompted me to learn R in SAS is that you get frustrated in Excel, but it's good experience. So you can kind of see what the problem is of why you need to learn these statistical programs. If you have a small data set that fits easily on an Excel spreadsheet, which I'll show you today, and when I say easily, I mean, you can fit like 100, a few hundred rows on one and with not that many columns, like maybe a small survey data set, and you might not really need to put that in R or SAS. But like I was saying, it might be part of your curation. It might be a part of reducing the cardinality, like the number of levels in a categorical variable. So it might not be your main data you're curating or you're dealing with, you might be just dealing with classifications on your main data. So this is a good way to answer quick questions. Make sure you don't know these slides because then you can get all these links, all right? It's just easier that way. All right, so here, before I do my demonstration, I just want to tell you about the considerations of making Excel charts, okay? So it's hard to do if you have a large data set, so don't even really try it, unless you feel daring. Okay? And the data need to be mostly chart ready. So one of the things that we do in SAS and R is a lot of data transformation. If you need to do that, don't do Excel, it's hard to do. I mean, if you do a little data transformation, I'll show you like what I did. You'd be like, okay, that's easy. But if it's a lot, like you need to make quartiles or something, don't try. Now there's some nuance working with Excel's automated functions to get the chart to look right. So for example, if you're an R and you want to make a beautiful chart, you can use the package ggplot2 and you can learn all the ways, all the options, and you can make this gorgeous chart, okay? But let's say you some hours on doing that, you know your chart will come out perfectly. If you use Excel the way I'm going to show you, and you make this gorgeous chart, you could lose it. Like it's not very easy to replicate it because you're doing it manually. It's like you're building it manually. You don't have any code. It could go away. So that's the main thing is that's why I call this quick charts with Excel. If you don't need something that's very fancy or very involved, then Excel's fine. If you, even if you're making like a bar chart, but you want to put like 95% confidence interval band or you want to put some error bars or do something a little complicated, then not Excel, okay? Excel's very good for communicating with your team. So like if you have a whole bunch of missings in a variable and you want to show it to your team, you can make a chart of that. You know, that's what it's kind of better for. So it takes a long time to format the chart right in Excel. So if you're going to do that, you might as well use ggplot too. So here are your best best practices for charts in Excel and then I'll get to our demonstration. So number one, put the raw data on one tab and charts on other tabs, okay? Raw data on one tab, charts on other tabs, okay? So that they're going to refer to that then use the Excel functions to get a base chart that visualizes what you want. So your order of operation should be format your data, that's step one. And then step two is kind of get the best chart you can get automatically out of their wizard before you start editing it. And I'll show you why, okay? You can make a lot of formatting changes, but well, this is why. They could be erased by Excel's automation. So if you're going to use automation with the chart, use it at the beginning and then tweak it because if you try to use some more automation and might overwrite it. And then what you can do is copy the chart as an image out of Excel and place it elsewhere. Now if you copy the chart out into like PowerPoint and you paste it as a chart, it'll be like interactive and it won't look the same way as it did in Excel, which is ironic like they're both Microsoft, right? But if you paste it as a picture, I don't know if I remember I'll show you how to do it. It's pretty simple, but it just comes out exactly the way you want it. It comes out like an image so you can manipulate it, all right? So everybody ready for your demonstration? Download. We can go to my blog post over here where I got data, this is kind of funny, but the day after Thanksgiving is called Black Friday. And in the US, I don't know if all over the world it's like this, but there's a whole bunch of sales. So there's been in the history of people going to these sales at shopping centers and getting in fights over like products. And there happens to be this online database, it's just basically a spreadsheet, online spreadsheet of news reports of people actually getting in physical fights on Black Friday. So I was like, okay, this is kind of a small data set, right? So let's look at the data set. What I did was I copied it in, right? So I put my own order here. So this is my tab data and you can download this. This is on GitHub if you get those links. So this is order. So I just called it order. So as you can see, this is not a very big data set. It's got like 48 rows, okay? And then the year of the news article, I called it like a report. That was already here and then I put the description as the headline, the count of deaths, the count of injuries. But these I actually added myself. So this is the kind of thing where it's transformation. Now I just typed this in. I didn't need like R to do this. So if you can do that, if you can put up with just making some fields that you're going to need for graphing, then just do that, which is what I did. So what I actually did was I created three periods of the dates because I thought that might be revealing. But again, you know, you have to do the visualization. Then I've got this report value. There's a different event I did when I was demonstrating pivot tables and I still haven't made a video of it, but I'm going to make the video. If you signed up to come to that event, I'll email you when I have the video and you can see what that was like. And that's basically how I aggregated that data was using pivot tables. But you don't need to know that part from what we're going to do today, which is where I got it to count. Basically the pivot tables is how you get Excel to count for you your data, right? And so what I did was I got it to count my data in the pivot table and then I pasted it here. Now, if you hate pivot tables, you could count this up yourself. You could say, okay, how many are in 2010? How many reports were in 2011? And then I made this time series graph here in Excel, which I'll show you how to do in a minute. Okay, but I actually didn't see, if you look at this time series graph, it just looks like this zigzag. It's like really not telling me anything, right? So when I did it by period, like here, I just did this, I can see a pattern. Like you can see that clearly the number of events is going down. Now I realize there's a denominator on here, but these are all like in the US and our population really didn't change that much. So there's something like going on here, okay? And so this is a bar chart. And then I'll show you here, here is sort of this more fancy, you know, stack bar chart because what I did over here was I classified them as a type of incidence, right? So we could look at that over here. Now I'm going to show you how I generated these charts. So first, I got the data there. Now let's first talk about this time series chart. Now you have to have like the year or whatever time period. And then you have to have this information. You could do it across too. But then now I'm going to get rid of this report and try to remember what it looks like. It's got these years across the bottom, the frequency here, it doesn't have any labels on it, just says number of reports. I'm going to get rid of it. Okay, I just deleted it. Now I'm going to show you how to make it. So we're going to actually go to insert here. So see this? This is the most confusing thing in the world. Okay, don't do anything. Instead, highlight your data. So we're going to start by actually what I'm going to do is start by highlighting this data. Let's see if this works. Sometimes it freaks out over the years and it doesn't like the years. So I'm going to start by actually let me try this. Let me highlight this. So I'm highlighting the data. It's already in the right shape. I'm not highlighting order or anything I don't want in there. And then see this time series chart? I'm going to do like this. Now I got to choose which one I want. Now this is where if you want automation, pick it now, right? We're going to start with an easy chart. So I'm going to just pick this one. Okay, that doesn't even look right. Okay, notice how this does not look right. It's because they're trying to graph the year as a quantity. So let's just not do that. Instead, I'm just going to highlight this, okay? Because I'm faking it out. So now I'm going to go back here and do insert chart here. And now I'm getting the right zig okay. But the problem is I don't have a year here. I have this. So now we're going to go to this magic thing called right click, select data. Okay. This is a panel that you can get every time you've already put down a chart object. And it'll help you configure the automatic referrals to your data. So as we can see here, it says horizontal category axis one, two, three, four, five. They just put that there because we didn't tell it. So I'm going to do edit here in this axis label range, I'm going to do put this over here. And then I'll click okay. Okay, there. Yeah, it's nice. Now the series doesn't have a name. I could add the name. Okay. I could just type something, but instead I'm going to just do this here and do okay. Okay, now I'm going to get rid of this series one. Okay, number of reports. Okay, that's all I want. Okay, and this looks right, I guess. Oh, no, that didn't look right. So I got to go back, see how hard it is. Okay, so I'm going to right click, select data. Let me do this in one step, or in two steps. So we're going to choose this axis range here. Okay, that looks good. So I'm going to say okay. All right. So now we got that a little bit better. Now I'm going to help you. I'm going to show you. Well, first of all, clicking, like I want you to look up here. Okay, what do we have? We went to insert to insert it, but what if I want to configure it? It's not obvious. But you have to click on this. And then when it's clicked on, you can see chart design and you can see format. Now, see this chart design is super cool. Look at all these cool things, right? They look so sexy. I'm going to show you the danger of using these things. Okay. So first, we're going to fix our chart up because this is not awesome, because we don't have our access labels, right? Let me move this over here. So we need access labels. So let's go over here. So this is called the plot area. I'm going to click on the whole chart and I'm going to go, see, I'm on the chart design. This is format. This chart design. I'm going to do add chart element. I'm going to do access title. Let's do primary horizontal. And then I'm going to also add the access titles, primary vertical. And see, it just adds this thing here. So I'm going to double click in here. Look, this showed up. That's interesting. Well, I'm just going to type in here. Let's say year. And then I'm going to double click in this and type and say number of reports. Okay. So that's how you get something to be on there that's not on there. So you click on it and you go to add chart elements. So if there's other things you want on there, like a chart title or anything else, this is where you would add it like we could do above chart here. And then we get this chart title. Say reports by year. Now, okay. So you might be like, okay, this is kind of boring. Why should we be boring? Why don't we choose something like sexy like this? Okay. So let's go choose it. That does look very sexy, right? That looks really cool. Let's try this one. That looks really cool too. But like you see, this is all caps. Let's say that I'm going to undo undo. Let's say that I was very picky about font and I wanted all the font in this chart to be aerial because of where I work, we only use aerial, right? I could choose aerial and now all of it is aerial. But as soon as I go over here and I say, oh, this looks so funky, it's going to change it to whatever font it wants. So let's say I fell in love with this, okay? I should fall in love with it at the beginning because now that I have it in that format, if I go over here and I make everything aerial, it'll listen to me. Although this looks, yeah, everything is aerial now, you know, because it, but like I said, if I go, even if I go back here and I re-choose it, it's going to go make it back. So that's what the problem is, is like you almost need, if you want to use one of these really sexy things that they have, you almost need to like practice with it first. And so like here, I've got the simple bar chart. And again, you have to arrange these things right, like this, and these are actually connected. Like if I go here and I change, like let's look at a big one, 2013 is 10. Let's say that I changed it to one. See, it adjusts, okay? Now it used to be 10, so let's do 10. Now notice how it keeps changing like the total here, like it was 10. And if I make this go down to one, it makes that maximum be nine. But if I right-click on here and I choose format access, see how this shows up? Oh my God, this is so. Let's say I make the maximum be 11, okay? I just made it be because I'm doing this. Okay, let's see your tab out here. Okay, there, now it's 11. Now if I make this one here, remember I was changing it? Let's say I make this 15, it's just going to go off the page. It's not going to adjust it. And so it's like very difficult to get this right, right? Because one minute you're hard coding something into it, and the next minute you're using something automated, and then it kind of gets out of sync. So if you were just like, let's say that I was having problems with the database and I just wanted to show the number of reports per year, that's pretty simple, right? Like see how I just I put the year and I put the number. I can even do this here. Let's get rid of this one. If I just don't really care and I'm in a hurry, I can just put this down. I can just choose this first one and see that's literally what I did. And it doesn't have access labels. It doesn't, you know, but if you're troubleshooting a database with your colleagues, it's not that important. It's not like you're going to a journal. Now I want to show you one more thing. And that is sort of how you can shop around in Excel for like a better way of making this. Like this is kind of a cool thing to look at. Oh, I forgot that I have the... So it's really easy to do a pie chart. Like you literally put this on here and you can go over here. And when you enter the pie chart, it totally knows what's going on like here. See how it knows what's going on. It already made this. And I, of course, use something else. You can do one of the things I often add this add data labels. And then you have this choice. I usually do outside end. There you go. Some data labels. But then this is the end. What if we wanted the percent, right? So once they're down here, I can right click on this and say format data labels and see, I have this choice. Like this value, I'm going to change it. Whoops. What did I just do? Properties. I don't even know what happened. Oh, it moved. So I'm going to say insert data labels outside end. Here's what I should have done. Is I should have done format data labels and then percentage and then remove value because that's why I went away. Okay. Now it says the percentage. So a lot. So now that I have this, right, let's say that I was to change it to this. Okay. The percentage stayed the same, but if I had changed the font, it wouldn't have stayed the same. And see, I obviously changed this distribution of report time. I didn't leave it number of reports. So the other, this one I wanted to show you because the way I visualize this data, you can totally interpret this. Like, remember how we already saw the shape that this is going down? Well, part of the reason, one of the things you see is becoming pretty big, like the whole thing here is shooting. And this is US. Remember, we've got this big gun control problem. So this great thing is hand to hand combat. So back in 2010 to 2014, there were a lot more fights and there was a lot more hand to hand combat, which is, I guess, turning into shooting now. And there's fewer incidents because everybody's shopping online, right? So I thought, okay, well, we've got this set up here. Maybe there's like a sexier thing I can do with this. And I really was trying it, like I, what was it? Let's see here. I started with this one. I put this one down and I said, okay, this is an interesting, it's the same data. But then I went over here and I saw some really cool ones. Like, look at this. Isn't that cool? And then there was another one I saw, like this one. No, it was, it was this one. See how it's like, you can see through it. And here's another thing I want to show you about these stacked bar charts. If you right click and you do that select data, you can switch the column in the row. So see that? Isn't that pretty? But again, if you want to go with these fancy things, you have to select it first and then you have to nuance it. Like, so you can see here I selected this first, you know, whatever this was and then I nuanced it and I was like, okay, because any sort of automation you start to apply to it is going to cancel, it might cancel out some of the changes you already made. And so it really is a lot of trial and error. But if you just need something quick, like I just needed to show this distribution or something like that, I can't tell you how awesome it is to be able to just format your data that way, just throw down this default bar chart and then you don't have to worry about like SAS or R for just something simple like that, just a small data set. So I've gone over our considerations and best practices for charts at Excel. So in conclusion, Excel charts can come up pretty good if you get fussy with them and spend a lot of time with them. And they can be especially helpful when curating data or communicating with a team about it, you know, just quickly like trying to communicate something where you don't want to make some code. And it can also work for a quick project like that Black Friday project, where I just didn't feel like using statistical software. And if you're new to data science, one of the things that I've noticed is that there's a lot of data in tables that are just present on the internet is very hard to interpret. Like it was not easy to interpret those trends we saw in the stabbing and the shooting and stuff until we made a visualization. So you already have got stuff to do. Now, what we're talking about here is Excel, which is an application. And R is an application. And these are computer applications we use as tools. But if you're a data scientist, you also analyze data. Like in you analyze data that often come out of applications. Now, if you analyze research data a lot, like you're used to survey data or data from like a clinical trial, the data comes to you in really great shape. Like if you download BRFSS data or surveillance data, you've got like a code book and all this information about what data points were collected, like what questions are asked. But if you're expected to download and analyze data from an application, like from a medical records application or from an application, like a smartphone application, you know, we have a lot of health apps now. And I see a lot of epidemiologists like me or people in the health data science field will be given data from these applications like their PII or whoever's leading the research will get this data, you know, they'll have permission to have it. But nobody really knows what to do with it because it came out of the application. It doesn't look like data from a survey. So that's why I'm holding my workshop application basics. What does governance mean exactly? That's our theme this month. So if you go to the workshop, you will come out by you will come out understanding data sets from applications well enough to analyze them and to produce results. The workshop is about computer applications. So it's a very big picture workshop about how they are designed, how teams get put together to build them and how the data gets stored in the application as a result of this. And I teach you a lot of terminology used in application development. And so if you attend this workshop and you get this knowledge, you can break through communication barriers and get the answers you need to complete your analysis and be seen as an expert, even if someone gives you data from an application. So here are the details of the workshop. Like I said, the title is application basics. What does governance mean exactly? And it's coming up at the end of the month at Saturday and Sunday, January 27th and 28th. Each session starts at 12 p.m. Eastern time, and it runs about three hours. So you still have time at night to play around and have fun. And a normal price for a workshop like this, I priced it out online. Comparable workshops are 250 to $750 per workshop. But because you came to this event, your cost is free. If you want to learn about how to understand data coming from applications, you definitely want to come to this workshop. And also if you want to have a lot of fun, you definitely want to come to this workshop because a lot of people are already signed up. It's very interactive. We're on Zoom. And there's a lot of group learning that you do. So there's a lot of networking. And hopefully I'll see you there. So again, I'm Monica Wahee. I'm data science coach and tutor and teacher and textbook writer. And I encourage you to follow our company page, the DPS company page, because I'm always coming out with new resources. Thank you so much for showing up. And I hope you have a good week. And I hope to see you at the workshop. Data Science Rebrand Program. If you are interested in joining the program, please sign up for a 30 minute Zoom interview using the link in the description.