Well, good day, everyone. Thank you for coming to today's demo, R for Healthcare Data Viz. I see a bunch of you are here. I'm going to get started quickly, because I know people will keep coming in, and you're all going to want to see my demo, right?

I'm Monica Wahee. I'm a data scientist in the healthcare space, and I do a lot of education. I work a lot with SAS, and even if you don't use SAS, you probably know about it. SAS users and people in healthcare analytics are always looking to improve the clarity of our visualizations. I publish a lot in the peer-reviewed literature, I help my customers do that too, and I teach people how to do it. SAS provides visualizations, but they're mostly for troubleshooting, like box plots to look at distributions. They're not really ready for prime time. So then you get stuck: what do you do? Do you use Excel? Well, Excel doesn't really make a box plot, so you run into those kinds of challenges.

Customization is important, but it's hard in a lot of statistical display programs. People say, "Well, Tableau is so beautiful," but Tableau is a little challenging, right? And you can't fudge it. Your charts have to be accurate. You have to use the actual estimates from your data. Also, even though I publish a lot in the peer-reviewed literature and make reports that you print out, we're moving in a direction where, if you're doing that, why aren't you making a dashboard? Why aren't you making something that you can automate and put dynamic data behind?
I mean, can you make that the next step of your development? In SAS, the answer is no. SAS would like you to believe it's yes, because they have some tools for that, but in a practical sense it's not very easy, because SAS was invented way before the internet was, whereas R was sort of invented along with the internet. So if you can make a visualization for a journal in R, it's not too much more to take your code and turn it into a dashboard or something automated.

I've been a healthcare data analyst, a biostatistician, for about the last 20 years. As soon as R was available, the first thing I used it for was solving data visualization problems in healthcare. So today I want to share some quick and dirty tidbits, things you can quickly do in R to solve your data visualization problems.

Before we go too far, I want to announce that I'm going to be running my free online workshop on application basics again. The last time I ran it, we focused on SAS integration. This time we're going to focus on R, like the thing I just said: what if you want to make a dashboard in R, but you have, what, Oracle, or you have Epic? Is it possible? And the answer is yeah. So if you want to know how, join my free workshop. It's a six-hour workshop, but I don't make you sit there for six hours. It's actually in three sessions.
And there's no homework in between. During the workshop I'll put you in Zoom groups and make you do little challenges, so you learn something. Each session will last two to three hours, depending on how many people show up. I try to make it not last too long, but I do want to put you in groups and have you debrief to everybody. The three sessions are Monday, September 25th; Wednesday, September 27th; and Friday, September 29th. Each session starts at noon Eastern time and, like I said, takes two to three hours. Then, finally, you have a private 30-minute Zoom meeting with me as a wrap-up session, and you get free access to all the online course materials. There's a link to register for the free workshop in the LinkedIn event description, and I tried to put it on all the posts in case you want to go.

All right, back to our regularly scheduled program. Why am I talking to all these people about R for journal figures, for data visualizations you're going to use in healthcare to communicate with the scientific community, or patients, or colleagues? Because R makes it very easy to put what you want on your visualization. Imagine you've got two bars with error bars, and you want to put a dot or an asterisk in the middle to say p less than 0.05 or something like that. You just go crazy in other programs. If it's an older program like SAS, you usually have to code a calculation or do some sort of fudgy thing. If it's a newer application like Tableau, well, you have to learn how to use Tableau, which isn't obvious to me.
Of course, to do this you have to learn R, but it can be so quick and dirty, and that's what I'm going to talk about today.

For those of you who are not familiar with R: R is free and open source. If you download and install R GUI, which is the visually simple version I'm going to demonstrate today, you'll find there's not much in it. It's not like base SAS. Base R has some cool stuff, but not much. Why? Because it's free and open source, so the community develops packages that you can install, and then base R doesn't have to be very big.

So, as you can probably guess, there are two main approaches to graphics in R. Base R has some graphing capabilities: you can make a box plot, you can make histograms, whatever. When I started using R, I was writing a journal article, and I tried to use base R for it. It looked pretty good. Base R produces pretty nice visualizations. But I was trying to get picky with it, put dots on it or put some words on it, and I was limited. What I learned back then, and what everybody knows now, is that the go-to package for plotting in R is called ggplot2. Now, why the 2? There's some folklore behind it.
I don't know what it is, but ggplot2 is your go-to plotting package. Now, just to be clear, ggplot2 isn't really just one package. I mean, it is a package, but if you Google for instructions on how to make R plots, almost every blog post will say you need more libraries than just ggplot2. ggplot2 is the base package that works with other packages. Let's say you were going to invent a new way of plotting something, like a heat map. If you were an R developer, you would probably base it on ggplot2. So ggplot2 is the foundational package for visualization in R.

I'm not knocking base R. Base R is good for quickies. I use base R for box plots when I just want to know what a distribution looks like, and I even have a blog post about using base R for box plots. But ggplot2 plots are just so gorgeous. You can make them so gorgeous, and it's not that hard, actually. You just have to be patient and fuss with it. When I'm getting ready to fuss with a ggplot2 plot for a journal, I'll give myself a good half hour to an hour, because I want that thing to be perfect. I'm a perfectionist, but still, it's fun, and you can do it, and that's why I like it. Like I said, many other R graphing packages rely on ggplot2. This demo will be ggplot2.
It will not be base R, so I'm already getting you into something advanced. If you download the slides, which you should be able to do from the link, you'll get these links: I have a blog post about using hexadecimal colors in R, which sounds hard but isn't; I have a blog post about adding error bars; and then there are the GitHub files. But right now I'm going to move right on to the demo, so let me go find my R.

Okay, here we are in R. Let me see if there are any questions in the chat. No, I guess not.

So I've got this dataset that I'm going to demonstrate. Where did it come from? It's associated with a peer-reviewed article, but the article was about mice, and I honestly don't remember much about it. One of my customers needed some help visualizing his data.

This is R GUI, and this is the console. For SAS users, this is kind of like the log file; that's the closest thing. Comments here are marked with, what is it, octothorpes. I'm going to read in this RDS dataset, which is called metric plot data. I'll execute this line, and now I'll execute this one, using Ctrl+R, just to show you what's in it.

All right, let's go over this dataset. It's so cute. It's a baby dataset. It's actually a summary dataset, so let me make sure you understand what's going on in it. There were four groups of mice, if I remember correctly: A, B, C, and D. Each of these measures is a distance that was measured. It's something to do with periodontitis, and I know mice don't get periodontitis, so don't start with me. I don't remember the article. So for group A, on this measurement, here is the mean and here is the standard error. Where did I get that data? Well, I got it from somewhere else.
I calculated it and made this table. One of my other events in this lecture series is about how to make tables from scratch in R, and that would show you how simple it was to do this.

But I want you to use your imagination. Imagine this isn't a few mice in a lab study. Imagine this is surveillance data on a lot of people. Say these are four states, like Minnesota, Florida, Massachusetts, and Rhode Island. And let's say this first measure is the rate of tobacco smoking, with a rate for each state. Then this next one is the same four states, and the measure is the rate of binge drinking, and so on. If you were doing this in SAS with BRFSS data, you'd be saying, "Oh, you're talking about PROC UNIVARIATE and stuff, right?" And I'm like, yeah. You can do all that in SAS, steal the numbers out of SAS, and quickly throw them into a baby dataset. It would be the same size as this one, because this is the structure you need for ggplot2.

So basically, you have to cook your dataset into the right shape for ggplot2. The way I always think about it is that you need the value you're actually going to graph. In this case we're making a bar graph, but ggplot2 syntax is similar no matter whether you're making a box plot or whatever. And you need to know the names of your variables.

The next thing we do is run the ggplot2 library command. This calls in the library, and you have to have it installed first. The other day I was helping a customer who was using a really complex new package, so I threw away my old R. I just erased it and downloaded it again, and I had to install the packages again. That's what happens: when you erase R and download and install it fresh, you have to reinstall your packages.
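As a rough sketch of the shape described above, the summary table can be typed in by hand as a data frame with one row per bar. The object name metric_plot_data, the column names, the measure labels, and every number below are assumptions made up for illustration; they are not the values from the mouse study, and in the demo the table is read from an RDS file rather than typed in.

```r
# One-time setup, only needed after a fresh R install (left commented out here).
# install.packages("ggplot2")

# Load the plotting package for the current session.
library(ggplot2)

# Hypothetical summary table: 4 groups x 3 measures = 12 rows,
# with one graphable value (mean) per row plus its standard error.
metric_plot_data <- data.frame(
  group   = rep(c("A", "B", "C", "D"), times = 3),
  measure = rep(c("Measure 1", "Measure 2", "Measure 3"), each = 4),
  mean    = c(120.4, 135.2, 150.9, 160.3,
              90.7, 110.1, 95.6, 130.8,
              170.2, 165.5, 180.6, 190.4),
  se      = c(8.1, 10.3, 7.4, 12.6,
              6.2, 9.5, 5.3, 11.7,
              13.2, 9.8, 10.4, 14.1)
)

# Show the structure: four columns, twelve rows.
str(metric_plot_data)
```

The same layout would hold for the surveillance example: swap the groups for states and the measures for rates, and keep one estimate and one standard error per row.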
So you might need to make sure you've installed ggplot2.

Okay, now I'm going to run this plot. This is my base plot; that's what I'm going to call it. Look at these colors. These are the default colors that come out of R if you just run a plot and don't tell it what colors to use. I went over what's in here: groups A, B, C, and D, and down here are the measurements, which are labeled.

Let's go back to the code. This is the ggplot command, and it printed the plot to the screen. You can also assign it, and it'll store the plot in an object, like p. People do that. But this way it came straight to the screen.

So what's going on here? SAS users who like data steps will probably recognize it. R executes this line, then a plus, then executes this line, then a plus. Do I forget the pluses? Yes, I forget the pluses. They're kind of like semicolons in SAS; you've got to make sure the plus is there. It executes this line, and this line, and then this is the end.

This is ggplot2 syntax, even though the function says ggplot. How does the syntax go? The first thing in the call is the name of the dataset you're plotting. In fact, I think you can even write data = there. The next part is really important: it's this aes. I've been told it stands for aesthetic, but I don't know why. This is a little weird, but it's where you designate what x is, what y is, and what fill is. You designate all of these in the aes. But if I run just this line, let me close this and show you, nothing happens. Well, I shouldn't say nothing happens.
A plot window comes up, but it's blank. The reason it's blank is that I've only told R to look in this dataset, where measure is the x, mean is the y (not se or anything else), and group is the fill. By the way, if anybody ever asks you to graph anything, these are the pieces you need: which variables, which values.

Now I'm going to add geom_bar. What do you think this is going to be? It's going to be a bar plot, right? I just showed it to you. You'll see there's an argument inside geom_bar, and then there's this plus, because we're still in ggplot2 land. geom_bar has the argument position = position_dodge(). What that does is put the bars next to each other. Otherwise they're stacked, and boy, did I get confused by that when I first started. And stat = "identity" is the hack that makes it, I don't know, look at the mean. I'm not really sure, but that's how you have to do it to get a good plot. I'm just going to run this much of the code to show you.

That adds all of this. The bars aren't stacked, because I did that position thing, but the plot doesn't have any axis labels yet. So the y label is the mean in millimeters, and the x label is the measurement. Now I can run the whole thing, and you'll see the labels on there.

So I've shown you how to make the most basic of plots. Often the real work is in figuring out this little baby dataset you're going to plot: what shape is it, and what information do you want to put in it? You'll notice I put se in here, and I'll tell you why in a minute. And see how fill = group makes this work? Well, let's say the name of that column wasn't group. Let's say it was condition or something, and I didn't want that name showing up here. So then I have to ask, what do I do? Do I just change that data frame?
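The base plot walked through so far might look like the following sketch. The data values and the exact axis label strings are assumptions; the calls themselves (ggplot, aes, geom_bar, position_dodge, ylab, xlab) are standard ggplot2. One detail worth spelling out: stat = "identity" tells geom_bar to use the y values as given instead of counting rows, which is why it is needed for pre-computed means.

```r
library(ggplot2)

# Hypothetical summary data (invented values, 4 groups x 3 measures).
metric_plot_data <- data.frame(
  group   = rep(c("A", "B", "C", "D"), times = 3),
  measure = rep(c("Measure 1", "Measure 2", "Measure 3"), each = 4),
  mean    = c(120.4, 135.2, 150.9, 160.3, 90.7, 110.1,
              95.6, 130.8, 170.2, 165.5, 180.6, 190.4),
  se      = c(8.1, 10.3, 7.4, 12.6, 6.2, 9.5,
              5.3, 11.7, 13.2, 9.8, 10.4, 14.1)
)

# aes() only maps columns to roles; with no geom added, this draws an empty plot.
empty_plot <- ggplot(metric_plot_data, aes(x = measure, y = mean, fill = group))

# Each "+" adds one more piece: side-by-side bars, then the two axis labels.
base_plot <- ggplot(metric_plot_data, aes(x = measure, y = mean, fill = group)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  ylab("Mean (mm)") +
  xlab("Measurement")

# Typing the object name at the console, or calling print(), draws it.
print(base_plot)
```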
That's what this is called, a data frame. Or do I do some sort of hack in the plot code? You can really go down the rabbit hole with ggplot2, getting it just right.

But I wanted to show you how you add to the base plot, so here's one way. Let me pull this out and close this window. Okay, we're using the same data and the same old plot, only now, what am I doing here? See this cool colors? All I'm doing is making a vector. This vector could say anything between the quotes. It could say "Monica" and "data scientist" or whatever, and you can put in as many elements as you want. It's just a character vector, and I named it cool colors. I'm going to run it, and then show you what's in it. It's just this, and you all know what this is: hexadecimal colors. I had to use a color picker to figure them out, and one of my blog posts explains how I do that. So now cool colors is a vector. It's a thing.

Here's our ggplot command again, and you're going to recognize it: metric plot data, then aes with measure, mean, and group. Everything's normal, then geom_bar, and, oh, here: color = "black". What am I doing? I don't like those default colors, so I'm going to get rid of them, and I start by making the bar outlines black. Paint it black, you know, like the Rolling Stones. But you can kind of see what's coming, right? This is where it gets data-steppy: I'm making this black here, adding the labels, and then I'm using scale_fill_manual. This is such a cool command. Then values equals, and I could put the whole string of colors right there, but it's easier to do it this way, because I can swap cool colors for something else.
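A minimal sketch of the custom-color step, assuming a four-color vector named cool_colors. The hex codes and data values are invented stand-ins, not the palette from the demo; color = "black" and scale_fill_manual(values = ...) are used as described.

```r
library(ggplot2)

# Hypothetical summary data (invented values, 4 groups x 3 measures).
metric_plot_data <- data.frame(
  group   = rep(c("A", "B", "C", "D"), times = 3),
  measure = rep(c("Measure 1", "Measure 2", "Measure 3"), each = 4),
  mean    = c(120.4, 135.2, 150.9, 160.3, 90.7, 110.1,
              95.6, 130.8, 170.2, 165.5, 180.6, 190.4),
  se      = c(8.1, 10.3, 7.4, 12.6, 6.2, 9.5,
              5.3, 11.7, 13.2, 9.8, 10.4, 14.1)
)

# A plain character vector of hex codes, one per group (hypothetical colors).
cool_colors <- c("#1B9E9E", "#F2C14E", "#F78154", "#5FAD56")

color_plot <- ggplot(metric_plot_data, aes(x = measure, y = mean, fill = group)) +
  # color = "black" sets the bar outline; the fill still comes from group.
  geom_bar(position = position_dodge(), stat = "identity", color = "black") +
  ylab("Mean (mm)") +
  xlab("Measurement") +
  # Replace the default fill colors with the vector defined above.
  scale_fill_manual(values = cool_colors)

print(color_plot)
```

Keeping the palette in its own vector means a different palette is a one-word change in the last line.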
If you go to my blog post on colors, you'll see R has different ways to refer to colors. There are some built-in named colors, like red and green; you can use RGB; and there are packages that will automatically give you colors. So there are all these options. Let's just run this plot with the cool colors. See that? I call these beach ball colors. It makes me want to go to the beach and play with a beach ball.

Now, from another one of those blog posts: let's say you don't want to go to the beach, and you don't want the beach ball. I used to be a fashion designer, and one of the things we learned in fashion design is that if we needed inspiration for colors, we should look to nature. I actually feel like a lot of my plots are very boring, and I have trouble getting colors to look different. They all look the same. So I took a picture when I went on vacation at a vineyard, the Wenti vineyard, and I used colors from that picture to make my own palette.

You can see here that I store these hexadecimal colors in variables, like wenti sky, wenti leaf, wenti pole, and wenti graph, and you can see on my blog how I did this. Then I put them all together into wenti colors. So how do I put wenti colors into the plot? So simple. See where this used to say cool colors? I just replace it with wenti colors, and we go look at it. You can see here's the sky, the leaf, the ground, I guess, and maybe this is a pole.
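To make the color options concrete, here is a small base R sketch. The variable names follow the ones read out above, but the underscores and every hex code are assumptions; the real palette values are on the speaker's blog.

```r
# Three ways to specify the same color in R.
named_red <- "red"                                 # built-in color name
hex_red   <- "#FF0000"                             # hexadecimal string
rgb_red   <- rgb(255, 0, 0, maxColorValue = 255)   # built from RGB components

# Hypothetical hex codes standing in for colors sampled from a photo.
wenti_sky   <- "#7FB2D9"
wenti_leaf  <- "#6B8E23"
wenti_pole  <- "#8B7355"
wenti_graph <- "#4B3B47"

# Combine the named colors into one palette vector.
wenti_colors <- c(wenti_sky, wenti_leaf, wenti_pole, wenti_graph)
print(wenti_colors)
```

With a palette stored this way, the plot code only needs scale_fill_manual(values = wenti_colors) in place of the earlier vector.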
I I'll have to look up my webpage All right, but see how this is this created sort of a new You know a nature and spiled palette and the so I'm I'm being all like fashion designery but the reason why I'm doing that is you sometimes need this When you're doing scientific communication visually like red is alarm and green is it's good and Sometimes you want a gradation like this is your yellow is you know on the way like you want to be able to control the color so you can control like your your communication and That's what I like that's one thing I really really like about our and so um, Let me see here. I'm gonna open Uh, oh, here's the one about adding error bars. So remember um standard error Remember standard error. Let's look at our data again So so far in our plot ggplot code You've noticed that I've talked about group the group Column and I've talked about the measure column. See here's group is the fill Even though we made it black and made it prettier last time still the group is what caused it to be filled different colors because that's what comes out on the on the On the legend. Okay. So I got that going and measure says is what's across the x-axis the three groups, right? And mean is is how tall the bar is So I haven't used se yet. Well, look what I'm gonna do now now. So remember how I I showed you if you just run this like nothing really appears I mean a blank plot appears because we haven't told it yet that we want gm bar. Um, and so To add to that idea we can add more stuff than just gm bar. We can add Gm error bar too But we have to add it downstairs here because we have to wait until all this other stuff is here Well, you don't have to have all the other stuff there, but you want to have some stuff here Uh before that because you want to make sure that the error bars are landing in the right place So let's look at this. So all all this is we're pretend. This is the cool colors. These are beach ball colors Um, so we got scale from that. 
So I added a plus, and then geom_errorbar. It's like geom_bar, but you're going to have different arguments. Here's the aes, and I'm putting arguments inside it: ymin = mean - se and ymax = mean + se. You're probably asking, how does it know what the mean and the se are? It's literally calculating this from the columns. It's saying, put the bottom of the error bar at whatever mean is minus whatever se is. So you're like, oh yeah, this is plus or minus one SE. If you wanted two SEs, you could multiply by two, or three, whatever. Or you could use a margin of error. Sometimes I'll do that, because I'm doing confidence intervals. You can figure it out, and you can throw it in here. If you've got a weird margin of error, like you're using a pooled SE or something, you can precook that in SAS, throw it into the dataset, and use it here. Then width just says how wide the error bar is going to be. And remember position = position_dodge()? I've got the error bars dodging too.

Let's go see how it looks. Again, this is where you go down the rabbit holes. Oh, look at this! You can tell it's doing it right, because this one is a little one and this one is a big one. Doing this in Excel is such a pain. You have to go and add each one of them by hand. With this, I was so happy. I can do it in Excel, but, anyway, those are the error bars.

That's what I said I was going to go over, but in between setting this up and now, I realized I wanted to show you something else, and that is how you can add text to the plot. I added this to the GitHub too. Let me get rid of this window. If anybody has questions, you can put them in the chat.

Okay, so what if you want to add text? And I mean any text. You want to add data labels?
You want to add just a message? You want to add dots or asterisks, or p less than something, somewhere? Or you want to put a line across and say this is the quality line or something? I just love R, because you can use ggplot2 and do whatever you want. Which is why you go down the rabbit hole for an hour with every plot before you're ready, right?

So what do we have here? If I run everything up to here, we have our base plot with the cool colors and no error bars, nothing fancy. Now let's say I add this next line. What does it do? It's called geom_text, and it adds the data labels onto the bars. Now, in the aes I could say label = mean, but the means in this dataset are raw means, so they're long, and they'd be ugly as data labels. So I made the label round(mean, 0), where the zero just rounds off everything after the decimal point. Then there's position = position_dodge() with a width in it, again because of the dodged bars. And vjust, as you can probably guess, is vertical adjustment. I put 1.125. You know how data labels want to come out right on the line? You don't want to write on the line.

So let me add this. See how easy it is? You just keep adding one line at a time. But it is actually really easy to screw up your code while you're building one of these. You can get everything going, and then suddenly something doesn't work. So you want to be really organized and save a lot of versions. I think maybe RStudio is better at that stuff.
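The data-label line described above can be sketched as follows. The rounding and the vjust value of 1.125 come from the walkthrough; the dodge width of 0.9 is an assumption chosen to match the bars, and the data and colors are invented.

```r
library(ggplot2)

# Hypothetical summary data (invented values, 4 groups x 3 measures).
metric_plot_data <- data.frame(
  group   = rep(c("A", "B", "C", "D"), times = 3),
  measure = rep(c("Measure 1", "Measure 2", "Measure 3"), each = 4),
  mean    = c(120.4, 135.2, 150.9, 160.3, 90.7, 110.1,
              95.6, 130.8, 170.2, 165.5, 180.6, 190.4),
  se      = c(8.1, 10.3, 7.4, 12.6, 6.2, 9.5,
              5.3, 11.7, 13.2, 9.8, 10.4, 14.1)
)
cool_colors <- c("#1B9E9E", "#F2C14E", "#F78154", "#5FAD56")  # hypothetical palette

label_plot <- ggplot(metric_plot_data, aes(x = measure, y = mean, fill = group)) +
  geom_bar(position = position_dodge(), stat = "identity", color = "black") +
  ylab("Mean (mm)") +
  xlab("Measurement") +
  scale_fill_manual(values = cool_colors) +
  # One label per bar: the mean rounded to a whole number.
  geom_text(aes(label = round(mean, 0)),
            position = position_dodge(width = 0.9),  # assumed width, matches the bars
            vjust = 1.125)                           # above 1 places the label just below the bar top

print(label_plot)
```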
So here's what happened. See that? It's not really perfect, because this bar is kind of dark. If I were going for a journal, I'd probably want to lighten this or do something about this dark color. But otherwise it looks pretty cool.

Then I also wanted to explain how you can add arbitrary text, just write anything on there, and that's where you add this annotate line. Annotate is a little fancy. You know, ggplot2 is a lot like data steps, because it's totally internally inconsistent, right? So suddenly this is annotate, and then it says "text". What does that mean? It means you're annotating with text. But the rest is pretty easy: x = 1.5 and y = 210. Where do I get that? Well, on the x-axis, this is one, this is two, and this is three, so 1.5 is going to be somewhere along here. And 210, well, we know where 210 is; it's about up here. So I'm trying to put something right about in this white area. And the size is 8. Oh look, it's "Hi Mom." Too bad my mom's not here. Hi, Mom! I should have invited her. I don't know if she uses R. I don't think so. But she'd be open to it because it's free. We love free stuff.

All right. You can also see why this leads down the rabbit hole, because you'll sit and fuss with all these things. You can Google, and go on Stack Overflow, and there are some good blogs.

Getting back to my slideshow. So, why ggplot2? Here's SAS talking to me. I said, SAS, I wish I could just make the base plot and then add annotations and formatting to it with extra programming, just like I do in SAS data steps. But SAS says: sorry, sweetheart.
You can't, right? So ggplot2 solves that problem. Of course, I love SAS, and I'm not going to leave SAS. But maybe I just need to take the summary and PROC UNIVARIATE output out of SAS and make a little baby dataset in R. Again, there's another event in this series, which I think I'm posting tomorrow, about how to make a table like that in R, which is pretty simple.

Like I mentioned earlier, I just demonstrated this in R GUI. RStudio is an integrated development environment, which is just a different way of using R. You saw how R GUI would open a separate output window for a plot and then I'd close it? RStudio makes that really easy. It just refreshes a window, and you can make dashboards and stuff. I'm just kind of old and old-fashioned, so R GUI is easier on me. But either way, I can give you my ggplot2 code, and you can kick it up a notch and make it into a dashboard. You can use packages like Shiny, which I've used with a collaborator to make an R dashboard.

As you've seen, you can add lines of code for formatting. I just want to go over a few features and tips about ggplot2 that I didn't show you.

First of all, like I was saying earlier, you do need to calculate the values you want to graph and make a plot data frame, with one graphable value per row. So we need you, SAS, right? We have to figure out each value, each y or whatever we're going to graph, and put it in the plot data frame. It's not hard to put those together from scratch, so that's not bad. It's just that you have to do a lot of research before you approach ggplot2.

Also, the order you put commands down in ggplot2 will change how they execute.
Remember the black and then the cool colors? If I put another thing on top of that, it might screw up the colors, or it might screw up the legend. So, again, save often and save backups. SAS users know all about this.

Also, like I mentioned, you can save the ggplot2 results as objects in R. You can save a plot as p. You don't have to call it p, but a lot of people do, for plot. Then they'll add more ggplot2 code to the p, like p plus geom_errorbar. I don't like doing that, because I get confused by my own code, so I try to build my code so I don't have to.

Code readability really counts. There's a lot of nesting going on in a ggplot2 plot, so you have to be very neat about your pluses and indentation.

I didn't go over this, but there are these things called themes. Usually they're the last line you apply, and they give your plot a look. theme_minimal makes it very minimal. You know how in Excel you can strip a chart down so it has hardly any lines in it? That's the idea behind minimal. Or you could have theme maximal. I don't think it's called maximal, but you have these different themes that really make the plot look nice. So you can apply those, but that's usually the last line of code.

Also, ggsave is another command that works with ggplot2. It lets you output plots in a certain size and shape, like for PowerPoint, and it's great for journals, because it resizes the plot for you. Here, let me just show you. See this plot here? This is just a window, right?
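The tips above about storing a plot in an object, applying a theme last, and exporting with ggsave can be sketched together like this. The file name, the PDF format, and the 6 by 4 inch size are assumptions for illustration, as are the data values.

```r
library(ggplot2)

# Hypothetical summary data (invented values, 4 groups x 3 measures).
metric_plot_data <- data.frame(
  group   = rep(c("A", "B", "C", "D"), times = 3),
  measure = rep(c("Measure 1", "Measure 2", "Measure 3"), each = 4),
  mean    = c(120.4, 135.2, 150.9, 160.3, 90.7, 110.1,
              95.6, 130.8, 170.2, 165.5, 180.6, 190.4),
  se      = c(8.1, 10.3, 7.4, 12.6, 6.2, 9.5,
              5.3, 11.7, 13.2, 9.8, 10.4, 14.1)
)

# Store the plot in an object instead of printing it right away.
p <- ggplot(metric_plot_data, aes(x = measure, y = mean, fill = group)) +
  geom_bar(position = position_dodge(), stat = "identity", color = "black")

# More layers can be added to the stored object later; the theme goes on last.
p <- p +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.2, position = position_dodge(0.9)) +
  theme_minimal()

# Export at a fixed size, so the result does not depend on the window shape.
out_file <- file.path(tempdir(), "metric_plot.pdf")  # hypothetical file name
ggsave(out_file, plot = p, width = 6, height = 4, units = "in")
```

ggsave picks the output format from the file extension, so a journal that wants a different format would only need a different file name.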
I could make it really wide, and if I make it really wide and do File, Save As, I'll just save this arbitrarily wide thing. Annoying. ggsave lets you export the plot at exactly the size you want. I actually have a video on how to use ggsave on my YouTube channel, if you go there and look around.

All right, I'm looking to see if there are any questions. Oh, that's great that Karen came. That's wonderful. And for those of you looking in the chat, my assistant Ebenezer is here. Thank you, Ebenezer, you're so helpful. He has given us the link to sign up for the workshop.

For those of you who joined later, here's what the workshop is about. I'm offering this free online workshop in three sessions, each lasting two to three hours, and it's about applications. What are applications? Well, if you don't know, sign up for the workshop and show up. It's based on an online course that would normally cost a lot of money, but you'll have free access to it as part of this workshop. The workshop is about applications and application integration: how applications are built and how they work together. The way I'm going to teach it this time, we're going to focus on R and the ways you can put R in your application pipeline. If you're thinking, "I don't know what any of that is," don't worry, because I teach you a lot of terminology. If you're trained as a data analyst or a biostatistician, you aren't really trained formally in computer science.
So there's a lot of computer science stuff you didn't get in the classroom. This workshop fills you in on that, and then lets you expand your ability to use R and SAS and these other tools, and have them play with other applications, like SQL, or all these apps we have now, your social media apps, everything. So if you want a great hands-on experience working with other people and learning about applications in data science, and especially how you can use R with them, because remember, R is free, you just have to know how to use it, I encourage you to sign up for the workshop. As soon as you do, I'll get back to you with instructions on how to get to the course and what you need to do. The workshops are on Zoom, just like this event.

If you want these slides, remember to go to the link that's with this event and download them, and you'll have the links to all of these resources. Let me go through them. This is the online course, Application Basics, that we're going to go over in the workshop and that you'll have free access to. This is my blog post about colors, which explains the code I just demonstrated and that you're going to have access to. And here's another post that explains how to add these error bars. Let me see if I put the ggsave video on there. I should really link it. I don't have it, I'm sorry. But yeah, those links are all in the slides, so make sure you download the slides.

All right, any questions or any feedback? If you ask me to program on the fly, it probably won't work, but I'm willing to try. That was kind of a poem, right?

Well, I want to thank you for showing up to the demo today. I'm going to be doing some more demos over September, and thanks again to Ebenezer, who's helping me. He's going to help promote them.
So everybody knows that they can show up and get the free stuff. Thank you very much. I hope this starts your week out well. Oh, you're welcome! I hope you have a good week, and that I see you at the next event.

Thank you for watching this video, which is part of the Public Health to Data Science Rebrand program. If you are interested in joining the program, please sign up for a 30-minute Zoom interview using the link in the description.