Hi, my name is Christian Lee, and today I'll be talking about developing an R Shiny interactive dashboard to increase interpretability of data: medical residency statistics over time. As that lengthy title suggests, the goal for this discussion is to showcase how R and Shiny can make data more intelligible. To do this, we're going to identify limitations of large data tables and isolated figures as a means of sharing information, then briefly discuss Shiny, and then demonstrate how it can be used to address some of these limitations. For this example, we'll use residency data, but of course this is broadly applicable to all types of data and to anyone who works with data tables or figures and is in the business of sharing information. What we're looking at here are two different screenshots from data reports. This one is from the AAMC, showing two and a half specialties and the different metrics that programs use to rank or assess medical students. For example, there are test scores, research experiences, and work experiences, along with the scores, standard deviations, and percentiles. And this one is Table 1 from the NRMP, which was one page of 320 or 220, showing all the different specialties, the number of positions offered, and the total number of applicants. Both of these reports contain very useful information, but clearly it can become quite cumbersome, and we'll get into some of these limitations in the next slide. The first thing that jumps out to me is that the data reports, and there are more than just these two, are very disjointed. As a result, there are multiple sources that contain some overlapping information but also a lot of unique information. This makes it more convoluted and a lot more confusing as to where to go to find the information that you're interested in.
Additionally, sometimes the reports, or even the data tables themselves, are just too large for the average person to take the time to go through and make sense of. Oftentimes these reports also contain snapshots, meaning you're looking at a single application cycle. So you lack the longitudinal information, you lack the context of time, and you do not get the complete picture. Similarly, you are often unable to compare your specialties of interest to the other ones. This figure here does a much better job by having all these specialties on the same axis so that you can compare the scores. However, this is just one table of one report from one cycle, and it has other limitations. For example, again, it's just a single snapshot and it's not interactive. So how can we use R and Shiny to help make these tables and figures more useful to the users? Taking a step back, what is Shiny? It's a package that you can use in R or Python, and it's extremely good for building interactive websites and web applications. It has just three components: a user interface, shown here, which is the layout and the display; a server function, which contains the instructions to, say, make these figures; and a shinyApp function that pairs these together. Again, it's highly user friendly, so I would recommend it to anyone who has been on the fence about using Shiny or wants some way to visualize their data and make it interactive. With that, I want to showcase MedStats. This is a web application that was built using Shiny to help overcome some of the very limitations that we were talking about a couple of slides ago. At the very top, you have tabs. This first one here is very clear: we're looking at applicant metrics. By default, you're looking at all specialties, but you can go and filter this.
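For readers following along, the three-part structure described above (a user interface, a server function, and the call that pairs them) might look roughly like the sketch below. This is not the MedStats source code. The data frame `demo_data`, its specialty names and values, and the input and output IDs are all invented placeholders for illustration.

```r
library(shiny)

# Placeholder data: specialty names and scores are invented for illustration,
# not taken from any AAMC or NRMP report.
demo_data <- data.frame(
  specialty = c("Specialty A", "Specialty B", "Specialty C"),
  score = c(1, 2, 3)
)

# 1. User interface: the layout and display.
ui <- fluidPage(
  titlePanel("Applicant metrics (sketch)"),
  selectInput(
    inputId = "specialties",
    label = "Specialties",
    choices = demo_data$specialty,
    selected = demo_data$specialty,
    multiple = TRUE
  ),
  plotOutput("score_plot"),
  tableOutput("score_table")
)

# 2. Server function: the instructions for building the outputs.
server <- function(input, output, session) {
  # Reactive subset that is recomputed whenever the selection changes.
  selected_rows <- reactive({
    demo_data[demo_data$specialty %in% input$specialties, ]
  })

  output$score_plot <- renderPlot({
    rows <- selected_rows()
    req(nrow(rows) > 0)  # draw nothing when no specialty is selected
    barplot(rows$score, names.arg = rows$specialty, ylab = "Score")
  })

  output$score_table <- renderTable(selected_rows())
}

# 3. shinyApp() pairs the interface and the server into one app object.
app <- shinyApp(ui = ui, server = server)
```

In an interactive R session, calling `runApp(app)` should launch the sketch locally. Both the plot and the table read from the same reactive subset, which is one simple way to get a figure and a table that follow the user's selections.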
So maybe you're interested in some of the surgical subspecialties. We can just select some of those here, and then we'll go and delete "All". As you'll see, the figure here responded to our selections. That's the very first thing I want to highlight: instead of looking at tables that contain maybe 20-plus specialties, trying to follow along, looking at the score for one and then at another to compare them, you can now have them all side by side in figures that adjust to your selections and preferences. You can select the year, and again the figure adapts accordingly. You can change the metric, for example to look at the Step 2 score, as well as the normalization method. So maybe we just want to look at the raw data. Now we're looking at 2019, at these three specialties shown here, at the Step 2 score, and at the raw data. The red line is the average. We can see that the general surgery categorical program had the highest Step 2 score in 2019. At the bottom is the table. As we saw a few slides ago, we couldn't even get the full table into one screenshot, but here we're able to condense it by stripping away some of that information and keeping only the really important information. This table also adjusts to the user selections up here. Now looking at specialty stats over time, what I really like about this figure is that it has the different application cycles on the x-axis. So now you're getting that longitudinal information and that context of time, and you can see how things change from 2018 to 2021 and so on. Again, maybe here we want to look at the number of research experiences. We can see that several of these specialties are showing upward trends.
For example, let's look at this one: thoracic surgery integrated had 3.5 research experiences in 2018 and 5.1 in 2021. As you can already see, this graph is interactive. You're able to get more information by hovering over it, it has longitudinal information, and it responds to the user, for example if you only want to look at certain specialties. Again, this is so much more user friendly than looking at these huge tables or these huge reports of 220 pages, trying to figure out which information is actually interesting or relevant to you. And then finally, we looked at one other table earlier that was just showing the number of positions offered and the number of applicants. However, again, that was only a single snapshot. With this figure here, we're looking at different application cycles, so we can actually see the trends over time. Here we're looking at anesthesiology, diagnostic radiology, general surgery, and orthopedic surgery. PO here is the positions offered and A is the number of applicants. For diagnostic radiology, you can see that the percent change in positions offered over this timeframe has been 4.7, so not too much of a change, but the number of applicants has decreased: its percent change is negative 16.6 over this timeframe. So again, I hope these figures highlight how we can use R Shiny to address a lot of these limitations of using just static tables and figures. We can make them interactive. We can look at trends over time. And so that's really it. Thank you. If there are any questions, please just leave them in the chat below. Thank you. That was ridiculously good. I think I'm allowed to say that as the moderator. By any chance, do you have a microphone so we can take some questions or we can ask you for questions? Cessie doesn't have access. Hey, Linux Foundation folks. Yay! Christian, can you speak now? Hi, yes, I can hear you.
That was really a lovely presentation. I teach in a medical school, and occasionally I have the good fortune of having a person like you show up who has so many skills. The million dollar question: how did you learn to do this? And can you offer any advice for young, possibly busy people who want to pick up Shiny? Where should they start? Well, the way I learned is that before medical school, I worked in computational biology, so I was exposed to R. And I actually just started with R Shiny last year. For most things, I always suggest getting started with a project: start simple and just build up from there. I think even this is a great example. It's just taking a dataset from online and making some figures. With the R Shiny documentation that's already online, it's just a few clicks and you can create your own little dashboard. And it's also great that you can even host some of these sites for free. So I would say it's very user friendly. It's just a matter of starting small, but getting started. Right on. Were there any big lessons learned along the way? I think one thing that I ran into is that at first I just designed it for the desktop, but then later on, when I was showing my peers, all of them were accessing it on their phones. So that became a challenge: I had to go back and add a lot of different pieces of code just to make it more user friendly on mobile devices. That's something I'm a bit more aware of now when I design these apps: I also have to think about other platforms and devices. Yep. Been there, done that. The first time you try to open something on an Android tablet that's just a different size or whatever, been there, done that. Eventually it'll work. Do you mind saying a little bit about how the sausage was made? I saw a few packages. Was there a lot of plotly mixed in there? Just that one, I think: the interactive one where you could hover over the points. That one was plotly.
The other ones were just ggplot2 and R Shiny. And if you go to the sources on the website, it has all the different packages listed, as well as the datasets. Right now those datasets are just static. My team and I actually had to do a lot of manual work to compile these datasets. Again, the problem was that it's all over the place. There are so many different sources. So we compiled them together and then analyzed it from there. Some of the calculations, for example the normalized score, were done on our end and were not part of the raw dataset, which again is another benefit of using Shiny: you're not limited to what the reports show. You can actually take that data and manipulate it in different ways that are going to be more informative to you. Right on. Loaded political question: does this help your career as a medical student? For me, looking at it as a person who's far enough along, I can work on things that are fun. Have you gotten any feedback like "stop and spend more time with the stethoscope", or how has the reaction been locally? I think people are excited about it. A lot of my classmates find it very useful. And I think a lot of people also wish they had more exposure to R and this technology side. So something I've actually done is host an introductory R session at my school. And as was even mentioned in the chat, a lot of students are involved in research. I think having some experience with R makes you a bit more useful and makes your work better quality. And yeah, in terms of how it's going to help me in my career, it certainly has helped in the research department. I do plan on perhaps mentioning some of these activities in my applications and so on. And I still want to be involved in the whole R analysis and computational side as I go on through my career.
It would be such a waste for you not to take advantage of clear talent. That was actually my next question: do you foresee using these skills as you progress along? Do you expect that residency or some part of it will block you from this? I think certainly time is limited. Actually, next week I'm taking the Step 1 exam, so that's a big one. But I'm always looking for ways to combine my interests and experiences on the computational side with what I'm seeing in medicine. I actually have a whole bunch of other projects going on at the same time, so I absolutely anticipate that it's going to continue in the future. I don't want to just be a clinician but also to integrate the two. Yeah, at the University of Miami we're putting so much thought into the continuing medical education process to make sure that people have the tools to analyze what's going on in their clinics. And so you are the, what's the right word? You're the model that I personally hope we have in the future, where people have the tools to work with their data and build up visualizations so they can understand and help their patients. I should be looking at the chat to see if there are questions coming in, bad on me. Let's see. If anyone has questions, we've got a couple of minutes left, just a couple. Feel free to throw them in the chat. I see a good one asking if you used pdftools or tabulizer to get the data out. I tried a whole bunch of different methods to extract the data. I tried scraping it, but it just wasn't working. And when I would compare the PDFs to what I was getting as the output, there were just too many inconsistencies.
So we decided that, to make it as accurate as possible, we just had to go through more of a manual process, which is unfortunate. But something that is cool is that we're actually working with some of these organizations now. We're in the beginning phases of these conversations about whether we can access this data in more usable formats, so that going forward we don't have to do so much manual work. Yeah, the whole process of parsing PDFs is a nightmare. There was a question: do you think more med students are learning R or similar tools in med school, or is most of the learning happening after graduation? What do you see around you? I would say there's an interest. I've been asked several times, "Oh, what's the best way to learn R or Python?" But then I find that there really is limited time to learn the tool well. I would say that if you're passionate about it, and it's something you want to do and you want to build something, then go for it. And again, it is very user friendly. But I would say it's probably going to be after residency that the peers of mine who have expressed interest will actually have the time. So I think ideally it's something that you would learn during undergrad and could then use during medical school and afterwards. Right, I see that in my students, and obviously you're the example of why to learn it as an undergrad: when you get to applied problems, you've got the tools to take them on. Let's see what else there is. Oh, please.
Just one thing to that end: I also saw a comment there about more undergrads using R in classrooms, and I was one of those students. But I find that a lot of students don't quite understand the scope of R. They think it's just for linear regression or for doing some very basic stats, and they don't understand that you can write functions, or even that you can use R Shiny and make web applications. The scope of it is so much bigger than just doing some basic analysis. That was actually a major focus of the talk that I gave to my classmates: you can do a lot more with this than you might think. That sounds like a great panel for next year, to talk about the different uses and how to get people to realize all of the magic that's available. There's a request: can you share the R code for your app? Yeah, I think it's on my GitHub. I think it's public, but I'll double check that and... Right on. If not, I will make it public, and I can try to send that link later on. It's a beautiful app, I wanna steal it. I wanna steal from it, so be sure to add your Creative Commons license on there so we know what we can do. There's just a lot to love there. Oh boy, here we go. Have you considered using Python to convert data to or from Excel? So is Python in your wheelhouse as well, and are you trying to combine R and Python? I didn't think about using Python for that. Well, actually, I think that is something I did try. I do also have experience using Python, and I think Shiny is now also available in Python, which is really exciting. But I'm not quite sure on the details as to why I was struggling with extracting the data from the PDFs. Even with R you're able to scrape data, and that's something I've done before as well, but typically that's more with data in a table format than with these PDFs.
So I would need to spend more time thinking about how to use Python to convert the data, as opposed to, again, doing it manually, but that's definitely something to look into. Speaking from many years of experience, the reason you're having trouble working with PDFs is that you're human, and humans aren't designed to work with the raw data in PDFs. It's incredibly challenging. Okay, our time is up. Thank you for taking the time to talk live, and that was a brilliant presentation. I believe.