Hi, I'm Peter Higgins, and I'm here to tell you about the medicaldata teaching package. So why do we need a medical data package for teaching? Learners find relevant examples motivating, and these data sets illustrate data challenges they may face. There are a few medical data sets in R packages, but these are widely scattered. Many of these data sets are poorly documented and quite bare bones, and it's hard to understand where the data came from. It is really convenient to have data sets wrapped in one package. It's easier for students and for instructors, and you can reuse data sets across teaching concepts. Currently, medicaldata has 15 medical data sets, ranging from the very old trial of six therapies for scurvy to a 2020 COVID testing data set. The code can be found on GitHub at this link, which I'll drop in the chat, and this is an overview from the pkgdown website for the package. The contents include two historical reconstructions of data sets: the 1747 scurvy trial on the HMS Salisbury by James Lind, which was published 10 years later, and the 1948 streptomycin for tuberculosis trial. It also includes five other RCTs, ranging from sulindac for polyps to indomethacin for prevention of post-ERCP pancreatitis. There are six cohort and case-control studies, including the previously mentioned COVID testing data set, the esophageal cancer case-control data set, and cytomegalovirus after bone marrow transplant. Several of these data sets come, with permission, from the American Statistical Association's Teaching of Statistics in the Health Sciences resources portal. There are also two pharmacokinetic studies, which are great for teaching line plots and GAM models. The documentation of each data set is probably the most important part of this package. Definitions and details are available for each variable, including units, range, and the levels of each factor variable.
There's background on each study, a description of the study design, intervention, and measurements, a specification of the study outcomes, some suggestions for uses of each data set, and full help files, which you can get in R by typing help and the name of the data set. There are also linked description documents and codebooks on the pkgdown website for each of the data sets, as shown here. Here's an image from the pkgdown website, but it would probably be better for me to go there and show you what it contains. So the pkgdown website starts with an overview, how to install and use the data sets, a plea to donate data sets, and then these links to description documents and codebooks for each of the data sets. For streptomycin and TB, you have a description document with background on the study, a link to the actual manuscript, some other background links, the abstract, and details on all the subjects and variables. You also have a codebook for each data set with the variables, long variable labels to give you details, the units for each variable, and, if they're factor variables, the levels and definitions for those. There's also a Reference tab, which includes links to each data set, and these give you what you would get from running help(scurvy), or help with the data set name, in R. So you get a basic outline, the format and variables, the source, and a lot more details, including links, in this case to the original (admittedly the Google Books version) of A Treatise on the Scurvy in Three Parts, of which only a very small part is about the very first reported randomized clinical trial in humans. There are also articles, or vignettes: an introduction to the data sets, and an example of using them for teaching, entitled Making Tables, that I created.
It gives you the opportunity to walk through how to do something with your students, with code chunks that can be copied and pasted into R, from loading the data to building out and replicating Table 2 from the streptomycin data set. We walk through how to do that, starting with a one-variable table using the tabyl function in the janitor package, and then tabulating the outcome, radiologic status at six months, against the two treatment arms as a more complicated six-by-two table. You can then add a total row with the function adorn_totals. You'll notice this is hyperlinked to an explanation of this particular function, as you can see here in its arguments. Then you can add percentages and formatting, and additional counts, or Ns, so you have both the counts and the percentages as well as the total row. Then, if you want, you can export it to flextable and format it at will to make a pretty version. It then offers a couple of challenges for the students, to try this with the strep_tb data set or the indo_rct data set, applying what they've learned. So that gives you a quick overview of what's available on the website for this package. Let me head back to these PowerPoint slides. From here, I've got three asks and one give, in the spirit of the Buy Nothing Project. I'm going to ask folks to possibly contribute and add examples, and please comment in the chat if you think there are particular examples that would be helpful for teaching; a plea to donate data sets, for folks who have access to anonymized data sets; and a question for folks in the chat: should I add untidy data sets? So let's get into why I'd do that. My first ask is for people to essentially try out this package: use medicaldata and its data sets for teaching. And if you come up with a good application for a particular data set, tell me about it. On the GitHub site, click on the Issues tab and open an issue.
Give me a bit of your code and a little explanation in a reprex of how you might have used a data set to teach table construction with gtsummary, or scatter plots, or forest plots, or GAM models. Whatever you think is a good application, tell me about it and give me a little bit of code, and I'll try to make it into a useful vignette. The second ask is for folks to donate data sets and make this an expanding package. If you have access to medical data sets, like randomized controlled trials, cohort studies, or case-control studies, that are of reasonable size (not 600,000 rows, because there's only a five-megabyte limit on CRAN) and that can be anonymized, please consider donating them. Fake names and fake study IDs are helpful, but the data can't be traceable back to individuals or in any way be a HIPAA violation, and you have to have appropriate permissions to share the data, along with a reasonable level of documentation, possibly a codebook or even a publication, to help produce the high-quality documentation that's part of this package. Again, open an issue, let me know about it, and start the discussion. And my third ask, and this is mainly for opinions and crowdsourcing in the chat: should I add some untidy medical data sets to help teach data wrangling? Maybe very wide medical data that need pivot_longer, or untidy medical data that need separate, unite, or separate_rows, or common data cleaning problems, or color-coded data that need tidyxl, or the unheadr package or the unpivotr package. Let me know in the chat whether that's a good or a bad idea, and which of these would be your priorities for teaching data wrangling. What kinds of problems do you think are important to teach? And I have one gift. I have medicaldata hex stickers that look like this for the first 100 people who send a self-addressed stamped envelope. It's important to include sufficient postage.
And if you've never done this before: you put your name and address on an envelope, along with a stamp with sufficient postage; you fold this into thirds and put it into a second envelope, the sending envelope; and you mail this outer envelope, again with sufficient postage to make it to Michigan, with the self-addressed stamped envelope enclosed, to me at my work address. I'll drop this in the chat, and I will send you hex stickers. Also, breaking news: the medicaldata package is now available on CRAN. It's my first package on CRAN. So you can now just use install.packages from your local CRAN mirror. Thank you for your feedback and any GitHub issues you want to add. Please ask questions, provide feedback, and let's discuss this in the chat. Thank you very much for your time.
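
For anyone following along afterwards, here is a minimal sketch of the table walkthrough described earlier in the talk. It is not the vignette's exact code. It assumes R 4.1 or later (for the native pipe), that the medicaldata and janitor packages are installed, and that the strep_tb data set has columns named radiologic_6m (the outcome) and arm (the treatment arm); those column names are assumptions, so check help(strep_tb) before relying on them.

```r
# install.packages(c("medicaldata", "janitor"))  # one-time setup from CRAN
library(medicaldata)
library(janitor)

# help(strep_tb)  # full documentation for the data set

# Step 1: one-variable table of the outcome (column name is an assumption)
one_way <- tabyl(strep_tb, radiologic_6m)

# Step 2: outcome against the two treatment arms
two_way <- tabyl(strep_tb, radiologic_6m, arm)

# Step 3: add a total row
with_totals <- adorn_totals(two_way, where = "row")

# Step 4: column percentages, formatted, with the counts (Ns) shown alongside
formatted <- with_totals |>
  adorn_percentages("col") |>
  adorn_pct_formatting(digits = 1) |>
  adorn_ns()

print(formatted)

# Optional: hand the result to flextable for prettier output
# (assumes the flextable package is installed)
# flextable::flextable(formatted)
```

The intermediate objects are kept separate here so students can print each step and see what each adorn function adds; in class you might chain everything into a single pipeline instead.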