Hi everyone. My name is Aimee Schwab-McCoy and I'm here to talk to you today about moving backward with R. Not moving backward in your programming skills or techniques, but moving backward in your courses: using this idea of backward course design to build an R ecosystem for your programs, classes, or workshops. So let's start by just defining backward course design. Backward design starts by asking two questions. As the instructor, you'll ask yourself first, what do you want your students to be able to think and do by the end of the course? So what are the knowledge areas or the competencies you want your students to achieve? And second, how will your students be different? What's changing, or what do you hope changes, about your students by the end of the semester or workshop? So essentially, when you're designing your class using backward design, you start with the learning objectives instead of a topic list or a book chapter. When you're designing things backward, you go through a three-step process. First, you identify what you want your students to be able to do. So what are the goals? For example, in a math stats course, one learning objective would be that my students can simulate data using a Poisson distribution and then use that simulated data to answer research questions. After you have that list of results, you'll then think about how to assess it. So how will the students show you, as the instructor, what they've learned? And then once you've got all that down, you'll plan your learning experiences and your instruction for the semester. So what are some benefits of this approach? For the instructor, right away you've highlighted the most important outcomes of the course and you know what you want your students to work toward. You also have a guide for assessment. Those most important outcomes are the ones you should be using formative assessment to help reinforce and summative assessment to help grade at the very end.
For the students, this provides an aligned and purposeful experience. So students feel that the course is more cohesive and more purposeful. There's less busy work in a course designed this way. This also helps your instruction become transparent and more explicit. You're clearly articulating what you want your students to get out of the course because you've already thought about this. So how can we use this course design model to teach students to program in R? Well, we always have to remind ourselves that we are not our students, and this is especially true when we're teaching programming. For the new user learning R, there are tons of obstacles. We can't just sit at the front of the classroom and expect to throw a bunch of knowledge at the whiteboard or a bunch of code at the students and have them get it right away. Some students will, but for most, this is just simply not effective. Okay, so what makes it hard to learn R? Well, R is open source. We love that about R, but we also hate that about R, because the open source nature means that there are a lot of resources for getting packages in R, a lot of resources for learning about them, and one really needs to become familiar with all of those separate resources. As we've probably all learned, and as Jenny Bryan made perfectly clear in the best talk at the RStudio conference, help files and error messages just simply aren't always helpful. If I showed an intro stats student this error message, I'm pretty sure they'd walk out of the room or shut off the recording. Just done with it. Non-standard syntax is an obstacle. The caveat here is that the tidyverse has helped this pretty dramatically in the last few years, but not everything is tidyverse syntax friendly, so students still have to have some understanding of that base syntax as well. Non-standard output: output from every package looks a little bit different. Non-standard conventions.
Here are just a few examples where capitalization differs between similar functions or the punctuation changes. You add in packages, version control, function conflicts, and then all of a sudden these are the students. So we need to be purposeful about how we're bringing R into the classroom. This backward design model can really help us avoid some of the pitfalls of teaching and learning in R, and to illustrate that I want to show you a little bit about how we incorporated this when we designed the data science program at my institution, Creighton University. So what you're seeing here is a diagram of the data science program, and our data science sequence is really split up into four different tracks. Our students complete a computer science sequence, specifically introduction to programming, object-oriented programming, and data structures. All students complete calc 1 and calc 2 along with applied linear algebra and diff eq. All students complete intro stats and stat modeling, so a two-semester statistics sequence, and then we have five dedicated data science courses. We have an introduction to computing and scientific thinking for data science, intro to data science itself, machine learning, data visualization, and then a capstone. These shaded courses here are electives; students take either two computing electives or two math stats electives. Now from the statistics side of things, these six courses incorporate R. So we have R in our intro stats course, our stat modeling course, math stats one and two, intro to data science, and machine learning. So we have six different courses that are all using R in some way, and eventually a seventh with applied linear algebra. Here's what really makes this challenging. There are multiple entry points into the statistics or data science curriculum. There are actually three places where a student could come into the program. Students could come in excited from their intro stats course. We have a basic intro stats.
We also have a health statistics course at Creighton. So they could come in through one of those intro pathways. They could also first be exposed to R in introduction to data science, or, and this happens with many of our math majors, their first exposure to R might be in mathematical statistics or probability theory. Not only that, but we have different pathways through the program. So thinking of our math majors again, they might just take math stats one and two and never see any of these applied courses. Students might take intro stats and then go to stat modeling. They might take intro stats and then go to machine learning if they've also taken data science. So there are lots of different student experiences in this curriculum. These students who are coming in at these multiple entry points need to be well prepared to succeed later on. And they also need to not be bored. So that means that these three different entry points need to differ not just in the content, but in the computing that's in the courses as well. So here I've listed some learning objectives that we have for our students at Creighton from a programming standpoint. We want our students to be able to make data visualizations using the tidyverse. We want our students to be able to do data manipulation, also using the tidyverse, but we also want them to be familiar with the base packages. So base R data manipulation, the bracket syntax, lattice plots, and so on. We want them to have an understanding of both. We want our students to be able to do traditional inference and simulation-based inference. We want our students to be able to calculate probabilities and simulate data. We want them to be able to do some modeling, both basic regression modeling and advanced modeling. We want our students to be familiar with machine learning. We want our students to know about deep learning, and we want them to be able to do some matrix calculations.
This is a long list of learning objectives. And if you broke this learning objective list out into packages, the list would get even longer. So using those backward design principles, we've started with this list of topics, specifically this list of computing topics. Once we have this list, we can figure out where to put them throughout the curriculum so that the sequencing is logical and there's not a lot of overlap in those three entry points. So for example, let me highlight the difference here between the intro stats courses and the data science courses. There is some overlap here, and the overlap is in visualization. Students taking that introductory course do their visualization primarily in ggplot2. They do a little bit of manipulation as needed in that introductory statistics course, but the basic emphasis in that class really is inference and modeling. We use a lot of the mosaic package to help us out with that because it provides standard and simple syntax for a new R user. There is a little bit of overlap here with visualization. We're definitely doing that ggplot stuff in intro to data science, but it's not enough overlap that the student would be bored. Moreover, we're doing a lot more customization in intro to data science than we are in introductory statistics. Both of these classes are a prerequisite for taking machine learning. So at this point, students have that tidy syntax. They also have a strong understanding of inference. They can jump into machine learning packages like caret, mlr3, soon to be tidymodels, and have a basic idea of all of the R syntax that they need to accomplish things. So the programming isn't the obstacle to learning. For students that are taking stat modeling, the only prerequisite here is that visualization and mosaic from the intro stats course. So these students are well prepared to work with that modeling syntax in R, y tilde x.
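To make the "y tilde x" modeling syntax mentioned above concrete, here is a minimal sketch in base R. It is not taken from the talk's slides: the built-in mtcars data set and the names fit and group_means are assumptions chosen purely for illustration, and the talk's own courses use the mosaic package rather than the base functions shown here.

```r
# Illustrative sketch only: data set and object names are assumptions,
# not material from the talk.
data(mtcars)

# Formula interface: response ~ predictor, read as "y tilde x".
# Here miles per gallon is modeled as a function of car weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# The same formula pattern carries over to other functions,
# for example computing the mean mpg within each cylinder group.
group_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(group_means)
```

The point of the sketch is that a student who has learned the response ~ predictor pattern once can reuse it across summaries and models, which is one reason a consistent formula interface may help a new R user.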
On the math stats side of things, in probability theory and math stats one, we focus on sort of the base syntax. So base probability distributions, things like rnorm, pnorm, qnorm. Base plotting plays well with that syntax as well. So that's where we've chosen to incorporate some of those base R structures. We also revisit ggplot a little bit in math stats two, because that's the point where students are no longer satisfied with base R programming. So they get a little bit more visualization at this point. They get some base modeling to complement the theory of inference, the math part of the course. They also get some matrix manipulation content here as well. So that's how we've structured our courses using this backward design thought process. If you're thinking about doing this in your own programs, your own courses, your own workshops, what do you need to think about? The first thing you should think about is what your student looks like. So try to visualize your student in terms of a learner profile. What does a student look like at each entry point? The student who starts with intro to data science probably looks pretty different from the student who starts with math stats. And we need to be aware of and prepared for those differences. What do they need to know? And also, what don't they need to know? What can you maybe leave out for a later course or experience? Certainly think about the scale. If you have a single workshop, your learning objectives are going to be a lot more precise. You're probably going to cover much less content and far fewer packages than you would through a semester course or even a course sequence. What is beyond your control? So what are your students going to learn from other sources? For us, since our students that are majoring in data science are taking a lot of computer science courses, we leave some of the more technical details of programming to our colleagues in computer science.
If that's not what your students look like, you may need to cover more traditional programming concepts, like loops and if-else statements, in your courses with R. You also might not want to count on that. Certainly there's variability from semester to semester and workshop to workshop. So how much do you want to take for granted that a student knows? The next thing to think about is support. So how will you help your learners catch up? How will you challenge the advanced students? What wiggle room is there in this new model or curriculum once you've built it? So finally, when you are designing your next course, your next workshop, even if you are designing a full curriculum like we did, always remember that some things are funnier backward, and some things are better taught or better designed backward. And I will leave you with probably the best GIF on the internet. Thanks for joining me.