I'm Daniel Jacobs, and I'm gonna be talking about learning R through spaced repetition and the remember package. First, I wanted to introduce myself. I have over 10 years of experience writing code for things like video games and data science, and for the last several years I've been working in ed tech, educational technology, where I've been doing a lot of statistics work. Currently, my title is Director of Data Science Engineering, and today I'm gonna be telling you about the remember package, which you can see for yourself at www.dontyouremember.com.
So the first thing I wanted to talk about is why the remember package. What problem does it try to solve? It's this one. There are a lot of R packages out in the world, and it's hard for me to remember them all, and it's hard to know which ones to learn next, because there are just so many of them out there. There are things with names like stargazer and testthat, and I remember learning about them all at a conference I went to last year. It was so much that I didn't know what was important, and I remember coming home from this conference very excited about all the new things I could learn, but not even remembering what they all were.
So here's another story about that. A couple of years ago, I went to a talk about a package called purrr, which I'm sure a lot of you are familiar with, and which gives you a way to iterate over a list. The old way, before I'd learned about purrr, was that I would write a for loop: for cat in cats, print cat. The new way was to use the purrr walk method to iterate over the list and do the printing. And I remember going to this talk about purrr and hearing about what it was and thinking, that looks better, that's just better than what I'm doing right now. I wanna learn how to do it, and I wanna incorporate it into my regular programming practice.
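For readers following along without the slides, the two styles described here can be sketched side by side. This is a minimal illustration, not the slide itself: the contents of `cats` are made up, and it assumes the purrr package is installed.

```r
library(purrr)

# Made-up example data
cats <- c("Tom", "Felix", "Garfield")

# The old way: an explicit for loop that prints each element
for (cat in cats) {
  print(cat)
}

# The new way: walk() calls print() on each element for its side effect
# and returns its input invisibly
walk(cats, print)
```

Both versions print the same three lines; the difference is only in how the iteration is expressed.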
And I'm sure you've all been to other talks where you felt that same way, where there was a new package or method and you saw it and you thought, that's really a thing I should be doing. But it's hard to change your programming practice. So I went to this talk, I went home, I went back to my normal programming. I used purrr for a day or so, but I quickly forgot about it and reverted to the method I was using before, which was for loops. And I think part of the reason was that it was just easier for me because I knew it. It might not have been better, but it was what I was used to. It was hard for me to change what I was familiar with, and it was hard for me to learn the new thing well enough that I didn't have to look it up and refer to documentation in order to use it. And for me to really get something into my programming practice, I need to know it well enough that I don't always need to look it up.
So around the same time that I was struggling to learn purrr, I was reading this book, because I work in ed tech, called Make It Stick. The book draws on recent discoveries in cognitive psychology and other disciplines to offer concrete techniques for becoming a more productive learner. The sorts of things it talks about are active recall, so making an effort to really think about the stuff you've learned, and different strategies for studying and doing better on tests that are based in science and research. Another thing it talks about is spaced repetition and the different ways to recall information. So one of the things it talks about is the spaced repetition algorithm for learning flashcards. And the algorithm works like this: you create your flashcards, you review your flashcards every day, and every time you get a flashcard wrong, you put it in a section for cards that you need to review more frequently.
When you get it right, you put it in a section for cards that you need to review less frequently. And if you do this enough, eventually the cards that remain are the ones that you really need to review. If you review them enough, you start to learn them, and over time you have fewer and fewer cards you need to review, until you have learned everything. This is an effective strategy that a lot of people use for learning vocabularies. This is a vocabulary, a series of flashcards you might use if you're trying to learn basic English, or just the very basics of it. There's a vocabulary here.
And in a way, learning R or learning a programming language, it's still a language. Obviously there's more to it than just vocabulary, but R as a language has a vocabulary. There are all these terms you need to use. You can look them up, you can always go to Stack Overflow and find them, but the more you have memorized and internalized, the faster and more fluently you can code, and then you're only going on Stack Overflow when you really need to go there, when there's something new. So I think part of what you can do when you learn a programming language is really try to just learn some of its vocabulary. And the strategies that work for learning the vocabulary of a language can really be the same ones that help you learn the vocabulary of a programming language.
That's what the remember package does. It has a method to do flashcards according to the spaced repetition algorithm. So here's what that looks like. You can call the remember flashcards method. It says: get ready to start your flashcards. Look for help in the browser window. You currently have at least 25 functions you need to review. Don't worry, we'll do them 10 at a time. Press any key to start. And then it'll start with your personalized flashcards. So it'll say something like, do you feel comfortable with detach from the base package? And you can say, yes, I feel comfortable, or no, I don't.
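The bucket-based review described in this part of the talk can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the deck contents, the names `review_card` and `cards_due`, the rule that a wrong answer sends a card back to the first bucket, and the doubling review schedule are all hypothetical choices.

```r
# Hypothetical deck: each card is a function name plus a bucket number.
# Bucket 1 is reviewed most often; higher buckets are reviewed less often.
deck <- data.frame(
  card   = c("base::detach", "purrr::walk", "base::gc"),
  bucket = c(1L, 1L, 1L),
  stringsAsFactors = FALSE
)

max_bucket <- 3L

# Move a card after a review: a correct answer promotes it to a less frequent
# bucket, a wrong answer sends it back to the most frequent bucket.
review_card <- function(deck, card, correct) {
  i <- match(card, deck$card)
  if (is.na(i)) stop("Unknown card: ", card)
  deck$bucket[i] <- if (correct) min(deck$bucket[i] + 1L, max_bucket) else 1L
  deck
}

# Cards due on a given day: bucket b comes up every 2^(b - 1) days
# (an illustrative schedule, not one taken from the talk).
cards_due <- function(deck, day) {
  interval <- 2^(deck$bucket - 1L)
  deck$card[day %% interval == 0]
}

deck <- review_card(deck, "purrr::walk", correct = TRUE)
deck <- review_card(deck, "base::detach", correct = FALSE)

# On day 1 only the bucket-1 cards are due; purrr::walk waits until day 2
cards_due(deck, day = 1)
```

Answering "yes, I feel comfortable" corresponds to `correct = TRUE`, so well-known functions drift into buckets that come up less often.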
And it'll move it into the appropriate bucket of flashcards for you. It'll also give you a little clue, a little reminder about what the function does. So it'll show you the R help. That way you get a little reminder that activates your memory and helps you recall it later. And then it just goes on to the next question. So it'll say, do you feel comfortable with walk from the purrr package? At this point, I do now feel comfortable with walk from the purrr package, so I would say yes here. And it just iterates like this. It goes through your flashcards. And over time, if you do this enough, you start to read the help regularly, go online, read documentation and references, and really internalize some of these methods that you're trying to learn. So as you go to all these talks and conferences today, you can start to think about what things you really wanna commit to studying and learning.
So I talked about remember solving two problems. One of them is that there are a lot of R packages and it's hard to remember what they all are, or hard to really remember different parts of them. The other problem is that it's not clear which ones to learn, because there are so many packages out there in the world. And not everyone needs to learn the same packages. There are some packages that are difficult or advanced, and maybe you really only need to know them if you've been working in R for several years or if you're doing something very specific. There are also a lot of packages that are domain specific, in fields like astronomy or biology. If you're an astronomer or biologist, you need to learn those packages and you need to learn what other biologists are using. But if you're not in that field, you really don't need to know them. So it's really relevant to know, what are other people around you learning?
And if you're at a conference like this one, even if it's virtual, you can still talk to other people and get a sense of what people are learning. But if you're out on your own in the world and aren't connected to the community in the same way, it may not be obvious which R packages to learn and which R packages you specifically would benefit from learning. So that's the second part of what remember tries to address. This package uses data to help make recommendations for you about what packages you can learn. The kinds of data sources it uses are the code that you specifically are writing, the books and talks that you're reading and listening to, and the wealth of other people's code that's available in places like GitHub. We can combine those together to compare what other people in the world are doing, or what people similar to you are doing, with what you're doing, and get a sense of what's different. And maybe in that difference, we can find some important methods that you would really benefit from, if only you knew they existed.
A lot of this is built off of a special capability that we have in R, which is that it's really easy to walk the abstract syntax tree, the data structure that underlies the code and makes it easy for the computer itself to parse and understand. So this is an image of the abstract syntax tree from the Advanced R book by Hadley Wickham. We're gonna take advantage of how easy it is to parse in order to figure out what methods are being called, and then use that to turn it into flashcards and recommendations. So that's what we do. Here you can see my code example from before, where I called the walk method. And now you can see the pseudocode here, which is essentially what remember is doing under the hood, where you can create flashcards based on a line of code, several lines of code, or an entire file of code.
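To make the idea of walking the abstract syntax tree concrete, here is a minimal base R sketch that parses some code and collects the functions it calls. It is an assumption about how such an analysis could be written, not the package's actual code: `find_calls` is a hypothetical helper, and the sample code is the for loop and `walk` example from earlier in the talk.

```r
# Recursively walk an expression and collect every function that is called.
find_calls <- function(expr) {
  if (!is.call(expr)) {
    return(character(0))
  }
  fn <- expr[[1]]
  if (is.name(fn)) {
    # Plain call such as print(x)
    name <- as.character(fn)
  } else if (is.call(fn) && identical(fn[[1]], as.name("::"))) {
    # Namespaced call such as purrr::walk(x, f)
    name <- paste0(as.character(fn[[2]]), "::", as.character(fn[[3]]))
  } else {
    # Anything else, for example f(x)(y), is skipped in this sketch
    name <- character(0)
  }
  # Recurse into the arguments, which may contain further calls
  args <- as.list(expr)[-1]
  c(name, unlist(lapply(args, find_calls), use.names = FALSE))
}

code <- c(
  "for (cat in cats) { print(cat) }",
  "purrr::walk(cats, print)"
)

exprs <- parse(text = code)                 # one expression per statement
calls <- unlist(lapply(exprs, find_calls))  # all calls, in order of appearance
table(calls)                                # number of uses per function
```

Note that in R control flow such as `for` and `{` shows up as calls too, so a real tool would probably filter those out. This sketch also only sees `print` where it is called directly, not where it is passed as an argument to `walk`.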
So here you can see the results of creating flashcards, and we can find the methods that get called. We see walk from the purrr package and print from the base package. And we can count things like number of uses and other statistics to get a sense of not only what you are using, but how frequently you're using it. And then we can start to build up a data set like that to really do some analysis, to make personalized flashcards and to make recommendations.
Another thing we can do is the same trick of analyzing the code and creating flashcards from it, using R Markdown files or R files that we might find online. So we can, for example, take chapter two of Advanced R by Hadley Wickham, which is about names and values, and turn it into flashcards. Here we create the card deck based on the markdown file that underlies that chapter, and you can see all the methods that Hadley called in that chapter. So I can now add those to my flashcards if they're new to me and I want to review them. And when I add them, it will be smart enough to know, oh, I use library a lot. I use numeric a lot. I'm already an expert in those, so we don't need to see those as much. But the method called gc, I've really never used that. So that is a method that I would really like to be high up in my review queue, so that I can use it regularly and get used to understanding what it is.
The other thing we can do is personalization. So not only do we want to review a chapter, but we might want to look at my code to see how I as a programmer work. So you can call init remember to have remember track what you do as you interact with the REPL. Every time you use a new method, it'll make a little marker to note that you yourself have used that method at that time. And then you can build up statistics about how you are using R. You can use these to create flashcards. And if you want, you can also use them for your own data analysis.
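One way this kind of REPL tracking could be done in base R is with a task callback, which R runs after each top-level command in an interactive session. The sketch below is an assumption about a possible mechanism, not a description of how the package does it: `tracker`, `called_functions`, and `record_usage` are hypothetical names.

```r
# Hypothetical in-memory log of which functions were called and when
tracker <- new.env()
tracker$fn <- character(0)
tracker$when <- Sys.time()[0]

# Collect the names of all functions called in an expression (simplified)
called_functions <- function(expr) {
  if (!is.call(expr)) return(character(0))
  fn <- expr[[1]]
  name <- if (is.name(fn)) as.character(fn) else character(0)
  c(name, unlist(lapply(as.list(expr)[-1], called_functions), use.names = FALSE))
}

# Task callbacks receive (expr, value, ok, visible); returning TRUE keeps
# the callback registered for the next top-level command.
record_usage <- function(expr, value, ok, visible) {
  fns <- called_functions(expr)
  tracker$fn <- c(tracker$fn, fns)
  tracker$when <- c(tracker$when, rep(Sys.time(), length(fns)))
  TRUE
}

# In an interactive session, run the callback after every command
if (interactive()) {
  addTaskCallback(record_usage, name = "usage-tracker")
}

# The callback can also be exercised directly with quoted commands
record_usage(quote(print(sum(1, 2))), NULL, TRUE, TRUE)
record_usage(quote(sum(4, 5)), NULL, TRUE, TRUE)

# Number of uses and time of first use for each function
uses <- table(tracker$fn)
first_use <- tracker$when[match(names(uses), tracker$fn)]
data.frame(fn = names(uses), uses = as.integer(uses), first_use = first_use)
```

Because each record carries a timestamp, the same log supports both "how often do I use this" and "when did I first use this" questions.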
So we can track things like the first time you used a method or how many times you've used it. And that's possible because we're in the REPL, so we know exactly when you called the code. And we can see which methods I use a lot. Here we're just looking at ggplot2. I use theme a lot. I use geom_text a lot. Some of these other methods I've used very rarely. And one issue we have is that not every method is equally important. How do you know what's important?
This is where recommendation comes in. I'm just gonna give an example of that relating to visualizations, because it's the thing I do a lot. So TidyTuesday, if you're not familiar with it, is a weekly challenge for visualization. What was nice about this for building a recommender is that there are 400 entries already easy to find on GitHub, so there's a lot of code. I downloaded all that code, ran it through the code parser, built up the statistics, and then I was able to compare my code to what was available from the TidyTuesday entrants. So here you can see the top 10 packages for TidyTuesdayers. Notice that if we wanted to, we could have gone deeper and looked at the top methods within ggplot2, but here we're just keeping it simple and looking at the top packages. So these are the top 10.
And then I was able to compare my personal REPL statistics with the TidyTuesday statistics. And the question was, well, what should I learn? One package that jumped out was a thing I had never used before, which was called stringr. I'm sure a lot of you listening have used stringr before, and I'm sure some of you haven't. But after I realized a lot of other people were using it and I wasn't, I incorporated it into my practice. I read up about it, did some research, and now I use stringr all the time. I really like it, and it's made me a better programmer.
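The comparison described here, my own usage against the community's, can be sketched as a simple set difference ranked by popularity. The counts below are made up for illustration and are not the TidyTuesday statistics from the talk; the variable names are hypothetical as well.

```r
# Made-up usage counts, for illustration only (not the data from the talk)
community <- c(ggplot2 = 380, dplyr = 350, stringr = 120, glue = 60, here = 55)
mine      <- c(ggplot2 = 40, dplyr = 25)

# Share of all community usage that each package accounts for
community_share <- community / sum(community)

# Packages the community uses that never appear in my own code
unused <- setdiff(names(community), names(mine))

# Recommend the unused packages, most widely used first
recommendations <- sort(community_share[unused], decreasing = TRUE)
names(recommendations)
```

A fuller recommender could also weight packages that I use much less than the community does, rather than only the ones I have never used.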
So this is an example of a recommendation that I was able to get by comparing my code statistics with the community's code statistics. Now, to leave you with something to go home with that you might benefit from, here are the top 40 packages for TidyTuesday. You can just read through this list, and if you see something here that you haven't used, keep in mind that a bunch of other people who do TidyTuesday do use this package. So you might want to try out whichever package you see here that you're not using, see what it is, and see if it would help you as a programmer. For me, I discovered the packages glue and here like this. Those are very useful, and now I use them all the time because I know what they are.
So that's the remember package. It was designed to really solve two problems. One was to help you remember things, specifically what the different methods that you learn are. The other one, which is still in development, is to help you discover what to learn next and what packages are important and relevant for you. It's currently available on GitHub, where you can try it out. You can also visit it on the web at www.dontyouremember.com. You can follow me on Twitter, where I tweet about it occasionally. That's DJCub7. And a couple of books that were referenced in this talk, which I think are very good books, are Advanced R by Hadley Wickham and Make It Stick: The Science of Successful Learning by Brown, Roediger, and McDaniel. That's all. So thank you very much.