The final element of data science and communicating that I wanted to talk about is reproducible research. You can think of it as this idea: you want to be able to play that song again. The reason is that data science projects are rarely one and done; rather, they tend to be incremental, they tend to be cumulative, and they tend to adapt to the circumstances they're working in. So one of the important things here, and probably the way to summarize it very briefly, is this: show your work. There are a few reasons for this. You may have to revise your own analyses at a later date. You may be doing another project and want to borrow something from previous studies. More likely, you'll have to hand it off to somebody else at a future point, and they're going to have to be able to understand what you did. And then there's the very significant issue, in both scientific and economic research, of accountability: you have to be able to show that you did things in a responsible way and that your conclusions are justified. That's for clients, funding agencies, regulators, academic reviewers, any number of people. Now, you may be familiar with the concept of open data, but you may be less familiar with the concept of open data science, and that's more than open data. So for instance, I'll just let you know that there is something called the Open Data Science Conference, at odsc.com, and it meets three times a year in different places. It is entirely devoted, of course, to open data science: using open data, but also making the methods transparent to the people around them. One thing that can make this really simple is something called the Open Science Framework, which is at osf.io. It's a way of sharing your data and your research with other people, along with an annotation of how you got through the whole thing. It makes the research transparent, which is what we need.
One of my professional organizations, the Association for Psychological Science, has a major initiative on this called Open Practices, where they are strongly encouraging people to share their data as much as is ethically permissible, and to absolutely share their methods before they even conduct a study, as a way of getting rigorous intellectual honesty and accountability. Now, another step in all of this is to archive your data: make that information available, put it on the shelf. What you want to do here is archive all of your data sets, both the totally raw data, before you did anything with it, and every step in the process up to your final clean data set. Along with that, you want to archive all of the code that you used to process and analyze the data. If you use a programming language like R or Python, that's really simple. If you use a program like SPSS, you need to save the syntax files, and then it can be done that way. And again, no matter what, make sure to comment liberally and explain yourself. Now, part of that is you need to explain your process, because you're not just this lone person sitting on the sofa working by yourself; you're with other people. You need to explain why you did it the way that you did. You need to explain the choices, the consequences of those choices, and the times that you had to backtrack and try it over again. This all also works into the principle of future-proofing your work. You want to do a few things here. Number one, the data: you want to store the data in non-proprietary formats, like a CSV, or comma-separated values, file, because anything can read CSV files. If you stored it in the proprietary SPSS .sav format, you might be in a lot of trouble when somebody tries to use it later and they can't open it. Also, there's storage: you want to place all of your files in a secure, accessible location. GitHub is probably one of the best choices.
And then the code: you may want to use a dependency management package, like packrat for R or virtualenv for Python, as a way of making sure that the packages you use are always in versions that work, because sometimes things get updated and something breaks. This is a way of making sure that the system that you have will always work. Overall, you can think of it this way: you want to explain yourself, and a neat way to do that is to put your narrative in a notebook. Now, you can have a physical lab book, but you can also do digital notebooks. A really common one, especially if you're using Python, is Jupyter, with a Y there in the middle. Jupyter notebooks are interactive notebooks. So here's a screenshot of a very simple one I made in Python. You have titles, you have text, you have the graphics. If you're working in R, you can do this with something called R Markdown, which works in the same way. You do it in RStudio, you use Markdown, and you can annotate the whole thing. You can get more information about that at rmarkdown.rstudio.com. And so for instance, here's an R analysis I did. You see the code on the left and you see the Markdown version on the right. What's neat about this is that this little bit of code here, this title and this text and this little bit of R code, is then displayed as this formatted heading and this formatted text, and this turns into the entire R output right there. It's a great way to do things. And then if you do R Markdown, you actually have the option of uploading the document to something called RPubs. That's an online document that can be made accessible to anybody. Here's the same document. And if you want to go see it, you can go to this address. It's kind of long, so I'm going to let you write that one down yourself.
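To go back to the point about package versions for a moment, here is a minimal sketch in Python of recording which versions an analysis actually ran with. It is only an illustration, not a replacement for packrat or virtualenv. The function names `installed_versions` and `format_requirements` and the package list in the demo are hypothetical; the only library call assumed is `importlib.metadata.version` from the Python standard library (3.8 or later).

```python
from importlib import metadata


def installed_versions(package_names):
    """Return {name: version} for each named package that is installed."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; leave it out of the snapshot.
            continue
    return versions


def format_requirements(versions):
    """Render versions as sorted 'name==version' lines, one per package."""
    return "\n".join(f"{name}=={ver}" for name, ver in sorted(versions.items()))


if __name__ == "__main__":
    # Hypothetical package list; substitute whatever your analysis imports.
    snapshot = installed_versions(["numpy", "pandas", "matplotlib"])
    print(format_requirements(snapshot))
```

Saving that output next to the analysis code gives whoever picks the project up later a record of the versions that worked, which they can then install into a fresh environment of their own.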
But in sum, here's what we have. You want to do your work and archive the information in a way that supports collaboration. Explain your choices: say what you did, show how you did it. This allows you to future-proof your work so it will work in other situations and for other people. And as much as possible, no matter how you do it, make sure to share your narrative so people understand your process and can see that your conclusions are justifiable, strong, and reliable.
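The archiving advice above, keeping both the raw and the clean data in a non-proprietary format, can be sketched in Python along these lines. This is one possible approach under stated assumptions, not a prescribed workflow: the helper names `write_csv`, `sha256_of`, and `build_manifest` are hypothetical, and the checksum manifest is an extra I am adding so that a later reader can confirm the archived files have not changed.

```python
import csv
import hashlib
from pathlib import Path


def write_csv(path, header, rows):
    """Write rows to a plain CSV file, a format almost any tool can read."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(folder):
    """Map each CSV file name in folder to its checksum, sorted by name."""
    return {p.name: sha256_of(p) for p in sorted(Path(folder).glob("*.csv"))}
```

In use, you might call `write_csv` once for the untouched raw data and once for each cleaning step, all into the same folder, and then store the result of `build_manifest` alongside them with a comment describing what each step did.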