So good afternoon everyone. It's great to see you all here. And it is my pleasure, on behalf of many people who you will meet during the course of this symposium, to welcome you here for this party slash symposium to launch our book, The Practice of Reproducible Research. For those of you who don't know me, my name is Justin Kitzes. I'm a postdoctoral scholar in the Energy and Resources Group here and also a data science fellow at BIDS, the Berkeley Institute for Data Science. Right off the bat, I want to introduce two other people, Fatma Deniz here in the front and Daniel Turek, who are the other two editors of the book. Fatma is a postdoc in neuroscience and also a data science fellow at BIDS. Daniel is a former data science fellow at BIDS and now an assistant professor of statistics at Williams College. As I said, the three of us were the editors, or compilers, of the book, but we were also fortunate to have many other folks as part of the core team who helped to write these chapters and put the book together, and you'll meet many of them as we go through today. So now, right off the bat, in the interest of not burying the lede, as they say, but at the risk of sending you all to your phones and laptops, I'm going to put this URL up right now. This is the book itself, the open online version of the book. You are welcome to read it, of course, and share it. Please share it widely if you would like. For those of you who prefer print versions of books, print publishing runs just a tiny bit slower than the internet does, so print versions of the book will be out from UC Press this fall. If you are interested in the print version, just keep an eye on the BIDS website; it'll announce when the print versions are available for purchase. Now, for those of you who would prefer to hear about the book rather than just reading it, which I hope is all of you, that's why we're here today. So I'm going to start off by just telling you a little bit about the project that led to the book and, of course, the book itself, as a way of introducing you to what you'll find in it. And after that, I'm going to turn it over to several of the other folks who've been deeply involved to go into more detail about what you'll find here, what we learned from the process of this project, and what you might learn from reading the book as well. So to start off at the very beginning here (this is the book, you'll see it shortly), I want to say that this book owes its genesis, really in nearly every respect, to these funders: the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation. About three or four years ago, these foundations got together to make a serious financial and intellectual commitment to supporting data science research and activities, particularly at UC Berkeley, the University of Washington, and NYU. And here at Berkeley, the funds from this grant were used, among other things, to launch the Berkeley Institute for Data Science, which, as most of you know, is over in 190 Doe Library. Now, one of the first things that BIDS did was to bring in a crop of fellows. These were students, staff, and postdocs already here on campus, brought in for joint appointments with the data science institute. This first group of fellows actually included all three of us who ended up editing this book, in addition to many of the other core contributors. And it was through BIDS that we met, got together, and sort of hatched the idea for putting this project together.
So, of course, we're in the Bay Area, and as is traditional for projects that launch in the Bay Area, we always have to be the X of Y, right? You have to be like the Uber of food, or like the LinkedIn of Kazakhstan, or something like that. So in that spirit, we actually, towards the beginning, were thinking about this book called Beautiful Code, which some of you may know. That book contains a relatively large number of essays written by software designers who each describe a particular example of code that they wrote in detail and talk about why, in that case, they found it to be beautiful or particularly interesting. In other words, that book was about teaching how to write good code by example, in the same way that architects learn how to do architecture by examining buildings that have already been built. And so, in short, we started off thinking about this project as producing a collection which you could call something like Beautiful Code for reproducibility. That is, we wanted to collect a similar set of real-world, very concrete examples of scientists attempting to make their work more reproducible. Now, of course, before I say any more about reproducibility, I should say something about what that word means. There are several different definitions that people use of the word reproducibility, but for the purposes of the book, this was the operational definition we were working with most of the time, which we would argue is sort of the basic, foundational type of reproducibility to deal with, and which we referred to as computational reproducibility. The full definition is up here, but in short, it's the idea that you did a project, ran a study, and produced some result: a figure, a table, a number. A project is computationally reproducible if you could give something, a set of instructions, a set of files, a set of data, to someone else, and they could reproduce, recreate, or get back the same answer that you got. So this is very basic and very foundational, of course. And there are other ways of thinking about reproducibility, which are actually covered in great detail in chapter two of the book, Assessing Reproducibility, so you may be interested in that as well. So if we wanted to be the Beautiful Code of reproducibility, how would we go about that? Well, true to the inspiration, we decided to gather case studies. These became contributed case study chapters in the book itself, from practicing academic researchers, specifically in the data-intensive sciences. And given where we are right now and the relationships between the data science institutes, most of these case studies ended up coming from Berkeley, NYU, and the University of Washington, from those initiatives. To keep the case studies somewhat consistent from author to author, we actually gave authors a template to follow. I won't say too much about that other than to say that it helps make them more readable and more comparable to each other; you can flip through the same structure from chapter to chapter and see how different authors responded to the same types of questions. And Daniel's going to come up right after this and talk a little more about that. And after Daniel, in fact, we'll hear some examples of some of the case studies from the book.
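To make that operational definition of computational reproducibility concrete, here is a toy, entirely hypothetical sketch: a single script plus a shipped data file, where rerunning one command regenerates the same result every time.

```python
# analyze.py -- toy illustration of computational reproducibility (all file
# names hypothetical). Anyone given data.csv and this script can rerun
# `python analyze.py` and recover exactly the same result.txt.
import csv

with open("data.csv") as f:  # the data shipped with the project
    values = [float(row["measurement"]) for row in csv.DictReader(f)]

result = sum(values) / len(values)  # the "result" a reader should get back

with open("result.txt", "w") as out:
    out.write(f"mean measurement: {result:.6f}\n")
```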
Now, I'll note at this point, when thinking about case studies, and other speakers will stress this as well, that these case studies aren't always success stories per se. These aren't just stories about how great everything went. Just as important as the successes, these chapters describe the different challenges, partial successes, partial failures, and difficulties that the authors encountered in trying to work towards the goal of making their research reproducible. Those are aspects that you'll find in all of these case study chapters as well. Now, finally, for those of you whose eyes are glazing over a bit at the thought of reading through all 31 of our brilliant, but not entirely short, case studies, we did have a core group of authors who took it upon themselves to put together a synthesis and summary of the chapters, which you'll find in part one of the book. These are the sections of that part. Following an introduction, you'll find the chapter that I mentioned just a few minutes ago, about assessing and defining reproducibility, followed by an example of a very simple, basic template for a reproducible workflow: a small-scale, single-investigator project with tabular data. You can see my name there as the author. This is a particularly good place to start, I would say, in my humble opinion, especially if you're a beginning investigator just wondering how to think about this workflow process in a way that lets you get started. Following that, we have a chapter that summarizes the basic characteristics across all the case studies, and then the lessons learned chapter, which tries to synthesize what we took away: not just what we found in the chapters, but what we took away as lessons for today and for the future. We close off the book with a chapter specifically about where we go from here, and then a glossary, which is actually much more than a glossary; it's an extended discussion and annotated list of tools, methods, and terms that you'll come across in thinking about reproducibility. So that's basically what the book looks like. Parts two and three, of course, have the actual case studies themselves. Now, as you might expect, the rest of what we'll talk about today follows pretty closely the structure of the book. You've just had the introduction from me. Coming up next is Daniel Turek, who's going to describe the case studies themselves in a little more detail. Following that, we have four brief lightning talks presenting the information found in some of the case studies. We have two authors from Berkeley and two from the University of Washington, representing the disciplines of economics, statistics, neuroscience, and archaeology, which gives you a sense of the breadth of the authors who contributed chapters to the book. Then we're going to have Katie Huff describing some of those take-home lessons learned, and Karthik Ram describing the way forward. At the end, we did leave a good amount of time to have a discussion, so we'll bring some chairs up here, we'll sit up here, and hopefully we can have a bit of a talk about what we covered and about reproducibility more generally.

As Justin mentioned, my name is Daniel Turek, and I'll be discussing the basic features of the case studies.
I'll cover both the process of collecting them, how we came to have this collection, and also summarize some of the main trends and statistics about the case studies themselves. So as Justin mentioned, this entire initiative originally came out of the reproducibility working group at BIDS. A number of us, the core contributors and authors, decided to undertake this project together, and that started at BIDS with us sitting down and discussing what a case study should actually look like. As Justin mentioned, we put together what we called the case study template to provide some standardization across all the case studies. It's pretty straightforward, as you can see here. It consists of a couple of major parts: an introduction with biographical information about the author, which also gives authors an opportunity to describe their discipline and area of research, to provide context and set the stage for their reproducible workflow. Then there are the core sections of the workflow, consisting of two parts: first the workflow diagram and second the narrative. The diagram is essentially a flow chart showing the end-to-end pipeline of the reproducible workflow, where boxes in the flow chart represent tools used, intermediate products, data sets, perhaps scripts, or the various steps of the process, and the arrows or connections between them show the flow of information and the process itself. This diagram is intended to go hand in hand with the workflow narrative, the other major section, which essentially gave the authors an opportunity to walk a reader through their workflow diagram, explaining things in more detail than is possible in the diagram itself. And, as we know, no workflow or research process is really without problems and hiccups, so in the narrative the authors are meant to explain the difficult points, where they had to backtrack, what the problems were, and also, for example, the time required for the different steps, things that are not shown in the diagram itself. These core sections were then complemented by several summary sections, listed here: the pain points, the key benefits, and the tools. Now, the information for those summary sections probably also appears in the narrative itself, but having it consolidated at the end gives the authors an opportunity to focus on those points and bring that information together. So readers, rather than parsing through the entire narrative, can quickly digest, say, what the main tools or the key benefits of a particular workflow were, having it all concisely at the end. And finally, there were several optional questions at the end of the template, general questions about reproducibility, optional for the authors, asking things such as why reproducibility is important, what major incentives they see for reproducible research, what tools they used, how they came to learn about reproducibility, et cetera. Depending on the answers, subsets of those questions were included in the final book as well.
So with this template in hand, we then set out to collect the case studies, and as Justin mentioned, we ended up with a collection largely contributed from the partner data science initiatives: here at Berkeley, BIDS, obviously; the New York University Center for Data Science; and the eScience Institute at the University of Washington. Largely from those three centers for data science. One major milestone in collecting case studies was actually a small reproducibility workshop hosted here at BIDS back in May of 2015, in which one of the sessions gave participants an opportunity to draft a case study with members of the reproducibility working group there to guide them, with peer review, editing in real time, et cetera. We got a large number of contributions from there, after which authors could take their case study home, continue revisions, and do a final submission to appear in the book. As we started to receive these case studies, we looked them over and noticed that essentially most of the flowcharts, the diagrams, could be broken down into three major stages: stage one, data input; stage two, data processing; and stage three, data analysis. Stage one is wherever the data came from, be it collected oneself, scraped from the web, or sometimes simulated, with the output being a raw data product. That goes into stage two, where the processing takes place, the cleaning of the data set, and then that product flows into stage three, the analysis of the data and the drawing of conclusions, and we included in that stage the output of the workflow, be it a manuscript or software, et cetera. Having identified these three major stages into which most of the flowcharts could be broken, this provided a means for what we called a natural classification of the case studies into one of two categories. High-level case studies we defined as describing all three of these stages: the input, the processing, and the analysis. Low-level case studies we defined as describing just a subset of these stages, for example, focusing only on stage three, data analysis. And as the names imply, the high-level case studies, having a much broader picture of the entire pipeline, were generally written at a higher level, providing less technical information, fewer nitty-gritty details of the implementation, whereas the low-level case studies, focusing on a narrower section of the pipeline, often provide much more technical detail. We thought this was a nice breakdown, and this high-level and low-level classification actually persists into the book: as you saw in the outline Justin gave, the case studies are separated into two sections, the high-level studies and the low-level studies. With this classification in hand, we can take a look at the breakdown of case studies across these various contexts. There are 21 high-level case studies, covering all three stages of the pipeline, and the remaining 10 are low-level case studies. You can see how this breaks down between the three stages on the right there: the 21 covering all three stages, and then, among the 10 low-level case studies, four focusing just on stage three, data analysis, two on data processing, and a small number of the others covering two of the three stages.
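To make the three-stage shape concrete, here is a minimal, hypothetical sketch of a workflow reduced to that pipeline (the file and column names are invented for illustration).

```python
# A minimal sketch of the three stages most workflow diagrams reduced to:
# data input -> data processing -> data analysis.
import csv

def acquire(path):
    # Stage 1: data input. Here a local file; in the case studies this could
    # be hand collection, web scraping, or simulation.
    with open(path) as f:
        return list(csv.DictReader(f))

def process(rows):
    # Stage 2: processing. Clean the raw data, dropping missing entries.
    return [float(r["value"]) for r in rows if r["value"] not in ("", "NA")]

def analyze(values):
    # Stage 3: analysis, producing the output (here a number; in the case
    # studies, often a figure or a manuscript input).
    return sum(values) / len(values)

if __name__ == "__main__":
    print(analyze(process(acquire("raw_data.csv"))))
```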
So it's interesting, actually, to see how people doing modern reproducible research envision their workflow and their pipeline, in terms of which of these stages it focuses on. Also interesting was to look at the disciplines represented by the case studies. This slide gives the primary discipline represented by each of the case studies, and there are two interesting observations here. One, as Justin alluded to, is the breadth of the disciplines represented: it's not a very narrow segment of academic research, or one primary field, but actually a wide breadth of academic research areas. The second thing you might note is that there is a slight tendency towards representation of the core sciences, or perhaps the STEM-related fields, with the largest representation in mathematics and statistics, then things such as neuroscience, computer science, environmental science, et cetera. So largely these sorts of sciences are represented, but a wide representation in any case. And finally, looking at all the case studies, we can also observe some very high-level trends, which give us some good insight into modern reproducible practices, the things people are doing these days. One of the most prominent trends among the case studies was the use of version control, and more specifically, currently, the use of Git and/or GitHub. In about 80% of the case studies, people were using Git or GitHub for their version control; about 10 or 15%, unfortunately, had no mention of version control whatsoever; and the remaining couple percent used something else for version control, Mercurial, for example, or SVN. So that's one of the strongest trends. Another one we noticed was the importance of open, publicly available data. 60% of the case studies either exclusively made use of open data, or, in the process of the workflow described, actually made their data publicly available. So this is obviously another trend we see here, and I think it's increasing with the emphasis on transparency and reproducibility these days in scientific research. We can also consider the primary languages being used. Obviously, all the case studies make use of many different tools and languages, but considering just the primary language used for computation or analysis, we saw over half of the case studies making use of Python. Very understandable: people really appreciate the IPython notebook, which makes it very popular, and also the readability and quick development cycle available with Python, so no surprise there. Following that was R, with just over a third of the case studies, and I myself subscribe to that, R being the primary environment for statisticians to use, with the large libraries of user-contributed packages making R very popular these days. And then the remaining 13% made use of a wide variety of other languages, not worth trying to list fully here, but among others, for example, Julia, C, C++, MATLAB, Java, and JavaScript, so a wide variety in that remaining 13%. And a final observation about the case studies is the output these reproducible workflows were producing. Largely, probably, because they're drawn from academia, almost all of the case studies resulted in a scientific publication or a manuscript for publication.
That was almost universal, but in addition to that, and perhaps more interesting, a large number of the case studies also produced software products intended for wider use, or created a workflow itself that other people were intended to use, for example, for data cleaning or for analysis, something like that. So those are the major outputs we observed. With that, I'll stop here. This gives a reasonably good idea of the case studies, where they came from, and some of the major trends, and we'll move on to several lightning talks about the case studies so you can see these in practice. Thank you very much.

Thank you so much. So, well, my name is Ariel Rokem. I'm a data scientist at the University of Washington eScience Institute, and I'll tell you today about this case study. I joined the University of Washington eScience Institute in March of 2015, pretty much just in time to join that May meeting in 2015. At the time, I had just finished a postdoc at Stanford University working in the lab of Brian Wandell, and we had just published a paper on a neuroscience topic, and that topic, the topic of our study, is the white matter. These are the brain's superhighways that connect different parts of the brain. Here is an exposed tract of white matter connecting the center of the brain, the thalamus, to the visual cortex in the back. The method that we used relies on the fact that when you look at these nerve fibers, water can diffuse along the length of the fibers much more readily than across their membranes, and we use that in MRI to measure the diffusion of water in vivo, in humans, inside of an MRI machine. These pictures here are slices through the brain; this is a horizontal slice through pretty much the middle of the brain, and as I'm rotating around this thing, the magnetic field gradient is rotating through the magnet and sensitizing the measurement to diffusion in different directions. That's called diffusion MRI, and we can use that to make inferences about these white matter connections and to build computational models, using a method called tractography, that connect the diffusion of water in different parts of the brain together into these wires that estimate the connections that exist inside of the white matter. Now, the first step in this process is to fit a model in each one of the voxels; these are sort of volumetric pixels in the MRI image. Here you see 150 measurements in 150 different directions of diffusion gradients, and the distance from the center of this cloud is how much diffusion there is in different directions. A typical thing to do is to estimate some kind of simple model of this diffusion. For example, this model, called the diffusion tensor model, estimates a three-dimensional Gaussian distribution of diffusion in different directions within the voxel; this is the rank-two tensor here in the middle. But a question arises, and one that isn't really asked a lot in this field, strangely enough, is which model should we use for a particular data set? Say I've made some measurement: which model should I use? There are many different models in the literature, and you might ask which model fits the data more accurately, and that's exactly what we did. In this paper that we published in PLOS in early 2015, we suggested that researchers can use cross-validation.
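For reference, the diffusion tensor model just mentioned has a compact standard form: the predicted signal for a unit gradient direction $\mathbf{g}$ and diffusion weighting $b$ is

$$S(\mathbf{g}, b) = S_0 \, e^{-b\,\mathbf{g}^{\mathsf{T}} \mathbf{D}\,\mathbf{g}},$$

where $S_0$ is the signal without diffusion weighting and $\mathbf{D}$ is the symmetric, positive-definite rank-two tensor describing the three-dimensional Gaussian diffusion profile in that voxel.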
You can make this measurement once, you can fit the model, and then you can try to predict another measurement using it, and you can compare that to test-retest reliability, which gives you a natural scale for assessing whether the model is good. Here we're simply comparing three different measurements with different diffusion weightings and two different models: the diffusion tensor model that I showed you, and an alternative that we proposed here called the sparse fascicle model, which is a variant on a model that's quite popular in the literature. Okay, so this served as the basis for the case study. Here is the big Rube Goldberg machine that produced it. You can see the paper down here in the corner; PLOS ONE was where the paper appeared, and we had actually deposited it in a preprint archive before that. Up here in the top left corner is the MRI scanner where the data came from, and the data goes through one process after another; there are computational processes that happen here, including some pretty big computers down at Stanford. An important part of this work was, as mentioned, Git and GitHub, and the development of a software library that I named Osmosis, because osmosis is not diffusion. It was written in Python and created the visualizations and simulations and the figures that went into the paper. Every figure had an IPython notebook that appeared in a documentation folder, inside a folder named Figures, and eventually that folder contained all of the notebooks. Now, one of the challenges is actually sustaining this kind of reproducibility, right? I was working by myself on this big library of stuff, and I created a lot of stuff that isn't useful to anyone, and I created some stuff that may be useful only to me. Distilling out and sustaining this reproducibility is really, really hard. One way to do that is to take whatever is developed by the single hero developer who tries to control everything they do, and try to incorporate that into a larger community context. So a lot of this work eventually went into a broader community-based development project called DIPY, Diffusion Imaging in Python. I tried to distill out whatever was useful from this and put it into that library, which we maintain as a collaborative community. This is from the documentation of that package: the cross-validation for model comparison that eventually went into the software package. And I think that's a good model for sustaining our reproducibility over time. Another kind of challenge that appears here is this HPC cluster. Not everyone has one in their lab or at their university. How do we make large and complex computational workflows available to other researchers? The solution to that has come in the form of developments I've done since. As mentioned in the introduction, we made a lot of the data available publicly, but there are actually projects that are creating these large brain observatories and making a lot of data publicly available for any researcher. For example, the Human Connectome Project, an NIH-funded project, is making available more than a thousand individual measurements of diffusion MRI, and anyone can download these data and analyze them. We were interested in doing cross-validation as a kind of model comparison on 900 different brains. Now, 900 different brains using k-fold cross-validation is a lot of computation.
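As a sketch of the cross-validation scheme being described (DIPY ships a real implementation; the model-fitting interface below is hypothetical), the idea is to fit on all but one fold of the measurements, predict the held-out fold, and score the predictions against the actual data:

```python
# Generic k-fold cross-validation for diffusion model comparison.
# `model.fit` and `fit.predict` are a hypothetical interface, not DIPY's API.
import numpy as np

def kfold_xval_r2(signal, directions, model, k=5, seed=0):
    """Hold out each fold in turn, fit on the rest, predict the held-out
    measurements, and return R^2 of predictions against the data."""
    rng = np.random.default_rng(seed)   # explicit seed, for reproducibility
    idx = rng.permutation(len(signal))
    pred = np.empty_like(signal)
    for held_out in np.array_split(idx, k):
        train = np.setdiff1d(idx, held_out)
        fit = model.fit(signal[train], directions[train])
        pred[held_out] = fit.predict(directions[held_out])
    ss_res = np.sum((signal - pred) ** 2)
    ss_tot = np.sum((signal - signal.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Model comparison: compute this score for, say, a tensor model and a sparse
# fascicle model, and judge both against test-retest reliability of the scan.
```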
And the way that we scale that up is using Apache Spark on Amazon Web Services. This also allows us to be reproducible in the sense that if you click on the link here, it will take you to the GitHub repository, which then allows you to enter your Amazon Web Services credentials and run this at the same scale. Okay, and with that, I will end. Thank you.

Yeah. Okay. So the title of the actual paper that underlies my workflow is Occupational Deaths and the Labor Supply: Evidence from the Wars in Iraq and Afghanistan. I'm Garret Christensen. I'm at BIDS and also the Berkeley Initiative for Transparency in the Social Sciences, so I try to think about this reproducibility stuff a lot. And the question for this research paper is whether this first slide, or the next one that I'll show you, is more common. This guy on the far right is Bill Krissoff, an orthopedic surgeon who was 61 years old when he enlisted in the Navy Medical Corps after his son Nathan, to his right (on your left), was killed serving in the Marine Corps in Iraq in 2006. Obviously, under normal circumstances people who are 61 are not allowed to enlist in the military, but this man deliberately signed up as a result of a death. Whereas you might think that other deaths, for example Pat Tillman's death by friendly fire, which was miscommunicated or obfuscated by the administration, perhaps led to fewer people signing up for the military as a result of a death. So my question is about the recruiting response to deaths in the military. As for how I organized this workflow: I had to get data from a lot of sources, including through Freedom of Information Act requests to the Office of the Secretary of Defense. That's for the recruiting data; the data on deaths was actually publicly available, as were a whole bunch of other demographic and geographic characteristics. So I got a bunch of data, put it all together, and did all my data cleaning. I'm an economist; most of us do all our work in Stata, and I wrote the paper in LaTeX. And of course, as you can see in there, there's lots of feedback: once you've got the data and start analyzing it, you say, oh, I should probably be controlling for this as well, go get the data, do it again, and then start writing the paper, and somebody says go get some more data, or that's the wrong type of analysis, do this instead. So there are lots of feedback loops. And just an hour ago, I finally got a revise-and-resubmit on this paper, and I started it six years ago. So, some of my issues and pain points: I am at the mercy of the Office of the Secretary of Defense. A friend of mine actually filed the very first FOIA request in something like 2004 or 2005; we got the first data in 2006. You submit a FOIA request, and if you're like me and don't have a lawyer, you get a response in three weeks saying, we're not going to comply with the statutory two-week timeline; instead, we're going to put you in line, and there are currently 1,800 people in line, good luck. And then a year and a half later, you get a CD in the mail. And then six months after that, your friends who still live at the old place where you used to live realize, oh, they should probably give you that CD. So two years later, you get some data. And yeah, so I use Stata, which is proprietary, but I wrote the paper in LaTeX. When I started, I wasn't great at LaTeX.
I actually did some of the table work, tuning the tables, in Excel, but in the latest version I cut that out. This is causal inference from observational data. I have longitudinal data, county-by-month data, so I use fixed effects, but there are parallel-trends assumptions built into that. Economists do this sort of stuff all the time, but I think there's no way on Earth that anyone starting from scratch with my data would come up with the exact same models. Even though I can reproduce my work, because I've had to do it for six years and keep going back to it, and I can actually get the same numbers again with one click, somebody else working independently, I don't think they would choose to do this exact same analysis. So I started in 2010, and I learned Git in 2015. Originally I was doing a lot of "save as" with dates, and I have had to go through those files a lot, opening 20 different versions to see on what date I made a change. It got the job done, but now it's all on GitHub. So my conclusions, as far as the paper goes: potential military recruits are deterred from enlistment by local deaths, on the order of 1%, local meaning the death of a soldier from your home county, but not by deaths from more distant locales. I can't verify the completeness of the raw data: I asked the Office of the Secretary of Defense for the entire universe of enlistees, but I'm not allowed to run the SQL query on the Defense Manpower Data Center server myself, so I don't know, and there's no way to verify that. I can largely reproduce my results, but as I said, I doubt anyone would independently choose the same models I came up with. And as far as the data: I got it through a FOIA request, but now the data is all up on Dataverse and the code's all on GitHub. And that's it. Thanks.

So I'm Kellie. I am a third-year PhD student in statistics and a BIDS fellow here. When this project started, I was a first-year, and I decided to write about the first project I really worked on as a grad student, when I was just getting my feet wet in the whole reproducible workflow. This was a collaboration with my advisor, Philip Stark, who wrote, I think, the preface to this book, and some researchers in public health. What we wanted to ask was: after we control for known predictors of bad health outcomes, namely alcohol and smoking, does salt consumption have an effect on 30-year life expectancy? All our data was aggregated at the level of nations, gender-specific, so we couldn't really do causal inference per se, but we wanted to look at this correlation, and in doing so, we were testing out a new permutation method that we were working on at the time. So this work had two threads: the first was the statistical methods part, and the other was actually looking at this data. Here's my workflow diagram. The first part was really just sitting down and thinking about what we were going to do before ever touching the data: how should we approach this problem and address it with the method we're working on? The second was actually getting our hands dirty with the code, namely writing the code for these statistical methods, which I put in an R package on GitHub. Doing the R package thing is really nice for myself, putting it into a clean format where I can test it and document it, and also putting it out there for others to use.
And finally came the data work: actually doing the analysis, visualizing results, writing it up, and then putting both the data and all the analyses on GitHub. And what ended up happening was, you run the analysis, like Garret said, and you realize you did something kind of fishy, or you need more data, so it ended up being an iterative process where we went back to the start and collected more data as we went. So the key tools that I used were RStudio; I'm a statistician, a lot of us work in R, and RStudio has some really nice integration for documentation and for making R packages, so I relied heavily on those. And also version control: I used Git and GitHub, but that wasn't perfect. When you're doing statistical analyses, or any kind of analysis, you want to keep track of the decisions you're making, not just the actual files themselves. And so I found myself trying to come up with ways to sort of version control my analysis and really clearly document the decisions I was making, and I found that knitr and RStudio were actually pretty useful for doing this. One thing that everyone I've talked to seems to have trouble with is getting our collaborators on board with using reproducible tools. It's partly just a cultural thing: if their field isn't using these tools heavily, then they're probably not going to either. And what I ended up doing a lot of the time was making changes to the documents, pushing them to GitHub, and then having to send my collaborators a PDF anyway. And finally, I mentioned we had to go back and collect more data. It was in PDFs of papers. We could have written scripts to extract it, but it wasn't that much data, so we just hand-entered it, and unfortunately that step is not reproducible. So it wasn't perfect, but I think statisticians in particular need to think about working in this way more often. Reproducibility problems often get blamed on statisticians, because it's always the statistical analysis that's supposedly wrong. So if we do our best to explain what's going on, then people can poke holes in it, but they can't say that we're sweeping things under the rug. I also think it's critical for my own efficiency; I'm just being selfish, but it helps me work a lot faster and keep track of what I'm doing. And finally, if we're developing new methods, we might as well put them out there for people to use; otherwise, what's the point, if the method is just sitting in a paper and nobody knows what to do with it? So I think it's critical. Yeah, that's it. Thanks.

So I just want to give a few personal reflections on the event that led to the formation of the book, and then a little bit of detail about my case study and where I come from. So I'm at the University of Washington. I'm currently an associate professor, but at the time of the workshop where we came together to write these chapters, I was an assistant professor. Between then and now, I got tenure, I got a major research fellowship, and I was able to do quite different things, so there's been quite a lot of personal and professional change for me between then and now, and I feel like I've got some good perspective from which to comment on what it was like. At that point, participating in the book was quite an exotic experience for me, because in my community we rarely do anything with any type of code, and we don't really talk about it at all. So I was writing code, which was my secret hobby at the time anyway.
And what I knew, I had picked up from the internet or from things that are published, so I hadn't really seen, basically, the fear in people's eyes when they talk about their own workflows and the challenges they face and so on. So it was really important for me to get a sense of how other people experience the challenges of working reproducibly, because before that point I had thought all of my experiences were unique. So it was literally transformative: I got a better perspective on how others work, practically and technically. And also, in the process of attending the workshop, I think most of us who were there wrote most of our chapters in the period of the actual workshop, which was another exciting sort of burst of productivity. It also felt to me personally like I was making a contract with myself. Like, I've done this experiment with this one project, technically made it as reproducible as possible, experimented with these new tools and ideas, and then I sort of walked away from it thinking this is something I'm never going to be able to do again. And I think many of us in the workshop felt that way, because it's a tough thing; we struggled to get there. So, a couple of reflections. Reproducibility went from being a curiosity for me to something I can now take a real position on, and I went from being a disgruntled reviewer complaining privately about how unreproducible things were to being very vocal. I think getting tenure was critical in that step. Now I'm very happy to speak out in my professional organization, the Society for American Archaeology; we've just organized an open science interest group, with a manifesto coming out shortly about scripting and version control and open data and so on. And I've also been quite happy to be quite vocal about reproducibility and data science in our academic community. This just came out today or yesterday with a couple of my colleagues in eScience at UW and another guy off Twitter who uses R and is very interested in reproducible research. So in some ways it was quite a life-changing moment. So here's my workflow diagram. Can you see it? Okay, I think, yeah. We excavated an archaeological site in Northern Australia that had been excavated a couple of times in the last three decades; this time we got more artifacts than all the previous excavations. And the analysis is very conventional. I'll just hide this out of the way, will I? Very conventional analysis. But I tried to adopt this sort of reproducible tooling: the data came in from my collaborators in spreadsheets and those sorts of things, and I made a custom R package and used these other tools in the R ecosystem so that I could have more reproducibility. You can imagine this a bit like an onion: there's a bit of R script, then there's the tooling of an R package around it, then I have a Docker image that is holding it all, so I've got some protection in terms of my computational environment. And then there are these other bits and pieces that are more or less familiar parts of doing a research project: shuffling around Word documents, reviewing with other stakeholders. I sort of stitched this thing together, and at the time it felt like something I could never do again.
And I'm pleased to report to you now, a couple of years later, that almost every substantial research project I've done since that one has been done in this way or more. There have been a few exceptions here and there, but basically this sense of the R package as a research compendium has been extremely efficient and productive for me, and a couple of the others in the group are now very active in promoting this way of organizing research around the unit of the R package. So that's been something that's been really transformative for me professionally as well. Which, where's, here we are. Okay, so the workflow diagram. A couple of really crucial parts. The bookdown package enables cross-referencing for figures and tables and citations, the usual sort of scholarly writing apparatus, now fully functional in R Markdown, and that just simplifies things: writing a self-contained document with code and narrative text and citations and cross-references is now very simple. The Center for Open Science has produced an excellent data repository and preprint service that I now use heavily and that makes collaboration very easy. That's another crucial component of my modern workflow since the time of our workshop and this initial first effort. The pain points: I think, as Kellie mentioned, getting other people to work in a way that's consistent with the way I'm working, and to get out of their spreadsheets and so on, that's a challenge. Dependency management in R packages is a major technical challenge. In the course of this one project that is the case study, the ggplot2 package, which is a wonderful graphics package, had a major version change that crippled all of my plots, and it was really frustrating. So I'm now looking at two kinds of solutions: Packrat, which stores the versions of the packages locally on my computer, and MRAN, which is a sort of snapshotted server of R packages, and I'm experimenting with both of those to try and solve this problem. This is still a major headache for me in trying to work reproducibly: keeping the packages as I like them. So the future for me, the challenges going forward: trying to get this more into my teaching, and persuading students, especially graduate students who are working with other faculty, my colleagues who are not working in the same way as me, that this is worth their effort. I'm having some success with that now, and as my work becomes more visible, especially the reproducible elements of it, it's easier to do that. And now I'm quite interested in building up the interest group in my professional society, the Society for American Archaeology, especially to have some influence on the way editors at the journals are working. So we're looking at influencing scholarly publications, especially through the badging system that the Center for Open Science has developed; we want to see more of that in our scholarly communication systems. So I feel like now I've settled on a very sustainable model for doing my research reproducibly, and now I just want to get everyone else doing it, more or less like that. So here's the sort of thing I'll probably get tattooed across my back shortly: the way I'm thinking about reproducible research and how I want to teach it and do it and encourage other people to do it. This is the kind of conventional model here, the research paper as mere advertisement of the scholarship, as Buckheit and Donoho famously described it.
And then on the other end is the kind of thing most of us are working towards: we have a document or a compendium that includes the code and the version control and the data and the narrative text, with some sort of persistent connections between them, like DOIs and so on. So this is what I'm seeing as a vision that I'd like my colleagues in archaeology to be working towards in the future. Thank you very much.

Hi, okay, so I'm Katie Huff. I'm a professor at the University of Illinois now, in nuclear engineering, but I used to be here at BIDS. So, books are great, right? They're great to read. And my experience with this particular book was that I read it before it was really ready, so that I could synthesize it so that other people don't necessarily have to read the whole thing. But I'll tell you that I think this experience brought me a great deal of empathy for all of the scientists around me. I know that the community around me is trying to be as reproducible as possible, and I knew that before reading any of these case studies. But there was a lot more frank emotion in some of these case studies than I was truly expecting, and I felt deeply connected to people in domains very dissimilar from my own. It was a really fun experience to read that and see the same lessons across a lot of domains. And that's the majority of what this chapter is about: what kind of pain everyone was sharing, what kind of brilliant strokes they had in the context of this journey. Basically, we're all climbing this same mountain of scientific reproducibility. Maybe you're climbing the north face and I'm climbing the south face, but it's a very similar grade, lots of similar terrain, and we learn, I think, very similar things. So let's talk about what we learned. I'll tell you about the incentives that people expressed, their pain points, some of the recommendations that the case study authors had, I'll give you a little data about it, and then some remaining needs. So the incentives: these are the reasons people wanted to do this. They wanted to be more efficient and to collaborate more successfully. They wanted to see their science being done in a way that could be extended by other scientists, and they really wanted verifiability. And I really liked some of these quotes at the bottom: someone said they wanted to focus on the science, and having a system around their code really helped them do that. I also felt that there was a benefit of forced planning, and Ben mentioned this just now, actually, in the context of making things fit into an established mold, and a safety for evolution. Maybe you're the one extending your work later, and it gives you some safety to know you can reproduce past work. And I thought that these really capture it: everyone who submitted a case study falls into one of these incentive categories. But what hurt? What was our shared pain? There were so many, so many pain points. And I would say, and both Ben and Kellie mentioned this, it was around people and skills that reproducibility broke down. It was typically some lowest-common-denominator collaboration problem: people didn't share all of the tools they needed for the tool stack to work properly, because there was someone who didn't know Git, or someone who didn't know LaTeX, or someone who preferred to work with Word track changes and email stuff back and forth. And that's the nature of people and collaborations.
A lot of people had problems with dependencies, build systems, and packaging. They had trouble accessing exactly the same hardware, or predicted that people trying to reproduce their work would have trouble accessing the same hardware; this ranged from HPC systems to MRI-type machines that I don't know much about. Lots of people had trouble really incorporating testing into their work; we'll talk about that a bit. Publishing as a collective seemed to be a real pain point, especially with this emailing-back-and-forth issue. And data versioning: no one knows how to version their data. Honestly, there are industry tools for it, but I don't think we've converged on a technical solution for data versioning. There are many technical solutions, but very few people would agree on a best practice for data versioning. It's coming, but scientists definitely don't have a strong way to figure out how to version their data. Now, version control of code is a different story, just to be clear; I'm talking about versioning the data. Also, people didn't feel they had enough time to be reproducible, and their data was often very restricted. Those are the pain points. And all of the authors had recommendations along the same sort of lines that I would have guessed going into this. It's stuff that's already been mentioned by all of the case study authors that came up here: use open tools, version control your code, use documentation, automate things. Don't hand-enter data, because that's going to be a step that you can't reproduce, so if you can avoid it, avoid it. People typically made these recommendations after failing to do one of these things. So the case studies are very humorous in this way: people, as scientists, are very good at self-reflection in a very honest fashion, and I think it's very nice to see. Now, there are other recommendations that I don't think were as universally shared, but were noted by enough people that I think they're worth mentioning here. Avoid excessive dependencies; this is the answer to the dependencies pain point. Or at least package the installation so there's a one-click option. That takes a lot of work. It takes software engineering, which is not a skill that we are typically taught as scientists; software engineering is pretty far from scientific domain learning, but packaging an installation requires some software engineering. Use GitHub. Get DOIs, digital object identifiers, for archival persistence of both your data and your code. Digital object identifiers are that weird number that you see on your journal article, and it's always going to point to that article. The same thing can be done for code, so get those. Plain-text data is preferred because it is a timeless format: you can always understand text data. If it's an Excel or Access type of format, then you're actually talking about folders upon folders of XML files on the back end of that kind of binary file. That's very hard to version, but text is timeless. Explicitly set pseudo-random seeds, rather than just expecting that you'll remember which seed it was: explicitly set them in your code as part of your reproducibility, so you get the same pseudo-random numbers generated every time. A couple of these, I should note, the very explicit ones, are from the Stark, Ottoboni, and Millman case study.
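That seed recommendation fits in a couple of lines; a minimal sketch (the seed value is hypothetical) of making all randomness flow from one explicit, recorded seed:

```python
# Explicitly set the pseudo-random seed in the code itself, so every rerun
# generates the same pseudo-random numbers. The seed value is arbitrary and
# hypothetical; what matters is that it is written down and versioned.
import numpy as np

SEED = 20170101
rng = np.random.default_rng(SEED)   # all randomness flows from this generator

sample = rng.normal(size=1000)      # identical on every run of the script
print(sample[:3])
```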
Workflow frameworks are often very useful, but they can be overkill. A lot of people who tried frameworks in which to plop their science, frameworks that were supposed to give them a good deal of reproducibility, found them clunky. That's not to say that workflow frameworks aren't worth pursuing; they could be a great home for the science you want to drop in. But a framework wrapped around your code, in the name of reproducing your science, was for a lot of people not quite right for their specific application, and so it felt clunky. And there were some people who had recommendations that most other people did not agree with. These are all individuals, so I'm just going to bring up two quotes that I thought were unique, representing a maybe more practical take on reproducibility that I don't necessarily agree with. I don't want to embarrass anyone, but one case study author really points out: actually, we may have had some errors in our scripts because we didn't really test them, and so anyone trying to reproduce our work should probably not rely on our scripts, right? And that's valid in many, many ways. Is it reproducibility? Not in the computational reproducibility sense, but yes in the science sense. So there are subtleties you run into in these case studies where you say, wait, that statement is about being the opposite of reproducible, if you think about it, but it's actually coming from a very natural place in science, which seeks to avoid error. The other outlier: scientific funding and the number of scientists available to do the work are finite, and therefore not every scientific result can or should be reproduced, right? So ask yourself, which ones do you reproduce? Probably the ones you're going to publish. Do you reproduce every prototype effort? No, and should you? Probably not, not if you ever want to get anything done. Now, if you have a system in your work that's always going to allow you to reproduce everything, that's so much better, because you can effortlessly capture every task that you do. But if you crack open a Jupyter notebook and try a thing or two, maybe it doesn't have to be the most reproducible thing in the universe. Like, yeah, you should put it on GitHub, and now it's almost reproducible, but should you put in hours and hours making sure it's got unit tests and stuff? Maybe not, right? Maybe not for every result. And I think you should read some of these alternative thoughts, which should get you thinking, in this book.
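For readers weighing that unit-testing tradeoff, here is a minimal example of the kind of test in question, in pytest style, with a hypothetical function under test:

```python
# Minimal pytest-style unit tests for a hypothetical data-cleaning helper.
# Run with `pytest`; each test either passes silently or reports a failure.
def clean_measurements(values):
    """Drop missing entries and coerce the rest to float."""
    return [float(v) for v in values if v not in ("", "NA", None)]

def test_drops_missing_entries():
    assert clean_measurements(["1.0", "NA", "2.5", ""]) == [1.0, 2.5]

def test_empty_input_is_fine():
    assert clean_measurements([]) == []
```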
Right, so there are a lot of popular tools; anything that was listed more than a couple of times, I called out. There's a lot of Jupyter use, a lot of GitHub, a lot of Python, a lot of LaTeX. This isn't surprising, right? These tools, along with R, are really the core of reproducibility: you're relying on open source libraries and open source tools and you're putting it all up on GitHub. Python was really popular, R was really popular, and there's some Java and C++. Daniel put up another, more beautiful graph, so I won't go into this, but what I'll say is that I did find a distinction between people who use R and people who use Python in terms of whether or not tests were present in their workflow. And I think it's something to note that Python has numerous testing libraries; you can write a unit test in Python in many different ways using many different tools, whereas R only has, to my knowledge, one testing library, so it's less common to see in that domain. I'm not saying R users don't test their code; I'm just saying these R users typically didn't test their code as much. Sure, there were fewer R users, but it's worth noting that if you compare tests versus no tests among them, you see this sort of difference: Python users were much more likely to write tests. Without going into everything, this chapter about lessons learned actually goes through all kinds of things that maybe we need to look towards in the future, things that could fix reproducibility as an experience for scientists, so that we can climb that mountain with a little less pain. A couple I'll just call out. We should maybe understand why people choose to do unit testing, so that we can increase the number of people who test their code, because it was a smaller number than I predicted among this group of scientists who were trying to be reproducible; so we should figure out why people test their code and make sure that incentive is clear to all scientists. We should broaden adoption of publication formats that allow parallel editing, so that more people will put their LaTeX up on GitHub rather than emailing Word documents back and forth, because that causes a lot of frustration. Data storage, versioning, and management tools should be better. And we should maybe adopt file format standards within certain domains; it was a common complaint that a domain didn't have a particular file format standard, and people didn't know how to store their data. That's all I had to say, and I should add that I didn't write this chapter on my own; I got a lot of feedback from other authors and especially from the editors. So there you go.

Yeah, so I had the fun job of both writing a chapter about my case study and also synthesizing all the themes from the book, with Ben Marwick, in a chapter where we tried to look at themes similar to what Katie brought up, but in a much broader sense; we didn't get into the nitty-gritty details. So I'm not going to go into all the details from the chapter itself, but I wanted to bring up three different themes that we were interested in addressing: the gaps that exist in the different types of workflows that we have, some of the challenges that we could address going into the future, and some opportunities for the people that are involved in this whole research life cycle.
cycle. One of the big things that came out is that in the past decade, the line between being a research software engineer and being a traditional domain scientist has become very, very blurry. Most of us are doing domain science, but we're also writing a lot of code, pulling together algorithms, cleaning up data, doing a lot of these different types of activities. And we have access to a lot of different types of data these days: new kinds of hardware and new kinds of software have given us access to a ton of data that we can use to work on a whole bunch of things, like climate change (until some executive order bans research on that), disease outbreaks, drug discovery, and so on. One of the biggest challenges, which Katie brought up, is that we're relying on a diversity of hardware and a diversity of software and gluing them together in very crappy ways. This could be somebody's workflow, and for you to take it, understand it, and reuse it is very, very hard. One particular challenge is that there are lots of different gaps: we don't quite understand how data flowing out of one piece of hardware or one piece of software can be readily used at the next step. On top of gaps like this in the software and hardware itself, there's also a big challenge with training. Katie brought up that some people might use version control and others might not, and there are challenges with having access to this kind of training, which programs like Software Carpentry and Data Carpentry are trying to address. Taking somebody's workflow and trying to reuse some part of it in your own research can be very challenging, and sometimes the outcome is not so great. A lot of these gaps are slowly being addressed by better tooling, so that anybody here can use Jupyter notebooks. Anybody heard of them?
A few people. So a lot of these different pieces of software that we're trying to use are now linked together in a much easier way, with literate programming frameworks like Jupyter notebooks, and RStudio and RMarkdown for the R people. The other point I wanted to make is that it's very hard to do very good reproducible research; it's not simply a matter of dropping your code and data somewhere and letting someone else pick them up. And the challenge that comes with that is the challenge of incentives. Publications are still the currency in this game, and anybody who spends their time adopting these good practices is not yet rewarded. There's a lot of work in this space, with projects like Impactstory tracking the metrics from all of this, but we also looked at the tangible benefits of sharing your data, sharing your code, things like that. In the chapter we talk a little bit about how sharing data with your paper increases your citations, increases the visibility of your research, and so on. So I encourage you to check out that section of the chapter, and also a paper that came out after we wrote the chapter on these different benefits that come from being an open scientist. In the last bit of our chapter, we go over some of the opportunities to make reproducible research more of a norm going into the future, and there are three particular areas in which we can improve or encourage these practices among researchers. One of them is funders: at least with Moore and Sloan, which fund the Data Science Institute and various projects like Jupyter, rOpenSci, and many others, there's potential to ask researchers to deposit their code, their Jupyter notebooks, and their data, and a lot of change can come about that way. A second opportunity is for folks like research librarians to address problems that span different silos; reproducibility challenges like data archiving and data sharing are things that even folks like research IT could help address. And the last opportunity is for journals and journal editors to help out. A couple of journals I'd like to point out as very deliberately contrived examples: one is a journal where you submit a replication of a paper that has already been published, and only when an editor can actually hit make and rebuild the paper will it be published; if you cannot replicate all the statistical results, the tables, and the figures, nothing gets published (a sketch of what one such check might look like follows below). The other is the Journal of Open Source Software, where you submit publications around software, but until someone can install it, run it, actually use it, and check for tests and documentation, it is not going to be published. More traditional journals can also ask authors to make their code and data publicly available, to encourage these different pieces to come together. So it's not all doom and gloom. We end the chapter on a very positive note: the ecosystem of tools is emerging and getting better integrated, so Jupyter works with other things and all these different tools work together, and the incentive mechanisms are also slowly falling into place. It's not perfect, but federal funders are now okay with reporting products besides publications, and there are better ways now to track all of these different kinds of outputs.
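To make that "hit make and rebuild the paper" idea concrete, here is a minimal sketch in Python of the kind of check such an editor's build might run: recompute one published number from the archived data and verify it matches the value reported in the paper. The file name, column name, and expected value are all hypothetical, invented for illustration; a real verification would regenerate every table and figure this way.

    import csv

    # Value reported in the (hypothetical) paper, and a tolerance to
    # allow for floating-point differences across machines.
    EXPECTED_MEAN = 3.14
    TOLERANCE = 1e-6

    def recompute_mean(path):
        """Recompute the published summary statistic from the archived raw data."""
        with open(path, newline="") as f:
            values = [float(row["measurement"]) for row in csv.DictReader(f)]
        return sum(values) / len(values)

    if __name__ == "__main__":
        mean = recompute_mean("archived_data.csv")
        assert abs(mean - EXPECTED_MEAN) < TOLERANCE, (
            f"recomputed mean {mean} does not match published value {EXPECTED_MEAN}"
        )
        print("published result reproduced")

If a check like this fails, the editor has something concrete to send back to the authors; if it passes for every result, the paper is computationally reproducible in exactly the sense defined earlier.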
So go read the paper, go grab a copy of the book, buy one for your family.

I'll hand it back to Justin. While they're getting set up, for anyone who did come in at the last minute, I want to mention again the URL where you can find the book: it is practicereproducibleresearch.org. So if you want to read the whole book, again, if you weren't here at the beginning, practicereproducibleresearch.org; that's the imperative form in the URL, as in you should practice reproducible research, not the descriptive form in the title. And while people are getting set up, we're going to bring out some chairs in a second, I'll also offer one more reflection to get started, one that comes back a little bit to some of the things the other folks have said. One of the threads that ran throughout the whole process of writing this book, among all of the authors of the case studies and the core chapters, was the debate about whether working reproducibly made you faster or slower. You can imagine that a lot of the implications of the take-home messages depend to some degree on whether you think working reproducibly ultimately makes you faster or slower at your work. I don't think there was actually any strong consensus on that; people felt both ways. They felt that in some cases it was ultimately making them more efficient, and in some cases that it was wasted effort. Which side that comes down on in the long run is, I think, personally, probably quite an important question that emerged from working on this book. Any questions for us? Any thoughts? Yeah, and there are microphones for people asking questions.

First of all, many thanks for this entire session. My question: what can we do at scientific societies specifically, and this came up, there was a mention of it in the talk, but specifically at conferences? If we get societies to a place where we push for helping scientists do reproducible research, we could run workshops. What are things we could do in an hour or two, in a workshop at an academic society, to give best practices to the researchers?

I can mention something the Organization for Human Brain Mapping is doing already, which is handing out an award for the reproduction of a study. So a paper that comes out with a reproduction of a study is eligible for an award of a few thousand dollars. The people initiating this include Chris Gorgolewski and Russ Poldrack from Stanford, who are actually authors of a couple of the chapters in this book.

A related thing that I'm involved in, in archaeology: many of you are familiar with Software Carpentry and Data Carpentry and the two-day workshops they run. We've condensed them down into a four-hour program that we're going to run at the annual meeting of the Society for American Archaeology. So that's an extracurricular activity about the idea of reproducibility. The related activity we're doing is a regularly scheduled academic event: there are conference paper presentations, and there's another kind of event called a forum, which is more like a discussion. We're using the forum event to do code demos of things that archaeologists have coded for publication, but the focus is going to be on running the code and showing how you can solve certain archaeological problems using our code. So that's another kind of training event, but in a professional
academic format, through this conference forum. Another thing we're doing is with an open science interest group, and that's an obvious thing that can be done at any professional society: organize a section, committee, or interest group that is specifically focused on reproducibility and open science. The one I'm leading for archaeology is directly inspired by this work, and I think anyone can do that in their own field. One thing we're going to do is take the Center for Open Science badges, which are designed for journal articles and journal editors to administer, and administer them within our society for the conferences. People who want to present a poster or a slide deck of a paper can have our interest group review it for one of the open data or open materials badges, and then display those badges on their poster or slide deck, to really communicate their commitment to reproducibility. I think these things are broadly applicable, transdisciplinary, not specific to archaeology; any discipline can implement them at a meeting or society level. And the level of interest I've received in doing this in archaeology has been really heartwarming, and I expect you would enjoy a similar reception in your research communities as well.

I would like to add something to this. We have been discussing this a lot in the reproducibility working group at BIDS, since before this book even came up: how we can educate people to do reproducible science. And you don't even have to go to conferences for that. We are at the university; we are at the core of this; we are here to teach these practices. One of the things we talked about, along similar lines to what Ben is describing, is to have a reproducible-practices workshop. This book is probably a good first start for handing people something, but it turned out in our discussions that this is a hard problem to condense into a workshop, even a one-day workshop of the Software Carpentry or Data Carpentry type. Yet I do think part of it can be done, and Ben just told us that it is possible, which is great. So I think we should just make this happen more and more in the universities, and maybe at the conferences as well.

I also want to make sure we get to others, so if you have one more thought, and then I'll see if there are other questions.

One other comment I want to make, about Data Carpentry and those folks: if anybody wants to get involved in this particular project, please do. Data Carpentry and my team are trying to build short lesson modules on things like Zenodo, how to deposit data, how to use figshare, how to set up a makefile for your research, things like that: very short topics that are well documented and maintained by a bunch of people, so that if you're attending a conference and you'd like to do a little workshop, you can mash together the different topics that you think might be of interest and go teach them. We're hoping to build out this nice menu of topics and make it available to all these different societies. We can talk about protocols, we can talk about other fun stuff.

Other questions? Yeah, we'll get you a microphone.

As some of you mentioned, publishing is, at least in academia, the name of the game. Have you started to see any motion, or shifting, or forward progress around concerns about reproducibility:
questions about it, rewards for it, feedback from reviewers? Has there been any motion in that direction, beyond journals that are specifically designed for that?

Right, so in the journals that are not specifically designed for this, I'll say that I do know of a few individuals, myself included, who choose to take it upon themselves to explicitly reject papers that don't have enough information to build the code, or don't have any citation of their data. I have this list; I think Lorena Barba is responsible for the Reproducibility PI Manifesto, and there are other lists of this kind, and I think you can make your own. I've made my own list of things that I really want to see in a paper to support reproducibility, and it's stuff like: is there a DOI for the data? It's a little grassroots, and some editors become reluctant to keep assigning you papers after some time of this, but I think that's where it needs to start. But Ariel, I think you had a thought.

Yeah, I was recently asked to remove links to code from a paper, so it goes in all directions. My feeling is that, as a peer reviewer, I'm seeing more papers that have code in them or that mention code, but I don't know if that's a real effect or just because I'm always banging on to the editors that we should have more papers with code, so when they get a paper with code it automatically goes to me, because they know I'm obsessed with it. So I'm not sure how objective my measure is, but my feeling is that it is becoming more pervasive; in my field especially, scripting is starting to increase in popularity over point-and-click analysis. It's interesting, personally: if I give a talk somewhere on my campus at UW, say for the Center for Statistics and the Social Sciences, I talk for them maybe once every two years over the eight or nine years I've been there, and my talks are mostly about open science and reproducibility. At the first couple of talks, not many people were there, not much interest. The last one was packed, and then I had four or five requests like, we couldn't make it to your talk, can you come tell our lab group about it? So I figure there's a kind of momentum behind these concerns now, broadly, in many areas of science.

I would add, at the risk of inventing something, so maybe someone up here can verify: I believe Nature Publishing Group has just added an explicit statement to their submission process that you are supposed to include all code as supplementary information. That's the latest one I'd seen. And I think Harold has another question.

How do you separate reproducibility as a way to learn tools, gain experience, and see how other people have done their research, from the opposing view that you want to reproduce something somebody did because you want to check the integrity of that person, you want to write a comment, you want to take that person and their results apart?

I don't know; I have an anecdote, and I don't know if it answers the question. For the paper that I talked about: PLOS ONE, where we published it, has the possibility of asking questions and leaving comments on the paper. We got a question that was a little bit vague, from a researcher I actually know, who asked a simple question about this. And we had the opportunity to say, okay, you're asking a question about the characteristics of the distribution of the noise and how that affects this; here's a set of notebooks that shows you what we did. You want to pick us apart?
Show us exactly what it is you want to pick apart here. Instead of having the discussion in vague terms, can we have it in concrete terms? Here's code, here's data; let's talk about exactly what you mean.

A piece of that question, though, I think is related to the fear of being exposed. This is something that comes up when we talk to a lot of people; I think we've all heard it when talking to other folks about reproducibility: scientists who are genuinely afraid of having made a mistake, who don't want to be exposed just in case they have, because being open and reproducible is a way for someone else to more rapidly find the mistake that you may or may not have made. Also, I'm in ecology and environmental science, and there are folks there who want to make themselves the smallest possible target to shoot at. They feel a somewhat political, maybe not literally political, but political incentive to present a small target area, and providing all of your data and code makes you a larger target. As he was saying, I think those are real concerns, natural concerns, and you can't do anything other than weigh them against the corresponding benefits. But I know I've heard that several times.

I want to make a comment that addresses Rachel's question, your question, and your comment together. One thing is that this is very grassroots. The two journals I mentioned are very deliberately contrived for the purpose of verifying things, but at least anecdotally, we're slowly trying to change this by getting people like Katie, or someone I know who knows how to use GitHub, in as reviewers. And I'm slowly starting to see that change even within my own community: people I went to grad school with, who hated code, are suddenly tweeting about a paper, and I see a link to a GitHub repo, and I'm just like, you know what GitHub means? This is amazing; we've reached a certain kind of crowd. So I think that is slowly permeating. But before I lose my train of thought, since I'm talking about three different things: anecdotally, someone I know published a paper in bioinformatics where one of the reviewer comments was, why didn't you try this other approach? Both reviewers were quite familiar with all the tools involved, because they're computational biologists, so this person told them: clone our git repo, check out this branch, we tried that. They liked that direction, and the paper was accepted within a couple of days, because the reviewers were quite happy that they were able to check everything out. So there's that potential, of finding a reviewer who is versed in the same things you know. And finally, to hit Justin's comment: someone recently pointed out to me that even though there's the fear of being exposed, openness is actually your best defense. In good faith, I have done everything, so here's all my code and everything else; if you publish this, it's now your failure if you didn't verify that I did the job correctly. I've shown everything; the reviewers could have seen everything and rejected it. So there's that defense, which some people have communicated to me.

I think we have time for one more question, so maybe we'll get someone in the middle there who hasn't asked one yet. And we'll be around for a few minutes afterwards if any of you have other questions you want to chat about.

Maybe this is kind of a
scoping question, but it does touch on some of the larger issues of what comes after reproducibility. As I understand it, this book doesn't cover any actual reproductions of work. So what are some of the issues there? Do you have, say, the neuroscientists and the economists swap ostensibly reproducible papers? What are some of the criteria you would use? And how do you anticipate dealing with first-order complaints about whether anyone will ever bother to reproduce these things, or do we just charge ahead in terms of benefits and pain points?

I don't know; I'd say the goal of the book was primarily to show what people are doing so that others can investigate, and what you're presenting is a great idea for another activity: what would you then do? I think, actually, the model is in the journals Karthik has been mentioning, JOSS for example, on which I'm a subject editor, as are a few of us sitting here. The activity there is very much that you find someone who understands the methodology and who attempts to build the code and check its capabilities, and there's a specific checklist of things you verify: whether or not it's going to run, does it make the plot they say it's going to make, does it have a class that does the thing it's supposed to do. I don't think it would be such a great idea to have an economist try that for a typical bioinformatics paper, because if you don't understand the methodology, you have to assess it at such a high level that you might get a positive, like, oh yes, this is reproducible; you would get a lot of false positives in terms of reproducibility. Sometimes you have to understand the method a little better to know what you're reproducing, to really investigate. That would be my answer: if you just swap domains, it might not be as robust.

I think the future is in discipline-specific reproducibility. We've seen journals in a couple of disciplines appointing reproducibility editors, like at the American Statistical Association, where someone who is expert in the domain and also expert with computational tools is tasked with assessing the reproducibility of a paper. I think the logical step forward, after this kind of broad survey of what people are doing and the problems they're facing, is for us to go into our own disciplines and fertilize the practice of reproducibility there, and to have reproducibility reviews within our own fields, where we have sufficient domain expertise, rather than struggling with all the overhead of the domain as well as the computational stuff. So that's what I see as the next few steps.

I would like to give maybe a practical example of this. I'm also sold on the idea of domain-specific reproducibility, and you can even go from the domain down to the specific lab: how do you make sure that things in your lab are reproducible, so that it doesn't rest only in the hands of your collaborators and yourself? Within the lab, you can already start letting other people who are not involved in your project, or in your student's project, try to use their own pipelines. Usually in a lab there should be one standard pipeline, but this is how research goes: within a lab there are already a few versions of the same code, or the same type of code. So have the other students replicate the results, and don't publish before that. I think that's a very simple, practical way to start at a small scale, so that the practice can spread.

And to broaden that out a little bit, and then take us to a conclusion for the formal
part, at least: I think what Fatma was talking about, and a lot of the questions, touch on this idea of community norms. In principle at least, academic science is somewhat self-governing, in that we decide collectively, as a community, what we think is appropriate, what we're willing to publish, what we're willing to review, and what we think is appropriate to do. A lot of the questions were about journals, publication, reviews, how you set incentives, and I think at bottom a lot of them come down to the question of whether we collectively decide that being able to independently verify each other's work matters. Many would say, of course it's important; it's been important since the 1600s, when the modern idea of science emerged. And there's something to be said for that, for the fact that it seems logical that we should all be able to independently check each other's work. I think a lot of the discussion centers on how we develop a culture, or a set of norms, that require some aspects of that, to the comfort of our fields, or something to that effect. So, I don't want to go too far past the time, but I did want to say one more thing, which is that we are actually continuing to collect additional case studies. We're not sure exactly where we're going to publish them, but if you were excited by anything you heard, please come up and find one of us afterwards, and we will give you instructions for how you can write up and submit a case study for your own research, if you're interested. So with that, thank you all very much for coming, and thanks of course to all the co-authors and co-editors for all their efforts. Again, we appreciate seeing you here. Thanks very much.