So, what I'm going to present here is a pilot project that we've been doing at Rensselaer Polytechnic Institute with Elsevier. It's on a product Elsevier had in pilot called Data Lighthouse, which will eventually become a production product called Data Monitor. Rensselaer Polytechnic is one of four institutions that have partnered on the development of this product. But I want to start by giving you a little of the genesis of why we went down this path. What I have here is a list of research objectives, and it's probably no different from the list at your own institution if you're a research university, public or private: grow research; establish a number of partnerships, whether private, public, or corporate; if you're a large research institution with global connections, pursue international partnerships as well; recruit the highest-quality faculty and research faculty; and give your administrators metrics and assessment tools for research output and research activities. The trend for many research universities and their libraries has been to build a research and institutional repository. As I've quoted Cliff Lynch here from his article back in 2003, the notion was to build some kind of centralized storage system that would allow for the deposit of not only research data but also published research. If you're involved with open access or preprints, those kinds of activities, this might be the model you've pursued at your institution. But Rensselaer is not a large research institution. I've seen some of the other sessions this afternoon from places like Penn State, with campuses all over the state.
We're not in that kind of situation; we're a school of about 7,500 students. But we do have a lot of research going on, including a lot of federally funded research from DOE, DOD, NSF, and NIH. We have ROTC programs, we do a lot of aeronautical engineering, and a lot of federal funding comes from the military. Our largest corporate partner is IBM; we do a lot of research with IBM Watson. So we already have a vast amount of research activity, even though we're a small private school. At a small school, with a library staff of about 20, can I really start building a research and institutional repository? Is that feasible for us? The considerations we weighed were people, process, and technology. On the people side of the house, I don't have a large staff, and we don't have a research data librarian. The ecosystem for scholarly communication and scholarly research has become increasingly federated and highly decentralized. As was indicated in the plenary this afternoon, we already know there are a number of discipline-specific repositories out there. So if we started to build something, would we be duplicating efforts, or creating a workflow that requires a researcher to duplicate efforts? We also know that a number of the large STEM publishers are starting to build systems that will deposit research data associated with a publication on behalf of the authors. So the proposition I put forth is that maybe there's a different model to look at, particularly if you're a small research library. The first two goals are no different whether I'm a large research library or a small one: we want to be able to archive research data in some way.
And as I've seen in other presentations, we want, as librarians or informationists, to be able to put metadata and ontologies around the research data sets so that people can find them. The goal, of course, is to foster cross-disciplinary research, because in an increasingly competitive funding environment, the proposals that win awards are the ones that are unique. If we can create linkages between disciplines that are seemingly disparate, you actually have a better chance of receiving funding. Now, at Rensselaer, the libraries are part of the institution's centralized IT, so I report to the chief information officer rather than to, say, a vice chancellor or vice provost for academic affairs. So I see the library having an inventory, knowing where the intellectual capital and intellectual property of the institution is, as contributing to cybersecurity strategy, tech transfer, and intellectual property. But the difference here is that neither I nor central IT is offering to become the repository for all of this. In the pilot project, as a development partner with Elsevier, what we wanted was to still be able to track where the research data is being stored and how it is available: Is it openly available? Is it private? Does somebody else own it, because it's a multi-institutional collaboration? What Elsevier proposed was to work with us to check the publication output of our faculty and researchers, using their tools, such as SciVal, and checking in places like Mendeley Data and Scopus. The idea is that if, with those tools, we can't find data already associated with a publication, we would then go out and solicit information from the actual authors of the published research. We also verified some of this against DataCite.
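As a sketch of the DataCite verification step described above: the DataCite REST API lets you search for dataset records whose related identifiers point back at a publication DOI. The helper name and example DOI here are illustrative, and the exact query fields should be checked against DataCite's current API documentation.

```python
import urllib.parse

DATACITE_API = "https://api.datacite.org/dois"

def related_datasets_url(publication_doi: str) -> str:
    """Build a DataCite REST API query URL that searches for dataset
    DOIs whose relatedIdentifiers reference the given publication DOI.

    The field path and filter below follow DataCite's documented query
    syntax, but this is an unverified sketch, not production code.
    """
    query = f'relatedIdentifiers.relatedIdentifier:"{publication_doi}"'
    params = urllib.parse.urlencode({
        "query": query,
        "resource-type-id": "dataset",  # restrict results to datasets
    })
    return f"{DATACITE_API}?{params}"
```

A library could run this kind of lookup for each captured publication before emailing the author, so the survey only goes to researchers whose data genuinely can't be found.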
And all of this is collected in a dashboard that I, as director of libraries, can look at and share with my administrators: here's an article, and here's the research data set that's been deposited in association with that published research. What I have here is an example of an email; Elsevier calls it an email campaign. The tool generates a series of emails, and each one basically says: congratulations, Professor So-and-So, here's your article (you can see the article in the italicized text up on top), and as a library, we encourage you to deposit your data. We're not specifying where; we're just saying that we encourage you to deposit your research data. You can see the two big buttons, yes or no. Depending on which button you click, you get sent down another chain of logic that asks another series of questions. If you say yes, I have data, we then start to ask questions like: What kind of data is it? Where is it stored? Are you the owner of the data? Can you give us a description of the data? You can see some of the questions here: some fill-in-the-blank boxes, some multiple-choice buttons. If you respond, that's the end of it. If you don't respond, we send you an email reminder asking you to complete the survey and provide more information. That's what the diagram shows: thanks for responding, or, if you didn't respond, we poke at you again and ask you to take the survey. What you see here are the results from the test campaign we ran at Rensselaer Polytechnic. In the first block, a researcher said yes, I actually have data for each of these publications, and you can see it linked in the far right-hand corner of this screenshot.
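The yes/no branching just described can be sketched as a small decision flow. The question wording below is paraphrased from the talk; the function name and structure are mine, not the product's actual schema.

```python
def followup_questions(has_data: bool) -> list[str]:
    """Return the follow-up questions for one branch of the survey.

    Mirrors the flow described above: a 'yes' answer leads to detail
    questions about the data set; a 'no' answer ends the chain (only a
    reminder email may follow if the survey goes unanswered).
    """
    if not has_data:
        return []  # 'no' branch: nothing more to ask
    return [
        "What kind of data is it?",
        "Where is it stored?",
        "Are you the owner of the data?",
        "Can you give us a description of the data?",
    ]
```

Keeping the branch logic in one place like this makes it easy to reword questions later, which, as noted below, turned out to be one of the pilot's next steps.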
And then we also have a no at the very bottom, where somebody said, I've got an article, but I don't have any data associated with it. And the very last thing: if you get too many emails and you're tired of responding, you have the option to opt out. Now, the reason I make note of "too many emails" is that, the way we did this analysis, if you were a co-author rather than the primary author and you were a co-author on multiple publications, you received multiple emails. That's part of what we learned going through the pilot. What we did was ask Elsevier to run an analysis of publications from our faculty. We had 406 faculty included in this survey tool, and we analyzed the publication output of Rensselaer researchers and faculty, capturing publications published between January 25, 2017 and May 10, 2018. Those were the publications that were captured, and those were the triggers for sending out the emails. We ran this campaign for 15 days, at the end of the semester, which was also something we probably shouldn't have done; but the way the timing worked out, that was the best window in which we could get it all approved by the provost, our VPR, campus counsel, et cetera. You can see the number of individuals who opened the email that was sent to them. We also sent a number of reminders, and on the far left-hand side you can see one faculty member who received over 70 emails, because that individual was an author or co-author on that many publications. So, needless to say, lesson learned: don't send that many emails to one person. The goal was to run this test. Of course, this was testing the viability of this particular tool that Elsevier is working on, but it also gave us feedback as an institution about the current practices of our research faculty at Rensselaer.
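The fix for the 70-email problem is straightforward batching: group the matched publications by recipient and send one digest per person instead of one message per publication. A minimal sketch, with hypothetical data shapes:

```python
from collections import defaultdict

def build_digests(matches):
    """Group (author_email, publication_title) pairs into one digest
    list per author, so a prolific co-author receives a single email
    covering all of their captured publications.

    `matches` is an iterable of (email, title) tuples; the shape is an
    assumption for illustration, not the pilot tool's real data model.
    """
    digests = defaultdict(list)
    for email, title in matches:
        digests[email].append(title)
    return dict(digests)
```

For example, an author matched on two publications would get one digest listing both titles rather than two separate campaign emails.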
We wanted to understand a bit about how this prototype works: whether the email messaging floods somebody's inbox with too many emails, whether they'll still respond if they get four or five, et cetera; and then to validate what other recommendations we could come up with that actually make sense and would be acceptable to a researcher. You can see that even though we sent this out at the end of the semester, right up against study week and finals, we still got a pretty good response rate of 19%. Of course, a number of faculty didn't receive emails at all, because during that particular snapshot of publication output they didn't have anything. If they had something published as an abstract but not as a journal article, we didn't capture them, and they weren't going to get pinged with an email. But you can also see the responses from those who did respond. A number of people said "I don't have data" or "I have data," and the latter includes what Elsevier found through its initial analysis in addition to what researchers reported themselves. And you can see there were four individuals who said, I've had enough of these emails, I'm not even going to bother responding; or, one email is too many. From this analysis, what we discovered is that even though we don't have an institutional repository, there are researchers who are actually depositing research data sets in other existing repositories. The large black slice of the pie at the bottom is Google Drive. We might want to say you probably shouldn't be depositing your data in Google Drive, but this is a lesson learned: we now have actual feedback about what some of the practices are. So you can see here a wide variety of practices. And when we finished running this campaign, we also ran a series of interviews with some of the responders, which was pretty interesting.
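The campaign summary above (counts per answer category plus an overall response rate) reduces to a couple of lines of arithmetic. The category labels and numbers below are illustrative; the talk reports 406 faculty in the pool and a 19% response rate, but not the exact per-category counts.

```python
from collections import Counter

def summarize(responses, contacted):
    """Summarize a campaign: counts per answer code and the overall
    response rate.

    `responses` is a list of answer codes such as 'has_data',
    'no_data', or 'opt_out' (labels assumed for illustration);
    `contacted` is how many researchers actually received an email.
    """
    counts = Counter(responses)
    rate = len(responses) / contacted if contacted else 0.0
    return counts, rate
```

Note the denominator: faculty with no captured journal articles never received an email, so they belong outside `contacted` rather than being counted as non-responders.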
So there were individuals who said, I don't have research data. But with some additional verification using tools like DataCite and Mendeley, we could see that there was actually more than what the researcher said: they said they don't have data, but they actually do have data deposited someplace else. That raises another question: how much do our faculty actually know about where their data is ending up? Here, again, is a summary of the results of running this test campaign. We have a large materials science program, so you can see a number of people already depositing in a crystallographic data center. And even though some people said they don't have research data, they actually do. So we interviewed a number of faculty to find out more details. The survey gives you some initial data, but when you actually sit down with a researcher and find out their research habits (what kind of experiments they run; whether they rely on postdocs, undergraduates, or graduate students; whether it's a collaborative, multi-institutional type of research), you discover quite a bit of interesting information. The main point I want to bring out here is that most of them don't see the value of having a data management plan, because they see no consequences, no penalties, no repercussions for not having one. They may write one into their federally funded grant proposal, but at the end of the day, since nobody's checking on them, what's the consequence? But when you start discussing what they actually do with their research data, things change. One faculty member said: I had a postdoc who graduated and moved on about four years ago, and now I want to go back and find the data from that experiment, but I can't find the data and I can't find the postdoc. So maybe you do need a DMP.
Another faculty member indicated that they were less interested in the DMP than in their notebooks. The question came up: I have all these handwritten lab notebooks and research notebooks; is there a way I can digitize all that material and make it searchable? Another faculty member described their research process and what they consider part of research data management. They said it starts from the point where they send a text message to the postdoc saying, run the experiment on the spectrometer this particular way. So this individual considered research data to be an incredibly broad spectrum of information: the processes themselves, the raw data set, and the graphs generated using SAS, SPSS, Tableau, or R. The other thing, of course, is that what you describe as research data depends entirely on your area of research, and may also depend on instrumentation. There's a wide variety; if you're in the humanities, what you call a research data set is obviously different from what you call one in engineering. I was just at the Research Data Alliance meeting last week, and there was a session on engineering where they were trying to come up with standards for engineering data. Of course, they discovered that once you go through the subdivisions, it's all different: civil engineering data is different from mechanical engineering data, which is different from electrical engineering data, et cetera. So our next steps are to revise some of the questions we asked in that email campaign and survey. We're going to reword some things, because we understand that some people didn't respond or didn't understand how a question was laid out. We also discovered that we need to educate faculty and researchers about what research data management is all about and why it would be a good thing.
We want to encourage people to actively deposit their research data. Again, the model here is that we're not building the repository. We can make recommendations depending on your discipline (you might want to deposit in arXiv, you might want to deposit in Mendeley Data, et cetera) and then give people some sense of how that depositing can be done. Now, I should note, I started by saying we were one of four institutions working on this project with Elsevier. We're the only one saying we're not building an institutional repository; the other three either have one or are in the process of building one. So the way they're going to use this tool is different from the way Rensselaer intends to use it. With that, I thank you.