Our presentation this morning is based on a brief that Danielle and I co-authored a few months ago. It's informed by Danielle's involvement in several Ithaka S+R research studies about the changing behavior of scholars, which we will cite later. And it's also inspired by my experience at Cornell University Library in designing and implementing digital scholarship and research data management services, and also, again in the university environment, by witnessing the challenges scholars face in their daily practices, especially in finding and engaging with support services. Building and creating collections is such a central stewardship role for cultural heritage organizations such as libraries. So what we want to do today is to kind of flip the coin and position collecting as a central activity in scholars' daily workflows. I'll start with a discussion of collecting as a part of scholars' daily workflows, and then I'll turn it over to Danielle to discuss how institutions can effectively provide research data management support services. We aim to allow ample time at the end, as we look forward to hearing your feedback and your ideas.
What do we mean when we emphasize that scholars are collectors? Let me start with a story, which is inspired by a recent Center for Research Libraries research brief. You might know that during the last 10 years, internet use and content have boomed in the Latin America region across the disciplinary spectrum. So if you were an interdisciplinary group of scholars studying political movements and their impact on environmental change, you would really be interacting mainly with the open web. Even commercial publishing has started moving to the open web in this region.
So you will be finding scholarly articles, research papers, and climate data; you'll be finding social media interactions and interviews, maybe about political movements. And knowing the highly fluid nature of online content, and knowing that you might not find the same page even the next day, what you'll be doing as a scholar is using various tools, whether they are open access, open source, or commercial, to extract, to scrape, to download these pages. And actually we now often hear these tools characterized as research workflow support tools, or as productivity tools. As I said, they could be open source or they could be commercial. Actually, the irony is that many libraries and archives are very active in harvesting content from the web, maybe on matters that are of interest to this group of scholars. However, collecting their own evidence through building their own collections allows scholars more control and greater accuracy in terms of knowing, for instance, when a page was captured.
This story really captures the findings of Ithaka S+R research on scholarly information practices. You might know that Ithaka S+R has an ongoing series of qualitative projects that explore how research support needs vary by discipline. So far, we have covered 13 different disciplines. In addition, there is a triennial faculty survey to examine evolving attitudes and behavior. And this, in a nutshell, captures the empirical findings very well: there's strong evidence that the vast majority of scholars organize resources on their own computers. Interestingly, we will be releasing the 2018 data in a few days, so stay tuned. It will be nice to see how the behaviors are evolving.
So based on, again, as I said, our experience and the empirical data that we have been gathering for almost 10 years, let me go over some of the characteristics of scholars as collectors, just to illustrate further what we mean. First of all, scholars create personal collections over the course of their careers. These compilations vary widely depending on the discipline, and they might take several different forms. They may include digitized archival materials, recordings of interviews, numeric data sets from experiments, or visual materials. Another important thing to note is the terminology. Although we very often refer to it as research data, when you talk with scholars, you will find that they use different terms. They will call it information, they will call it their research materials, they will call it evidence. Data definitely still continues to be closer to the conceptual frameworks of scientists than to those of humanists and social scientists.
During the last 10 years, we have witnessed the development of research data support services in many libraries. I think one of the motivating factors was the open science movement, especially open access and the replicability of research studies, and also probably funder requirements, whether that means providing research data management plans with project proposals or meeting public access requirements after scholars start implementing a project. So as Kamensky articulates in this quote, there's quite a bit of attention on big data, shareable data, open data, replicability of science, and so on and so forth. However, a vast majority of scholarly work generates what people refer to as small data.
Now this long tail curve is very familiar to you. As you know, there's this kind of intense, steep beginning where data have reuse value and lend themselves to standardization, whether it's metadata standards or interoperability. But then there's a very long tail where scholars collect information themselves or in small groups. What we want to highlight is that if you look at this long tail, not everything is shareable or should be discoverable or archived for posterity. There is definitely a transient aspect to data that is created during various stages of daily workflows. It's not a linear process. It's actually non-linear and quite iterative. And again, it's not only research data that is a product of implementing a research project, but also of writing a grant, maybe doing a public science presentation, or even a fundraising initiative.
As scholars are creating their own collections, they face a number of challenges. In our brief, we categorize them into four groups: organizing, storing, sharing, and preserving. Just to keep our discussion more succinct today, I will give three examples rather than go through all the information we presented in the brief. My first example is a very common one that you hear often. As scholars collect their own content, they need guidance for selecting storage and backup configurations. I think it's very nuanced, nuanced in the sense that their storage needs differ. Sometimes they need storage that is just a backup, almost like a dark archive. Sometimes they need to store data in such a way that they can use software applications to analyze it, so the storage needs to lend itself to that sort of use.
The second example I will give is related to the heterogeneous nature of the data that they are collecting, which means that they will have JPEG files, TIFFs, audio-visual files, PDFs, maybe 3D images. One of the challenges that they face in regard to this heterogeneity is the research workflow support tools or productivity tools that they will be using. Sometimes the tools don't necessarily work well with each other. Some of them are commercial and lock in the data. Some of them are open source, and sometimes the project dies down and there's no support for the open source tool that they are used to using.
And a third quick example is that, of course, scholars have allegiance to their home institutions. But on the other hand, empirical data illustrates over and over that scholars also have a commitment to their own scholarly networks, their communities of practice, and during their careers they actually move. They rarely stay in one institution. So as they are creating and storing these collections, one of the challenges they have is how to move them from one organization to the next. And of course, if they are engaged in a collaboration that includes multiple organizations, it poses a very similar question. So they express a need for further support in terms of storing, organizing, managing, and discovery. And now I'm going to turn it over to my colleague, Danielle, who will share with you some of our findings and recommendations.
So up until now, we've been talking about what the scholars are up to and really putting them at the center of the discussion around collecting. To put that back into dialogue with the kinds of support services that are out there, I start with this visualization. In essence, Oya and I have identified four stakeholders that are doing some work to support scholars as collectors.
Perhaps an exaggeration, but I hope that the visualization illustrates that we have different groups helping out with tiny little parts of the scholar's process, but of course, no one of them is particularly comprehensive. Perhaps if I was a better graphic designer and wasn't relying on SmartArt, I would have figured out a way to also recognize that sometimes some of these smaller circles intersect a bit.
So just to walk you through these four main stakeholders and what they're currently doing to support scholars as collectors, I start with the funders. We have the growing trend among funders to require those who are funded by them to make some form of their research outputs openly available. Sometimes that's the published content, and that could include data. But of course, adherence to these requirements is very uneven, unenforced, and quite difficult to enforce. Another common critique of this approach to moving the needle on openness is the reality that there is much less support for the infrastructures that would be needed to make the content available and preservable.
Then we have the advocacy groups, particularly open data advocacy groups. Similar to the funders, the focus is typically on what I would say is a very narrow component of what scholars are working on, with the emphasis being on data. As Oya already highlighted, the reality is that there are so many other forms of content that scholars are working with, creating, and collecting over their life cycle. So this is a challenge with this advocacy angle. Of course, there's also a very strong emphasis on the value of making things open to society at large or to research communities, as opposed to scholars' individualistic needs and practices on the ground. Now that's not to say that this emphasis isn't important. It's incredibly important. But I highlight this here in particular because there are trade-offs.
One approach, and the approach, I would argue, of most advocacy groups, is very top-down, thinking about the very broad landscape, when there is also the reality of the experiences of the scholars on the ground. As they are the ones that need to be our partners in this work, it is important to be thinking about their day-to-day experiences and what we're actually asking them to do. And then finally, I want to acknowledge that this kind of advocacy work has been taken up in some geographic areas much more than others. It has much more traction in Western Europe and the UK, and somebody can explain to me afterwards if the UK is Western Europe.
Then we have the tool and service providers. On one end, there are a lot of vendors doing this work, and then we have the more not-for-profit angle, but in essence, the goal of these stakeholders is to serve scholars through tools and services. I would say that the emphasis here is on a broader array of content. They're probably the group that is doing the best job at this. The reality is that a lot of these tools are framed around optimizing scholarly activities or, even more neoliberally or nefariously, around assessing scholarly activities and their value to the institution. And of course, many of these groups aim to monetize research content and/or the usage analytics that come out of scholars engaging with them.
And finally, we have the academic institutions, or perhaps it would be more accurate to say some portion of some academic institutions. Here we find an orientation very strongly in the direction of research data management. So, going back to the earlier thread, that is typically focusing on one particular form of information that scholars are working with. One could argue that they're largely reactionary. This isn't necessarily the locus of innovation in this space.
And as I think the folks at Penn State did an excellent job talking through yesterday at their CNI talk, there is the issue of centralized coordination. If anybody missed that talk, I would highly recommend connecting with the Penn State folks, because they did such a beautiful job of explaining the hard work they've been doing to centralize the various groups on their campus who are interested in research data management. It is really hard work. It often is under-recognized. It's not the kind of thing that you can boast about in the same way you can boast about other shiny new things on your campus. So my hat's off to them, really. And then ultimately, even when we do come up with solutions that are very in-house or campus-oriented, they're just too resource-intensive to build out, even when they're successful. They're typically a victim of their own success, because it's really hard to scale any of these kinds of services when they're created in-house.
So Oya and I are particularly interested in the challenges that are faced by academic institutions when it comes to supporting scholars as collectors. I highlight here just a few of the risks of not thinking about these issues seriously, and particularly of not being holistic about the approach as opposed to scattershot. First and foremost is the reality that there's the potential for a decline in research productivity. And I don't necessarily mean just for the individual scholar, although you could say that as well. I'm also thinking about the institution as a whole and the fact that research productivity is a metric that we are often evaluated by. Then there's the reality of there simply being a lot of work duplication and service gaps, and the fact that there are risks to IT, particularly through issues pertaining to privacy.
And then finally is the reality that there's a real risk of ceding a lot of control to external vendors. There's the potential for a loss of ownership and control if institutions don't take a centralized approach. We are spending so much of our energy on talking through journal subscriptions, UC and Elsevier, but this is the future of vendors taking control away from institutions. So I would say that this is a big reason why we should care about this.
So Oya and I have thought of some ways forward. They're recommendations, and part of the reason why we really like to come to places like this to talk about this work is that we want to hear what people actually think is viable as a way forward, what makes the most sense. So these are the recommendations that we pulled out from our report. First and foremost is designing and promoting services that really think about scholars not only as collectors, but particularly as curators. This is the idea that, particularly in the humanities and the humanistic social sciences, people are collecting primary content in a lot of really novel and important ways. Any kind of effort to harness that activity would be really exciting. We strongly recommend reframing research data management services to be more inclusive. The reality is that they alienate a whole component of the academic population and are not accurate in framing how people are, for the most part, doing their work. You could argue that across the entire academic community there are a lot of digital information fluencies that need to be fostered. Number four is something that we're particularly interested in. It's about what kind of opportunities there are to actually collaborate with tool providers, both to ensure that institutions don't cede too much control in this area and to make sure that tools are built in a way that is actually useful.
We definitely recommend developing more university-wide approaches and policies, and once again, I'd love to hear more narratives from groups like the one at Penn State about what they've been doing. And then finally, of course, there is still a physical media component to what people are collecting, and so any sort of work around thinking about scholars as collectors really needs to think about the diverse information content types, including the analog.
So we, of course, welcome any questions you may have for us. We have two questions in particular that we've highlighted, which we'd love people's reactions to. The first is around who on campus should really be taking ownership of a more holistic approach to supporting scholars' workflows, and whether our paradigm resonates. Do you think that thinking about scholars as collectors could be a good rallying call across the institution? Who should be taking up this work? Is this really the role of libraries, or should it be someone else? Oya and I both come from a very strong library background, so we're always really interested to hear if this is the kind of thing that libraries in particular want to take up the mantle on, or if it just doesn't make sense. And then our second question has to do with how people can actually work productively with vendors and not-for-profit groups around workflow support tools. What would actually make sense here? How far away are we from this? It makes me think a lot of Kathleen Fitzpatrick's talk yesterday. She has such a rallying call around collaboration, not just between institutions or within institutions, but also with these other external groups that we work with. What is possible here? So now we open it up, and we apologize that this is not the most conducive space for conversation, which the video recording will never know. Thank you.