Open Scholarship: The Impact on Research, Teaching, and Learning. Remarks by Parminder Reina, Charlie Schweik, and Trevor Munoz at the 160th ARL Membership Meeting, convened by Brian Schottlander.
My name, as you all know, is Brian Schottlander, and I'm from UC San Diego, and it's my pleasure to be the convener of the last program session of a very full and thought-provoking day. I want to begin with an apology, and that is, I'm sorry to say, I am named neither Jim nor James nor Jimmy, but if you will promise not to call me late for cocktails, you can call me any of those things. This final program session deals with the impact of open scholarship on research, teaching, and learning. I don't need to tell you all that our research library community is fully, if not constantly, it seems, engaged in discussions of all things open. These days, if it's not E, it's open: open access, open source, open data, open research, open knowledge, open scholarship. Which is to say we're engaged with our respective faculties and students in their conduct of research and scholarship, and in supporting them in that conduct by providing them with the tools, the services, and the infrastructures they need to openly disseminate the results of their work. There are, and you will hear some of them this afternoon, a growing number of examples and case studies of successful ventures in this space, but there is more work to be done. We are very fortunate to have with us this afternoon three distinguished faculty members from three very different walks of life, who are leaders in promoting and developing open scholarship. Each of them is going to focus on a different aspect, either disciplinarily or format-wise, of the open discussion, and they will share with us their perspectives on how this discussion is having an impact on their research, their scholarship, and their teaching.
Now, you have in our program bios of the speakers, which I will not repeat, save to say that they highlight both the depth and the breadth of their experiences with and support for open scholarship. Let me introduce the three of them briefly before turning the podium over to them. To my immediate left, Dr. Parminder Reina is the Canada Research Chair in Geroscience in the Department of Clinical Epidemiology and Biostatistics at McMaster University. Dr. Reina will talk with us about using open sources and open data in both the educational setting and the clinical training setting. At the end of the table, because he's feeling a little under the weather and is being kind and sparing us from the exposure, is Charlie Schweik. Charlie is Associate Professor in the Department of Natural Resources Conservation and the Center for Public Policy and Administration at UMass Amherst. He is the founder and co-director of the UMass Amherst Open Source Laboratory and the associate director of the National Center for Digital Government. Charlie will talk about his experiences in using open educational resources in teaching and in conducting research into issues of public sector policy development. And then finally, in the middle, Trevor Munoz is the Assistant Dean for Digital Humanities Research in the University of Maryland Libraries and the Associate Director of MITH, the Maryland Institute for Technology in the Humanities. Trevor, whom I had the pleasure, as some of you may have as well, of hearing speak at the Berlin 9 conference last November in Bethesda, will share with us examples, including some late-breaking examples, of particularly impactful open digital humanities initiatives. Each of them will speak for 20 minutes-ish, and we'll have time for questions and discussion at the end of their presentations. And with that, please join me in welcoming to the podium Professor Parminder Reina.
Thank you very much, and I know I'm standing between you and your drinks at the reception, so I'll do my best to stay within the 20 minutes so we can get on to the next speaker. A few weeks ago Sue contacted me to see if I would be interested in talking at this conference. She explained it to me, and it made sense to me that it would be a good place to come and talk about some of the things we are doing. But when I was preparing my talk, I started to question myself: why me? Why am I going to be talking to these people? Because this is not my usual constituency. So I guess I have a few things to say that might or might not be relevant to the folks in the room. One is that we are doing research in a way that is changing how research has been done, in that it doesn't belong to the researcher who is developing the research program or the framework related to that particular research. I think that's probably one of the reasons I'm here, so I'll share with you what we are doing. But before I can do that, I'll give you the context within which I work, my program of research. I won't go into any technical scientific details here, but I'll give you a sense of why we are doing what we are doing in relation to open access data from the program of research that we have created. And I'll give an example of some of the approaches we have taken, because I'm talking about health data. It raises issues of privacy, confidentiality, and ethics, and that brings different challenges and opportunities when it comes to thinking about open data. So I will talk about opportunities and challenges as well. And within the challenges, I'm going to talk about institutional barriers, and I think that will be of some relevance to folks in the room as well. So, why me? I think I've talked about that. I think I might have something to share with you, and hopefully it is useful.
The area of research that I mainly work in is population aging. There is another area, which I wish I had more time for and which I think could be very relevant to this group: I direct two centers, and both centers look at evidence-based medicine. What we do is look at the published literature in medicine, try to assess the quality of the research, and try to make sense of the evidence, which is used to design clinical practice guidelines so that physicians can practice better medicine. But I'm not going to talk about that here today. One of the reasons open data, open access to data, is becoming an issue in our area is that, as we know, in most of the developed world the population is aging quite rapidly. And actually it is not just a phenomenon of the developed world. It's also happening in developing countries such as China, Brazil, and India, and they're actually aging much more rapidly than we have, and I'll give you a couple of examples. This change has also happened because of declining fertility rates, and many of these shifts in demographic patterns are quite permanent and have quite a profound impact on human life and society in general. Not only the healthcare system but society in general. And these pyramids here, they are pyramids, but what's happening is that as populations are aging, these pyramids don't look like pyramids anymore, and I'll show you an example of a pyramid from Canada. These are population totals in Canada by age group and year, going from 1995 and projecting into 2050. In 1995 we had some sort of a pyramid. If you look at 2050, when we get there, this is going to be an upside-down pyramid.
So we're going to have the two fastest-aging groups, which are our baby boomers and the 80-plus, and we're going to have more and more centenarians. But the challenge it raises is that if we want to understand how to improve the quality of life of older people living in their own homes and in their own communities, it is actually a collaborative effort. No single country is going to be able to understand this on its own, so you have to think about programs of research that span different parts of the world, and that means you have to have mechanisms to share the data. We have a study in Canada called the Canadian Longitudinal Study on Aging. It is just beginning, and it is a strategic initiative of one of our funding agencies, the main funding agency in health research, and we have many, many researchers across the country involved in it. So everybody has some stake in this initiative, and one of the stakes is that they will have access to data. We also have different cultures: we have biologists, we have geneticists, we have epidemiologists, we have social scientists, and they all have their own disciplinary needs and disciplinary challenges when it comes to data, and all of that has to be taken into account when you are thinking about these large initiatives. Secondly, there has been a shift, and this is happening more and more, not only in Canada or the US but in other parts of the world: these large, expensive, publicly funded projects are really not just research projects, they are platforms. They don't serve just the four or five people who design them. They are a resource for the whole community of researchers. And so our CLSA here, which is a fairly expensive enterprise, and I will give you some numbers later on, we are designing as a research platform, which is basically an infrastructure to enable state-of-the-art interdisciplinary research.
In the simplest terms, you can say that we are creating data registries here for people to use to answer multitudes of scientific questions. This slide is not for you to memorize so that I can test you on it, but it gives you a sense of the type of information these large enterprises are actually collecting. They are driven by some of the research questions, but they are also designed to be a little broader, so other people who haven't yet thought about the research questions can come down the line and be able to address research hypotheses. And in fact, if you look at this, the type of information we are collecting is quite personal and sensitive. So that adds another dimension of complexity: how do you share this data in an open way, and how do you keep the individual participants' privacy and confidentiality in mind? Another area which one has to keep in mind, and we can have a discussion if we have time, is that these are very different data sets. There is social data, there is clinical data, there is genetic data, and they all have their own challenges in relation to how you allow people to use those data in a larger community. In this one we are also collecting blood and urine samples, and I know someone is here from Hamilton, right in the room, from McMaster; I saw the name. Anyway, Hamilton is going to be one of the largest urine capitals of the world. This is not a tourism plug, by the way; Hamilton is a lovely place. But the idea is that we are not just taking these blood samples and storing them, we are actually doing different things with them. We will have serum, plasma, DNA, and live cells, which again are very sensitive materials when you are managing large studies like this. So this is a very quick overview of our study. We are looking at 50,000 people. We are going to study 50,000 people for the next 20 years. We are going to be following these people every three years for the next 20 years.
We will do 140,000 telephone interviews. We will do around 210,000 home interviews. We will have 210,000 visits for physical assessments and for the blood and urine samples that people will provide. By the end of it we will have 8.8 million biospecimens collected and stored. There will be 300,000 follow-up calls so we can re-contact people. We will have asked 129 million questions, with 219 million data points collected during the CLSA. And all together, with the different combinations that we will have, we are thinking that by the time we finish we will have around 340 million anticipated data points forming this research platform. So you can see it is enormous, and it can grow further if people decide to add other studies to it to answer specific questions. So how do you look after these data and also allow open access to them? And please give me a warning just before the last five minutes, because I can talk for two hours. So that's a warning to you. Another area, as I said, is that this whole aging phenomenon that we are studying is a global issue. I mentioned that it is not just an issue of the developed nations. It is becoming a huge issue for the developing countries, especially countries like China, India, and Brazil. For example, when France went from a life expectancy of 45 to a life expectancy of 70 years, it took them 113 years to get that increase in life expectancy, whereas China has gone from a life expectancy of 45 to a life expectancy of 66 in less than 26 years. And these countries don't have the infrastructure or social programs or healthcare delivery systems to manage that. So they are actually struggling to deal with this major, major demographic shift that is happening in these countries. So part of the plan that we have is that we are actually building a global observatory.
So there is the Canadian Longitudinal Study on Aging in Canada, there is the Health and Retirement Study in the US, there is the English Longitudinal Study of Ageing in the UK, there is a study in Sweden, and there are studies coming up in India, Korea, Japan, and other parts of the world. Some of the studies are already out in the field and some of them are starting now, so the idea is that we will be doing some sort of harmonization across these studies to pool data eventually, look at cross-jurisdictional differences, and try to understand how some of the policy things that are going on in these countries impact aging. Within that we have to think about data, whether we do a retrospective harmonization or a prospective harmonization. We need to have information about the studies themselves: access to catalogs, access to data dictionaries, and mappings of catalogs and data dictionaries across different studies, to say whether it makes sense to pool these data. A lot of open source software has to be developed for us to be able to share information across jurisdictions, and different jurisdictions have different laws and legal issues when it comes to very sensitive data. One of the challenges that we are already facing is how you access data, because different cohorts have different ethics and consent, which places restrictions on how the data gets shared, and different jurisdictions have different ethical and legal requirements. I won't spend a whole lot of time on data format, but that also becomes a challenge: how are these data archives stored, and are they compatible when you want to pool them? So these are some of the challenges we are thinking about, and it is basically what astronomy has done, because you can think of these cohorts as telescopes, and there are many astronomical telescopes all over the world.
They are looking at different parts of the sky to come up with the overall picture of the sky. All the cohorts around the world are basically telescopes, and we are trying to compile that picture. So I'll go back to my longitudinal study on aging, which is the Canadian Longitudinal Study on Aging, and affectionately we call it CLSA. I wish we had come up with a better name, but that's what it is. We are developing this as a research platform. I am the principal investigator. I've spent ten years of my life, along with a few of my colleagues, developing it. The norm in science is that principal investigators have the right, or first dibs, to the data. But we made a conscious decision, not just by myself but with our funders and our collaborators, that even as principal investigator I'm like any other researcher in the world. We created it. We have spent time on it. We will put this data in the public domain, with all the caveats and the protections that one has to have. But if I want to access data, I have to go through the same process as someone who was never involved in developing this protocol or this project or this platform. That raises an issue in itself: how do we get credit for ten years of our lives? I will come back to that in a minute. Another issue that comes up in creating these research platforms with open access to data is this: do you give out individual-level identifiable information to researchers, or do you create aggregate data in which you can't identify individuals, especially if somebody comes from a very small community and you could pick the postal code and identify where that person lives? So decisions about individual-level versus aggregate data have to be made. Now, most researchers will tell you that aggregate data is generally not that useful, and that we want to have individual data, which is true.
So we had to come up with a plan: if we are going to disseminate raw data to researchers who require individual-level data, how do we do that? In doing that, we have developed policies, and we had to develop infrastructure for people to send in their proposals. A proposal goes to a data access committee, which we have set up and which is independent of us. They evaluate these proposals and say, okay, we are going to give you these data. I think you are going to be talking about governance later on, but this comes down to the governance of our platform. They will agree or disagree to share the data based on whether the request fits the consent, because we always have to keep that in mind, and our consent is fairly generic, so most likely the majority of things will fit. Once they have agreed that these researchers can have access to data, the researchers are given data only to analyze the particular research question or set of research questions they have proposed. After they have finished, they have to return all of that data, not just the data that we gave them but any data they created after that, and that again goes into this public platform, so in future people don't have to redo those things. And we sign written memoranda of understanding and agreements with their institutions saying that they have to abide by these rules. So there's a little cost to them, in that they have to do some due diligence, but they get access to individual-level data.
Now, there are some data on which we did quite a bit of work before we launched. We did public consultations across the country, where we interviewed potential participants and asked: if we create a platform like this, how would you feel about us sharing these data? Most of the time the general public didn't have many concerns. The only place they had a concern surprised me. I thought they would be very concerned about genetic data, but that was not as big a concern. People were more concerned about data such as data on depression, because they said that if that data gets out, it has a stigma attached to it, and if insurance companies get hold of it, we might not have health insurance, or whatever that might be. So we made a conscious decision: in our data sharing policy we have said that the private sector will not have access to individual-level data, but they can work with researchers in public institutions, collaborate, and answer certain questions that might be of interest to them, and they can use the results but not the actual identifiable information.
So I guess I have a couple of minutes. Another push is coming from our funders as well. For example, NIH in the US is pushing researchers to put their data in the public domain, but after they have analyzed and published their work. In our country the push is actually even stronger. They're saying it doesn't really matter whether you have answered your questions yet or not; you have to start moving data out into the public domain as soon as possible. Now, funders are pushing it, but they haven't actually thought about the chaos it creates, because in a research environment we are talking about intellectual property and proprietary technologies that are developed, and you can't send out data and at the same time develop or protect your IP. So there are a lot of things going on that raise challenges. In our case we decided we are not going to have any intellectual property limitations. We are going to give the data to researchers; their institutions have IP policies, and we will just stick with that and keep it simple. As I mentioned, one of the reasons we wanted to have free access to data, open access to data, is to harmonize. For example, more and more research is coming out saying that this gene is associated with this disease, but large population-based studies or clinical studies don't have the opportunity to replicate those findings, because you need a different population and you need large numbers. Sometimes you need numbers in the range of 500,000 to 700,000 people. So creating these platforms that are open access and designed to be shared gives us the opportunity to harmonize data across jurisdictions and replicate our findings. I think another reason for open access data, especially from research, is that it increases the speed of discovery. It is not in the hands of four or five people who will take 20 years to analyze it five or six times and publish two papers, and that's it.
If you have a larger community, you're actually using a public resource to increase the speed of discovery, and at the end of the day the reason we are doing this is to improve the health and well-being of the population, so from that point of view open access data makes a lot of sense. Now, the challenges. I'm very fortunate that I'm at a university that is very progressive, and I was also very fortunate to have a very productive parallel program of research where I could publish. But if you're developing a program of research which takes 10 years, and then you hand it over to the general community of researchers, it actually cuts into your productivity, and you're not likely to get tenure and promotion. I don't think this should be a barrier, but one of the areas where some thought has to be given is that institutions have to rethink how they give credit to people, and this happens at the peer review level. This is true also for those of us who want to do knowledge translation, or dissemination of research findings to policy makers. We don't get credit for it, but it's probably some of the most important work you can do as a scientist. So these are some of the challenges we have that actually become barriers. I know many of my colleagues backed out when we were first designing this initiative, because they thought it was not worth their time since they would not be given the credit. It's their choice, but I think there is a time and an opportunity to make people think differently about how we use public resources and how scientific endeavors move forward. So those are the few other things I wanted to say. Perhaps some of the other things will come out in our discussion, and if you want any other information from me, this is my email address and our website and Facebook and Twitter, so we are a very modern study. Thank you.
My talk is now going to turn us to more of the education side
of our mission. I'm a faculty member at the University of Massachusetts, and I'm going to talk about several things that I was thinking about while preparing this. One, I want to open with a discussion of the web as an existence proof for the power of openness. That's one point I want to make. Then I'm going to talk about the use of open scholarship in teaching, and my practical experience at UMass Amherst in two classes that I've taught. Then I'm going to zoom out to the global scale and take a couple of minutes to talk about my research on open collaboration, in this case open source software collaboration, so I'm going to reflect a little bit on a study I just finished. And then I'm going to finish by describing my attempts to organize an open educational collaborative internationally, with some reflections on that. So I'm going to do that in 20 minutes, or 10 or 15 minutes, hopefully. This is probably an obvious question to many of you, perhaps all of you, but I like to start by thinking about the web from 1994 to 2000. Why did it grow so exponentially in that time period? Does anybody have ideas or thoughts on what drove that phenomenon? Well, the way I'd answer it, and Tim O'Reilly of O'Reilly Media raised this too, is that it was driven by the View Source options in Netscape and Internet Explorer. In those early days of the web, the way people developed websites was that they would go to a website and view the source, so open source. They'd learn from that source, they'd copy it, create a new derivative, and build their own website, and during those four or five years we saw this amazing global growth in websites. Now, Tim O'Reilly didn't say this, I don't think, but I believe this was the most successful distance learning program in human history. It really was, and I think that's an existence proof of why openness of source is so important. Okay, so let me
take this now to the teaching side and what I did in Amherst. I'm going to bring it down to more of a practical experience. I was lucky enough that we've got a provost and a library that are supportive of open access, and we have a program that provides faculty about a thousand dollars to try to develop open educational resources for their courses. I applied for two of these, to hire grad students to help me. In my first course I was on the production side, where I was trying to produce some open educational material and give it to the community, and then I did another where I was actually a consumer of open access material for my courses. So I had two different courses with this. In the first one I ended up self-publishing some educational material that I did around geographic information systems. I had about 200 pages of my own content. I went through and made sure everything was my own content and that there weren't any copyright issues. I licensed it under a Creative Commons Attribution-NonCommercial-ShareAlike license. I put a PDF version up on our institutional repository, and then I also put it on Lulu.com so that students could order a course pack from it that way. Since I did that, between our institutional repository and Lulu, I've had about a hundred downloads of that content. I pulled out a couple of student reactions here, and it's not surprising: the students were really thrilled with this, mainly because it was saving them money, but there were also comments that by having these electronic resources they weren't carrying around heavy textbooks and things like that. My other experience, which I just finished two days ago, was an open access course pack, if you will, for my class in environmental policy. There I ended up using ebooks from ebrary and EBSCO and other PDF files that we found on the internet to build the course content, and of course a lot of this was
driven by what I was trying to teach; that was the first thing. Then it was hard work to try to find material that fell under open access that I could use, so that took some effort. I don't know a lot about the licenses that libraries have, but from our perspective the ebrary material was very nice, because the students could download chapters as PDFs, and then if they had a tablet, an iPad, or a Kindle, they could use those devices to read them. So that was a very nice situation. With EBSCO, to my surprise, as a consumer I didn't really understand when I started the course that it was a single-user, seven-day borrow. There was no metadata that told me that when I was finding the resource. I should have asked some of our colleagues at the library, and I didn't, and then I got into the course and had this issue with it. The other thing is that it requires Adobe Digital Editions software, which didn't run on the iPad, for example, so this was a little more bumpy in terms of use. But again, the student reactions were really positive. "For the first time in my four-year college career I had a teacher who allowed open access to information" was what one of the students said. So I think this is a start, at least for my own teaching, and a good example. I think we're moving forward, and this is just the beginning of our dialogues around this initiative. As for lessons learned: next year I'm going to be trying to find acceptable multi-user downloads with PDF chapters. I'm going to try to take time in my classes to show students how to do this, since each piece of content sometimes has a different access approach. One positive was that by picking an older book I was
able to tell them that if they wanted a hard copy they could go get a used copy, and in my case it was a five-dollar used copy they were getting, so that was another savings for them. Anyway, the rest of it I think is straightforward. Okay, so that's a practical attempt at open access educational resources in the classroom. Now let me take a step back and talk a little bit about some research I've been doing for about six years, trying to understand open collaboration at the global scale, and then I'm going to talk about my efforts in parallel to get an international community of educators to collaborate on educational content. I'm going to start with the first one. This is a screenshot of the book that's coming out from this project in June 2012. This was an NSF-funded project. I actually have a background in environmental commons, working with a political economist named Elinor Ostrom, and I studied governance of environmental commons, those types of things. I'm also a former programmer, so about 12 years ago, around 2000, I started hearing about open source, and suddenly my two lives converged. I started realizing that open source was an internet-based commons, and that all of the theory and empirical research I'd studied on environmental commons, such as fisheries, forestry, and irrigation systems, might have a lot to say about how we collaborate on the internet. The big lightbulb moment for me was that the idea of copyleft licensing had much more importance than just software, and that those licenses could create new collaborations in any digital domain. Right after I had that idea I started hearing about Creative Commons, and I was all excited about that, thinking, oh, this is great. When I was first starting to research this, all of the literature I was reading was about volunteers, and a lot of the literature was about big projects like Apache or Linux. What I wanted to
try to do in this study was learn more about the whole population, so that we weren't skewed toward these big projects. I was trying to understand all of them. What I was also realizing as I was starting was that this wasn't just about volunteers anymore. Some of these open source software projects involve firms, non-profit institutions, and government agencies, so it's a much more complicated situation than just volunteers. Part of my goal, which I mentioned at the table, was to try to understand the governance structures of these projects, to get at that hybrid situation and what it means for governance. I'm not going to talk about that here, but there are whole chapters in the book about those issues. The broad research question I was asking was: what factors lead some open source software projects to ongoing collaborative success and others to become abandoned? It took my team and me a year and a half or more to come up with a measure of collaborative success and abandonment. We've got that, and it has actually been independently replicated by other researchers, so that makes me feel like we got that pretty well done. We turned to SourceForge.net, which many of you probably know and which is a huge repository of open source projects, and we got our hands on a hundred and seventy thousand of these projects. Through theory and literature review, and looking at the environmental commons literature and other things, we came up with about 40 different hypotheses about what factors might lead to successful collaborations and what might not. And we were thinking about projects longitudinally. Projects would start in what we called the initiation stage, before the first release of the software, and then after the first release of the software came what we called the growth stage. Our hypothesis
was that the factors would be different between those two stages, and in fact that's what we discovered in this study. So I'm paring down five years of work into two slides, but here are some of the key things we found, which will probably be surprising to many of you. Leadership by doing is key. The projects that are successful collaborations have people who are leading them and doing things, leading by doing. We also found that these projects tend to have a clear vision and articulated goals. Eric von Hippel has a book called Democratizing Innovation, where he talks about user-centric need, and we found across the board, in both successful and abandoned projects, that this is a large driver of why people participate. What I've also hypothesized, and I think we've confirmed in this, is that it's not only individual user need: organizations now have a need too, so they're putting resources up to support these projects. So von Hippel's work really describes a lot of what's going on. Now, this is one of the most exciting things we found in the study. We found that 58 percent of the successful growth-stage projects actually grew by a developer, and we've shown that statistically, and that the people who are picked up are on different continents. I won't get into the how; we did some survey work and other things. The vast majority of these projects have one, two, or three developers. It's not like Linux, where you've got 300 developers; they're relatively small teams. But what this is telling me is that SourceForge, that hub, and Google or possibly other search engines are acting as an intellectual matchmaker. What this is about is not big teams. It's about finding other people in the world who have a passion and interest, and building social capital, interacting over the internet to the point where you collaborate. And I think what we
found in this study is actually describing what we're seeing in Wikipedia. I think that's probably what's going on there, where people are passionate about a particular topic and they want to collaborate on it. I think the same could be true in open scholarship and education, and I'm about to show an example from my own practice where I think this has happened in my own career. So that brings me to my last material. As I said, as I was doing this research I also wanted to practice it, so I was learning as I was researching, and I am also a proponent of trying to move toward open access educational material. Because I teach geographic information systems, I got connected with and interested in a nonprofit foundation called OSGeo, the Open Source Geospatial Foundation. This is an international group that had already been established, and they had an education committee. About five or six years ago I got embedded in that, started to teach with their software products (they have a bunch of open source GIS software), and gradually became the education chair of this international group of educators. When I took that role, the first thing I thought was: okay, how do we get international educators to share teaching materials? The people on the listserv are from all over the world. There are probably over a hundred people on the list, and probably 15 to 20 who are really active, a lot of them in Europe, many in the U.S., but also in other places in the world. So I was puzzling over how to start getting this group of people to share their educational content. The first thing we did was establish an online educational repository, a metadata database with a simple web form where people could put in metadata about any educational material they had developed and add a hyperlink to it if they had it on a website somewhere. And I would periodically, as
the leader (this is an example of leading by doing), put my stuff up, or I would float emails out to the community and say, "Reminder: if you've done anything new, put it in this database." Every time I did that I would get four or five new entries. If I went stagnant for a month and didn't float those little queries out to the community, nothing would happen. So it's an example of what I found in my empirical work, where leadership by doing drives some of this. You've got to have somebody really pushing it. But this worked really well. Over the course of a couple of years we ended up with, I think, almost a hundred links to different educational materials that people had done; these could be courses or different modules. The next step I wanted to take was to move toward a new-derivative-work system. The idea here is a very simple one: if I wrote some educational content and made it open access, allowing new derivatives, a colleague in Japan who could speak English could translate that into Japanese. That would be one example. Or somebody could create a new version for a new release of the software that it's describing. This is harder, or at least I found it to be harder, and I still haven't solved this issue. We had a lot of discussion, periodically, again when I pushed for it, around what kind of document format we should use. Some people were saying LaTeX, for scientific writing; DocBook was suggested; I was pushing for OpenDocument Format, since one of the things I think is important is to make it as easy as possible for people and not put in a lot of what Eric Reyman calls friction. So that was what I was pushing. The other issue turned out to be version control. At the time I looked at MIT OpenCourseWare, and I looked at Rice Connexions and tried that out, and I should say that from that I became really interested in the modular idea. I was thinking modularity was
important, so that people could pick pieces buffet style: not entire courses, but chunks. So that was an interesting kind of issue. We had open source content versioning software called Subversion available to us, and that's what we tried to use. There's another one called GitHub, which we haven't tried, but that was another one I was thinking about. The issue I ran into, and still run into, is that I have a very difficult time getting people to take the time to contribute in this way. I think part of that is because I'm volunteering and don't have a lot of time for this, so I'm not pushing hard enough. But this is a really interesting problem: how do you get different people across the world to contribute open source content in a way that's easy for them to do? We haven't successfully achieved it at this point. But the last thing I'm going to say is that this experience really connects to my empirical research. Leadership by doing is critical. We have to have an agreed-upon vision; I think that's some of our problem with getting the new-derivative system working. User-centered need is a really critical part of all this: people contribute if they have a need of their own, the von Hippel idea. And this last point fits my point about SourceForge and Google being intellectual matchmakers. By being in this community, I've now connected with somebody at the University of Nottingham and somebody at the GeoTech Center in Texas, and we've built really strong social capital, and now we're trying to move forward as a team. So I think it's an existence proof of what I was saying about finding several people who are really passionate. So with that, the last thing I'll say is that I think some key issues here are publishing outlets. I'm getting again at the incentive structure for faculty contributions: how can you get something so
that they can put it on their CV or move up the tenure line? It's made me wonder about libraries as publishers supporting this. Could libraries act as a host for collaborative platforms that would encourage this type of thing? So I'll leave it at that. I'm obligated to put that up, so I hope you don't mind. I'll sit down. Thanks.
So first of all, I want to say thank you for inviting me to come here and speak. I'm very excited to be here, and I want to try to share a little bit of my enthusiasm here at the end of the afternoon, though I hope you've found this panel as stimulating and interesting as I have, for the ways in which open scholarship and digital scholarship in the humanities are a sort of twin engine, driving each other in very interesting ways. I think there's a lot of energy there and a lot of potential opportunity for libraries, and that's what I want to share with you over the next 20 minutes or so. The place that I'm beginning from, because it's the place where I sit every day, is thinking about digital humanities and libraries. As Brian said in the introduction, I have a joint position in the library and in the digital humanities center at Maryland. I think this is a really crucial bridge-building step, and it's an acknowledgement of the things that these two communities have to offer each other. So I want to talk a little bit about what the digital humanities have to do with open. Part of the problem with answering that question is agreeing on what you mean when you say "the digital humanities." It's one of these hot things in academia right now, one of our buzzwords, and a lot of different communities have come to it and found something of value in it for them. This is a definition that I particularly like, because I think it gets at a number of the issues that go under the tent
of digital humanities. This is from a colleague at the University of Richmond, Rob Nelson, who says that the digital humanities is a capacious enterprise that includes, among other things, research using computational and algorithmic methods to study culture and history, as well as efforts to use digital media to share humanities content beyond the academy and encourage active engagement with that content by a broad public. In this definition you can probably see the seed of something that the digital humanities and the open access movement, and efforts in open scholarship and open teaching resources, have to offer each other, and I think it's useful to prise this apart a little bit. So here are some examples of the ways that digital humanities is beginning to act outside the academy, and the ways in which the openness and open values of these communities are interacting with the scholars who are attracted to the digital humanities. You can see here that you have things like WordPress, which is a popular blogging platform, and on top of that people have written plugins or extensions for doing scholarly work: for doing footnotes on a blog, which, if any of you have ever tried it, is not the easiest thing in the world. But it's an important part of academic discourse. It's a reflection of the kinds of conversation, exchange, and critique that we do in the academy all the time, and it's finding expression in these open platforms through the collaboration of scholars and people outside the academy. You have things that are more venerable, perhaps, around official standards-making bodies like the W3C, which have produced in turn resources for the digital humanities such as the Text Encoding Initiative standard, which is a way of marking up text, and newer efforts like the Open Annotation Collaboration, which is a way of designing data formats, exchange mechanisms, and tools for passing annotations around the web,
things that scholars want to say to each other, things that members of the general public want to say to scholars. How do we communicate that through the systems of the web, which I happen to agree is one of the greatest technological examples of what openness can do for a community? I think you'll also find that the digital humanities community is actively looking outside the traditional tools, outside the traditional channels of, perhaps, peer-reviewed journals or the monograph. There are a couple of interesting reasons why the digital humanities would look to Twitter, or to things like Stack Exchange, which is a series of sites that are like internet message boards where you can ask questions and get answers, but where people can also rate your helpfulness in the community and the relevance of your questions. I think a lot of this has to do with the people who are digital humanists. The digital humanities is hot right now, but it wasn't always so. It used to be a very underprivileged, half-ignored, aren't-they-crazy kind of corner of the humanities. You want to study Milton how? You want to use the computer to do what? And will you ever get promoted for that? Probably not. So the people who tended to find their way into this world were people who were, on the one hand, really captivated by the potential of using the computer as a telescope for the mind, which is a phrase that Willard McCarty, a scholar in digital humanities, loves to use, but who also found in that space people who occupy different roles in the academic hierarchy: librarians, programmers, educational technologists. They banded together when the digital humanities wasn't cool, and that's actually the core of this community, and I think part of the explanation for why they look for and really engage with these outside platforms like WordPress or Twitter or GitHub. And I think what you're seeing here, you mentioned the importance of "view source" as something that drove the
development of the web, and part of what I think is so interesting about digital humanities, when talking to audiences who are not humanists, who have other roles in the academy, is that there's a part of digital humanities where the encounter with technology is not about changing the hermeneutic in the humanities, which is important to those of us who care about the humanities. There's a part of the digital humanities that's about doing "view source" on the system of scholarly communication, or, as a colleague in a meeting I was in a couple of days ago called it, hacking scholarly communication. That's an important element of understanding digital humanities, and, I think, of understanding what digital humanities and the library community have to say to each other and how they can advance their mutual goals. So, as I've said, the encounter with digital technologies that we're going to call digital humanities for this afternoon is changing a lot of things inside the academy. It's changing what someone has called knowledge design: what is the work of a scholar? This is a place where openness is showing us that having access to the collections that libraries make openly available on the web, and to the collections that our cultural heritage organizations make available on the web, means that scholars begin to imagine new ways to communicate their ideas, and they start to question the received forms, which have turned into publishing channels and large organizations and bureaucracies. When you get down to that level and you start to reimagine how you communicate your scholarship, you'll eventually unsettle the systems and organizations that depend on that, and really interesting, creative new things can happen. On the previous slide, down at the bottom, I pointed to a project at USC called Scalar. This is a project that's trying to encourage faculty to think about, or at least not to fear, multimodal authoring: to link things together
that are perhaps bits of text, YouTube videos found on the web, images from cultural heritage collections, any sort of thing that you might want to put together in a scholarly argument. It's a way of instantiating something that a history colleague once told me was his ultimate dream for presenting his work: what he really wanted to do was not to write a monograph but to take the reader into the archive with him, point at some interesting things, and leave a trail in the archive. I think that's what's so fascinating about a tool like Scalar: the specialists, the people who are engaged in digital humanities, are building new tools and new interfaces that capture things that are like a trail in the archive, things linked together by little paths. Obviously that presents a lot of challenges, and it unsettles a lot of things. So, for the late-breaking-news portion of my presentation: I was just at a meeting in New York hosted by the Scholarly Communication Institute, which was looking at three really important projects in the digital humanities that are particularly relevant, I think, to this meeting. The meeting was specifically about new modes of scholarly authoring, publication, and credentialing. I've put up most of the three projects' logos here. One was PressForward, which is an initiative of George Mason University to, as they like to say, collect the best of the scholarship wherever they find it. There are the editors of PressForward, four people, and every day they sift through RSS feeds, conversations on Twitter, things that colleagues mention to them in the hallway, everything out there on the open web, looking for good material, good content. Then there's a series of steps whereby they bring it into something that looks more like the kind of peer-reviewed journal we might be familiar with in the academy. There are things that are editor's choice, things that are topical now,
ways of aggregating and pointing attention to something. This segues into interesting conversations that many in this community have been having about altmetrics, other ways of demonstrating value. What I think is so fascinating is that one thing these editors have found is that they can drive a lot of views to something, but the traditional web metrics are not very good for what these scholars want to do. Twitter clicks, FeedBurner links, Facebook likes: none of those are really getting at the systems that we want to have in the academy to support our dialogue, and there is a role for an editorial mind to put things in conversation with each other. I want to come back to this idea of the tension between going out there and blending into the practices of the wider web, joining the standards of a community like WordPress or GitHub, and then bringing that back to the values of the academy. Most importantly, I think that, starting from this insurgent position and being very closely focused on how to use these new technologies to do a "view source" on our scholarly communication system, you have people actively questioning who is the peer in peer review, and thinking about how we distinguish the different types of peers that are out there. The citizen scientist or the citizen humanist is one of particular interest. This raises important issues around credentialing an audience: it's not just about scholars' engagement with this external community getting reflected back on their CVs and in their tenure and promotion processes, but about how this sharing of scholarly knowledge can validate and bring to light perspectives that are out there on the open web, and can tie things that the public is doing to things that scholars are doing. So being out in the open, being out on the open web, begins to broaden the dialogue about who can
be credentialed and what that means, and how we exchange knowledge back and forth. I think that's a really valuable effort that this ferment in the digital humanities can bring to academic libraries and to the academy more generally. I would sum this up by saying that the encounter with digital technologies is creating leverage for more open scholarship, and at the same time the openness of open scholarship is in turn pushing digital scholarship to be more imaginative, to build new tools, new processes, new organizations. That mutual dependence is really valuable, and there's a place for libraries to intervene here, to support this work and keep this engine running. So I want to talk a little bit about next steps. There was a really interesting conversation at our meeting in New York, where some of the colleagues across the table, from the disciplines, were saying: we don't know that our faculty really want this new digital humanities thing. They might feel like they're doing okay, and the reason might not be that they're Luddites. It might be, but there might be another reason, which is that there's a certain pace to the kind of work they've been trained to do. It takes 10 years to write a monograph, and we need to think about how this collaborative open scholarship that happens out there on the web, around things like blogs or Twitter, might align or be realigned with the time scale of things like monographs. I think what that discussion was pointing at was really interesting: the two worldviews had not met, but not around adoption or non-adoption of technology. They hadn't met around issues that were more fundamental, about closed and open, about collaborative and single-authored. Part of what's pushing the digital humanities to be more open is that the values of that community are so strongly centered on collaboration.
When you try to do a digital project, you very quickly realize you can't do it by yourself, unless you know a lot about medieval manuscripts but also web publishing and programming and copyright law and everything else. So there's a built-in drive to collaboration in that community, and what that drive to collaboration is actually pushing is more openness, because in order to collaborate you have to open up your process. You have to open up your thoughts earlier on, rather than working in isolation on a single product. So you can see the tension, or I hope you can see the tension, there around issues of closed and open and how they intersect with the issues of the digital and the non-digital. For libraries to really get in on this act, it's another area where I think being more responsive, trying to be more agile, matters. I know it takes a lot to set up a position like mine, to find the funding for a faculty line, but again, there have to be people from the library in these communities to react to these new developments at the scale at which they're moving, rather than sometime down the line when some product is finished that we can see more clearly. I think things like PressForward or the Scalar project are really interesting examples of this. Those are projects that really need people from the library and information science community to be involved in them at these early, formative stages, and to invest in that sense. I think this puts us in a really interesting place. By reaching out to the digital humanities community, by getting some quick efforts on the ground and beginning to work in this space and find out what's going on here, there's an opportunity for libraries to really move openness forward in partnership with these scholars. But of course resources don't come from thin air, and so what I would suggest is that this is through the support of digital humanities,
through collaboration on digital projects, where the values of openness, teamwork, and collaboration are driving the way that scholarship gets done in that field. But there's another way to conceptualize how libraries support their core mission of making the most information available to the greatest number of people, connecting them with information so that they can create new knowledge in their disciplines or their communities. So I guess I'm pushing the idea of R&D efforts as an additional valid use for collections budgets or other budgets, in the same way that buying materials has come to be the standard way in which libraries support the mission of making information available. This is what people are talking about when they say that open source isn't free, or that it's free like puppies. The opportunity here, I think, is very much an R&D opportunity, and putting libraries into that mix with projects coming out of the digital humanities is a way in which the value of the library in those disciplinary communities, in those public communities, can be shifted and yet still fulfill the same core mission that we fulfill through our collections budgets, our outreach, and our other efforts. I want to point to a few more challenges ahead. Obviously that raises the issue of how to make space for experimentation. I wouldn't claim to you that PressForward or Scalar or the MLA Commons or DH Commons or DH Questions and Answers is the answer to hacking scholarly communication. So participating in those is participating in an R&D project where not everything will work out, and given the trusted stature that libraries have, I think it can be difficult to make space for projects that fail. I just want to point at the importance of that, and yet of being able to distinguish experimentation that fails, and fails profitably, from churn, throwing things at the wall and seeing
what sticks. That's a management challenge that I know we're certainly grappling with, that probably many of you are grappling with, and that will continue to be an issue as we think about openness and the digital humanities. The other one I want to point to, just quickly, is this idea that everyone has a personal digital archive now. So it's not only trails through our institutional digital collections, but thinking about the personal digital collections of these scholars and members of the public: how those become publications, or how they relate to publications, how we collect them, how we make them discoverable. Those are big challenges that are closely related to these questions of new modes of authoring and new modes of publishing. People's photos on Flickr, their tweets: we can't just say the Library of Congress has got that one covered. Opening up these personal digital archives is a place where libraries have a role that intersects clearly with these new publication and authoring ventures, with the new hermeneutics of the humanities that are being expressed through the engagement with digital technologies. This is a quote from the last couple of days. I hope Cliff doesn't mind too much; I think it's really good. He said one of the great things about PLoS ONE, the new open access publication in the sciences, was that they've essentially capitulated on evaluating importance prior to publication. They check for technical correctness, methodology and soundness, but they don't try to filter at that prepublication level for what will have lasting value. I think that's an interesting model for libraries to think about more broadly in their collecting and publishing efforts, and it relates directly to that issue of personal digital archives. Someone's personal photos, or their research notes, or their bibliographies in something like
Zotero: what is the enduring value of that? I don't know that we can answer that question up front, but engaging with these materials, which have a greater potential to be open and which we should work on making as open as possible, gives us a way to build rich collections that will matter to future researchers. But we can't do it by continuing to act like gatekeepers, and I think this is another important challenge to keep in mind as we move forward. I want to close with this idea of how we do division of labor without dividing our forces, and to think about potential partnerships in this area. The three projects that I mentioned are all digital humanities research projects. They look sort of like infrastructure, but they're research projects led by humanities faculty and their technical partners. Some of them have reached out to scholarly societies or cultural heritage organizations; not many to research libraries, interestingly. Make of that what you will. And to return to a point I was making earlier, what is so interesting is that this engagement with openness, open technologies, the world of the open web, doesn't mean losing the identity of the academy, what scholars have always done. Openness is not a threat to those values, and yet it can be generative of better thinking and new innovation in those areas. What the PressForward authors found is that they were sifting massive amounts of scholarship, blogs and tweets and published papers and institutional repositories, down into a publication, and in the end they were acting very much like traditional journal editors, like traditional publishers. They were building a list, saying these are the things that are important and worthy of aggregating our attention. So thinking about these partnerships is, again, about understanding the values and participating in the technology. The pace of rapid sharing and rapid innovation that happens in these communities on the open
web, and is increasingly happening in digital humanities, does not mean subsuming the identity of things like publishers. In a partnership between a library and a publisher, it's obviously important for the library to stay a library and the publisher to stay a publisher. That's a tricky question to think about: building that filter, but building it after you already have all this open material out there on the web. That's especially important, I think, when you engage not with the easy targets ("easy" here being very relative) of scholarly societies or publishers, people we already have conversations with. How do you have a conversation with Wikipedia, or with Twitter? So this tension of blending openness with the traditions and values of the humanities is something that the digital humanities is beginning to grapple with, and that libraries' participation in the digital humanities could further. It's an important and challenging area that I think will serve both the library and digital humanities communities as we move forward. I think I'll just leave it there. Thank you.
Thank you for listening. Music was provided by Josh Woodward. For more talks from this meeting, please visit www.arl.org.