Hopefully everybody is now seeing my slides. As Stephanie mentioned, I'm going to report on RDA digital humanities outreach efforts. Here are the Twitter tags for the Research Data Alliance and myself if you want to tweet during the talk. I assume that most of the people on the call at least have an inkling of what RDA is, but I will just give a very brief overview focusing on the things that are most relevant for the outreach efforts I'm going to talk about. So the Research Data Alliance is a global initiative with over 3,000 members from over 100 countries. It's aiming to build the social and technical bridges to enhance open sharing of data, and the RDA vision is researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society. So that is the official marketing speak of what RDA is, and what I think is more important is what RDA is hoping to do and hoping to produce, right? And I think there are two main points here. One is social infrastructure, and to that end, RDA is providing a global forum for communication and collaboration, and a wide interdisciplinary community of experts working together to identify problems and build solutions. And I think the interdisciplinary point here is really important because, particularly for the humanities, it's an opportunity to reach outside of our discipline and get access to experts from other communities. There's also technical infrastructure, right? And so RDA is investigating, documenting, and leveraging existing infrastructure solutions, but also building new solutions to plug gaps. And I'm going to share a couple of slides from Fran Berman, the head of the RDA-US organization, because I think they really present a picture of what infrastructure is in areas other than data, right? And so if we look at what infrastructure is made up of, there are maybe six key areas, right? There are adopted policies that state what you can do and what we agree to adhere to.
There are solutions for systems interoperability. The classic case here is that we can all plug our laptops in, no matter what country we're in, because we have these handy little gadgets that allow our adapters to work with the plug sizes each country has. So it doesn't really matter that every country doesn't adhere to the same standard here, because we have the brokers and tools that allow those things to interoperate. And similarly, there are common types, standards, and metadata: there are common sizes for pieces of wood for buildings, and that sort of thing for wires and cables. All of these things are infrastructure in the physical world, and there are analogs in the data world as well. Then another key point is it has to be sustainable, right? There has to be some sort of economics that enables us to keep the solutions that we're building going. The community has to adopt the solutions. A traffic light only works if everybody agrees to go on green and stop on red, otherwise chaos ensues. And then we also need to train and educate the people, or the users, of the solutions. So although these are not specifically data solutions we're talking about here, I think there are definitely analogs in the data world, and this is a good overview of where RDA is focusing its efforts. But prioritizing this work on infrastructure can be challenging, right? Because there are barriers to data sharing and collaboration all over the place. It can be difficult, particularly when professional cultures aren't supportive, and in the humanities this is maybe a key point. Funding is often limited, and this is not specific to the humanities. It's limited everywhere, right? But the humanities, at least in the U.S., I know are often more challenged than some of the hard sciences. And with limited funding you often find yourself competing with people that you would really like to collaborate with. There is a lack of incentives, a lack of enabling environments, a lack of infrastructure.
All of those things can make work on infrastructure difficult. And I really love this quote that Fran has here from the Syracuse mayor that talks about infrastructure being overlooked because politicians don't really want to spend money on it, because you don't cut ribbons for water mains, right? So it's not exciting to talk about infrastructure when you can't have a big party for a water main, but that's what really matters. And that's the case for data infrastructure. It's hard to make the case that you are doing something innovative when it comes to infrastructure. But I really believe that working on collaboration is an innovative activity. So there is lots more to say about RDA. We could talk about the way the organization is structured into working groups and interest groups, and talk about the specific outcomes that we have so far, but that information is all available on the RDA website and is not the main point of the talk today. So I wanted to give you a brief overview of what we're doing with RDA and then really go into the motivations and objectives for doing some outreach from RDA to the digital humanities community and vice versa. So for me this is really a personal crusade, I guess you might call it. I got involved in RDA because I really felt that the digital humanities were underrepresented, and I wanted to make sure that I could make use of whatever solutions were being built. So one key motivating factor here is really to raise awareness within the currently mostly scientific RDA community of the relevance of humanities use cases. And then, going the other way, to raise awareness within the DH community of the activities of RDA and how they might be relevant to humanists' data needs, because after all we do have data, even if maybe not all humanists think about what they're doing as producing or using data. It is data.
And the outcome of this raising of awareness really ought to be to result in work, right, increasing the engagement by humanists in designing, developing, and adopting the interdisciplinary solutions for data sharing. So that's really at its heart why I'm involved in these outreach efforts and why the colleagues I've been working with are as well. So what are the incentives for people to participate in RDA, and in a global community such as RDA? I think that some of the most important incentives are the ability to reach wider audiences and to have your work sustained beyond the life of a single project. And this is particularly where I think the multidisciplinary aspect comes in, because if we work with solutions that support not only the humanities but the geosciences and astronomy, we have a much wider community that's working to sustain those solutions and a wider range of projects that depend on them, ensuring their life beyond any one project. And that results in progressively better use of the dollars or euros or whatever the money we're talking about is: less of it would go to redeveloping infrastructure and more to the actual scholarly research that's the point of all of the data production. So how do we realize the potential of RDA? This is one of the things that I think is unique: being a bottom-up organization, it's really up to the people participating to make what they will of it. Either we do the work together and we produce solutions or we don't. So it really depends upon engagement and if and how we leverage the community, each of us as individuals and the organizations that we work for. I think that within RDA the working group structure and its focus on producing adoptable outputs is really essential to driving to real solutions and not just talk.
So I think that by participating in a working group I can be assured that the work that I'm doing is important not only to me but to the other people who have committed to adopting it, and I think it results in much more real solutions. And it offers an opportunity to look outside the usual list of suspects for collaborators to build sustainable solutions, and this gets back to the point I was talking about before. So I think that we can collaborate with somebody in astrophysics or astronomy or other sciences, and it not only results in more sustainable solutions but in more scalable and more robust solutions as well. So that's the background of RDA and the motivation for the outreach activities, and now I want to talk a little bit about those activities in more detail. We really started this effort with a humanities panel at the plenary in San Diego in March. This was an outreach from the humanities to RDA. We followed that up with a workshop at Johns Hopkins University in Baltimore in May, and then most recently we had a meetup at the DH 2015 conference in Sydney in July. So for the humanities panel at Plenary 5, we asked the panelists to answer two sets of questions. First, what is different about digital humanities infrastructure, and more specifically, can those differences be described in terms of use cases? Because, getting back to producing adoptable outputs, we need to have real use cases behind the things we talk about so that we make sure that what we're doing is solving a real problem. The other set of questions: are there specific RDA products or recommendations that our digital humanities projects can adopt now, and if not, why? And particularly to look at the question of whether the requirements and data models are different, and whether that's contributing to a lack of adoption or a lack of ability to adopt the work that's happening in RDA.
The participants were Peter Wittenburg from the Max Planck Data Defeat Center, Nigel Ward from NeCTAR in Australia, Ted Hewitt from SSHRC in Canada, Ian Fortune from RPI, Sandra Collins from the Digital Repository of Ireland, and myself. You can find out more about the panel on the RDA website as well. On the key takeaways, I think all of the panelists represented this perspective in different ways. On the question of what is the same across digital humanities and the other disciplines, I think the key point of agreement was that core infrastructure really is the same in terms of what is needed from it: persistent identifier solutions, data type registries, metadata standards, identity, access and authentication solutions, et cetera. These core needs are not really different between the humanities and the other disciplines. But additionally, the challenges around core infrastructure are often the same as well: a lot of redundancy, similar solutions for the same problem, lack of long-term commitment to sustaining the solutions, lack of recognition of the work on supporting data sharing and interoperability, lack of incentives and support. So these things really look the same whether you're talking about the humanities or the hard sciences. But there are some differences, and the differences are really in the nuances of the data. Data formats may be very different, and in the humanities in particular you may have many more types of non-born-digital data that we have to accommodate. There are probably many more copyright and access complexities in the humanities: both data and metadata may be subject to copyright restrictions, or there may be social and cultural issues around access that need to be considered. Semantics are different, and at its heart the understanding of what data means may be different in the humanities and in the sciences.
We're not only talking about spreadsheets and rows and columns of numbers but texts and annotations and various other types of data. There are cultural and language complexities: data may be in a variety of different languages, with a variety of different cultural assumptions behind its collection and use. The research methods may be different. In the humanities they are very often non-linear and recursive, and not just about collecting data but about the work of doing curation and producing data as a result of curation and research, in this sort of recursive, collaborative, long-term way without a clear beginning and end point. And in the humanities there are often very traditional reward models where digital work and data production are not really at the forefront of people's minds. So some of the key takeaways from the panel were really that, from within RDA, the case for involvement of the humanities seems clear. There doesn't seem to be a question that it would be useful and valuable to have humanists participating in the discussion. So it did feel a bit like preaching to the choir, and it was clear that we needed to do outreach to engage the humanists themselves in RDA activities. And further, that we needed to find ways to enable and incentivize this work within the context of RDA, because in order for people to participate they would need to be funded and incentivized to do so. So these last two points really fed into the next outreach effort, which was an outreach in the other direction, from RDA to the humanists. And that was our Baltimore workshop that was hosted at Johns Hopkins. From the RDA perspective, the objective was to learn more about the disciplinary needs of digital humanists with regard to data infrastructure and data sharing. But the underlying objective was really to have this discussion about how to engage the humanists and how to enable that engagement.
So participants were invited to share use cases for infrastructure, and the RDA-US leadership presented the RDA structure and went into some detail on the outcomes of RDA to date. We had a roundtable discussion following this and then asked for a reaction from the funders. So just to give you an overview of who participated, the workshop itself was hosted by Johns Hopkins University and RDA-US. The RDA-US leadership was present. We had U.S. funders of the humanities there: the National Endowment for the Humanities, the Institute of Museum and Library Services, and the Mellon Foundation all had representatives. And we had participants from a number of projects. In addition to providing an overview of RDA activities, we also presented a sort of deep-dive look at the relevance of some of the RDA outcomes for the humanities, because I think this question of how to make the value proposition clear is still one that we're struggling with a little bit. So I took one outcome, the data type registries, which is one of the first outcomes from RDA, and provided a little example, mapping it to some of the use cases that the participants provided to explain where I saw the value being provided by this type of solution. And so some of the questions that I feel could be answered through the use of data type registries, as well as other pieces of infrastructure, are things such as questions about data that should be considered at the beginning of a project or as you work on a project, questions about visualizations and how to aggregate, and questions about copyright. And then also how to manage your data: do I need to create a new data type or can I use an existing one? Who else has data like this? These are all the types of questions that infrastructure solutions like data type registries might help answer. So, some of the key discussion points after the presentations.
Collaboration was a topic, and the fact is that collaboration is often difficult because humanities projects and funding are often invested in nationalistic or localized pursuits. So here we are talking specifically about global collaboration. And we spent some time identifying some grand challenges that might serve as unifying issues to spur collaboration. For example, interoperability of linked data was a key point. Linked data is a very actively pursued topic in the humanities right now, and there's definitely a lot of work to be done there around ensuring interoperability. And another grand challenge that could be considered would be ways to address the plurality of languages found in the data. Some of the other discussion points: we talked a bit about the RDA working group structures and how they should account for the needs of the DH community, where infrastructure may already exist but needs to be generalized for wider use. And I think this is not a humanities-specific topic. In any discipline, there are bits of infrastructure that exist and that could be generalized more fully to make a wider, more scalable solution. So working groups might need not only to design new solutions but to look at existing infrastructure and generalize it. The other point was really that RDA needs to make its value proposition to humanists more clear. Communication is really a challenge here. Talking about data is often not that easy to do, especially when dealing with traditional humanists. And we all agreed that training is really essential to engaging humanists in these pursuits. One of the most interesting parts of the discussion was really the feedback from the funders. The funders felt that RDA really does have the potential to prevent the reinvention-of-the-wheel syndrome, which is something that's obviously of great interest to them when it comes to spending money efficiently.
They felt that projects could perhaps look to take advantage of RDA's role as an interdisciplinary global authority providing guidance and a stamp of approval, for example by leveraging an RDA working group structure to ensure projects are producing and using sustainable solutions. We talked a bit about how, going forward, funding applications might present solutions to a working group or leverage a working group to put a stamp of approval on them, and how that might spur funding activity. Other feedback from the funders was that RDA really does need to help tackle data publishing and copyright challenges in order to be relevant, particularly in the humanities. And there was some thought that preservation needs could perhaps serve as a unifying factor for DH projects. And here in particular, libraries were called out as having an important role to play, both on preservation and in providing training. So those are really the takeaways from the Baltimore workshop. There were no concrete actions determined from the outreach activity, but participants were encouraged to sign up and become members of RDA and to explore participating in working and interest groups. And I think more importantly, it was really good to have this audience with the funders so they could begin thinking about how they could leverage the work of RDA to support funding of infrastructure. So the last outreach effort that I'll talk about here is the meetup we had at the DH 2015 conference in Sydney. Rick Mason helped me organize this, and we had an informal lunchtime gathering with about 10 to 15 attendees. It's always difficult to get a good crowd when you're competing with lunch. But it was interesting that most of the participants had not really heard of RDA before. They really had only a very vague understanding of what RDA was doing and how it might be of interest to them.
I think the people that came were mostly drawn by the topic of infrastructure and the opportunity to talk a little bit about what they thought the needs for infrastructure were. It was pretty clear that the researchers do not yet have a clear view of how their work benefits RDA and vice versa. So thinking about infrastructure in bits and pockets and not at the bigger picture is, I think, a challenge here. Everyone did, of course, agree on the importance of infrastructure and the need for humanist researchers to be engaged in defining the problems and solutions. We had some really interesting conversation on this point, particularly around topics of provenance and the complexities of provenance when it comes to humanities data, as well as complexities around access, authentication, and copyright solutions. In exploring those topics a little bit, it became clear that we really do have, from the humanities perspective, something to contribute here in helping ensure that the solutions in these areas are more robust, because by considering the humanists' needs, the solutions will have to take into account things that they might not have otherwise when looking only at some of the more concrete sciences. So as the end result of the meetup, participants were encouraged to sign up and join the working and interest groups, and we provided pointers to a few that were perhaps of interest. The other key takeaway is that the ADHO organization is going to begin to investigate ways to collaborate or participate with RDA. And I think that's a good thing and a good way to engage the humanist researchers a bit more. So what's next on the roadmap here? Going back to a point from early in the talk, I do really believe it's up to the community to engage at this point. Humanities researchers do need to begin to participate in the working and interest groups of RDA if they want to have their voice heard in the work that's being done there.
Of course for funding, this is maybe a bit of a chicken-and-egg scenario, right? We want the activity that we're doing here to be funded, but we really need to start doing it in order to prove its value. So I do think this remains a challenge, but I also think it's something that we can work with: by starting the work, the momentum should pick up, and we can begin to see whether or not it's something that's worth doing. So for those of you on the call who want more details on how to get involved, first sign up at the RDA website. It's free. You can start by exploring existing relevant RDA humanities activity. I've provided some links in the slides here. There's a more detailed report on the Baltimore workshop that you can access. Also from that same page you can access the use cases that were uploaded by the participants in the workshop, and they're really interesting to take a look at, I think. There is a Digital Practices in History and Ethnography interest group that is the closest thing at the moment to a humanities domain-specific group in RDA. And I think the approach they've been taking to this has been very interesting. They have been doing some project shares and reviews where they explore different solutions for digital humanities projects. And it's very interesting to me to see the approach the ethnographers take to exploring these projects. I will put a plug in here for the nascent PID Collections working group, of which I am a co-chair. This is a proposed working group, not accepted yet, but I do think it has a large degree of relevance for the humanities. It is about dealing with persistent identifiers for collections of objects, whether they're virtual collections or physical collections. There are a lot of different use cases that are of interest here. So I encourage you to take a look at our case statement and provide some feedback on the site about that if you're interested.
Another interest group that I think is worth participating in from the humanities perspective is the Data Fabric interest group, which is looking at the big picture of taking the outcomes of RDA and putting them together to provide a full-featured solution across the data fabric. They've been calling for use cases from a number of different domains, so that's an opportunity to present your use cases and to see what's happening across the other domains in this area. So I encourage you to add your use cases and to start or join an interest or working group. There's also a TAB election coming up, so I encourage you to nominate yourself or a colleague for the Technical Advisory Board if that's something that's of interest to you. I'm approaching the end of my term, and I really do hope to see additional humanities colleagues participating at that level. I think that's now the end of my slides, and we can stop for questions.