Welcome to this session on the shared research data management initiative that we're embarking on at JISC. I'm Rachel Bruce, Deputy Chief Innovation Officer at JISC. I'll start by telling you what JISC is, because we're an organisation in the UK and some of you may not be aware of the remit that we have. Then I'll outline some of the context in terms of research data policy within the UK, and some of the drivers that are encouraging our universities to look at research data management solutions from an institutional perspective rather than a disciplinary one. And then I shall outline where we are with our initiative to develop a shared research data management service. Some of you might say at the end of it, "Is that a shared service, Rachel?" Let's say this is exploratory, so we'll see where we get to.
Okay, so what is JISC? We are an organisation that serves all of the higher education institutions within the UK and also the further education and skills providers. We're a not-for-profit. It says we provide digital services and solutions, but take "solutions" in quite a broad sense: that might include exploratory work and change initiatives, working with the sector on new information or ICT challenges and opportunities. There are three things that we do. We deliver shared digital infrastructure and services: for example, we deliver the network, the UK equivalent of Internet2, and we also provide a lot of what you might call digital library services, quite similar to what the California Digital Library delivers.
Then we also undertake negotiation, and I suppose sometimes some consensus building and influencing, around products such as software and cloud provision, and of course access to academic content: we undertake the negotiations for journals within the UK for all of the universities and also the research councils. And then there's advice and guidance, which can include all sorts of things, such as advice around legal issues for information or technology, or advice around the adoption of e-portfolios, and it goes on. Basically, anything to do with information and digital technology and its influence and adoption in education and research.
For those who have been aware of JISC before: we have undergone a reorganisation. Gosh, it's taken a while. Maybe four years ago we were not a legal entity, and now we are a legal entity, and all of our different services are run more holistically as one, under a single governance. I won't read out our vision and mission to you, apart from the vision, which actually makes me quite scared: to make the UK the most digitally advanced education and research nation in the world. It's quite a challenge and an aspiration, but I think it's really about keeping us focused on making sure that we are delivering solutions.
Hello Peter, hello.
I mentioned there the way in which we've been redesigned and reorganised. One of the things that we have looked at is how, in the current economic climate, we can deliver more focused research and development and innovation activity. Previously we had a lot of grant funding that was a "let a thousand flowers bloom" sort of thing, whereas now we've got a process called co-design, where we work very closely with sector representatives to identify the key areas in which we should focus all of our R&D or innovation activity.
These logos here represent the six areas that we are currently focused on within research and development. The shared research data management service falls under Research at Risk, but just to give you a flavour of some of the other activities, take learning analytics. Universities and colleges wanted to work with us on two aspects of learning analytics, I think. One was really just a community of practice: sharing lessons and trying to understand how to implement learning analytics, so there's been a lot of work around consent issues and ethical issues. The other was to trial some of the solutions. We have now got a shared data warehouse for the sector, and we've also got two analytics engines, one a proprietary commercial solution and the other open source tools, so people have options. And we're working with the community that's been built to identify new areas, to do some R&D and to develop new analytics tools, all of which is obviously about supporting the student experience. It's proving quite popular, a little bit too popular actually: I think there are about 60 universities that want to work with us on using the developing shared services, and we're having to manage that demand. We'll see how that goes, but those are the sorts of things that we've been doing in our research and development portfolio.
So, going back to research data management and the policy landscape in the UK. There has been a lot of work from the research councils around research data management for many years. We have seven research councils in the UK that cover different disciplinary areas, and for some time there's been quite substantial provision for social science data and for earth and environmental science data.
But in terms of the drive towards improved research integrity and efficiency, and the reuse agenda: back in 2011 the seven research councils, whose umbrella organisation is RCUK, published their principles for research data management. Of course, these are quite similar to the sorts of policies and principles that you see in the US. There's the belief that research data are a public good and should be openly available with as few restrictions as possible, and all of the things that come with that in terms of management: making sure that data is discoverable, but also respecting some of the constraints that there might be around privacy, sensitive data and commercial collaboration. And then the right of first use for the originator of that data. I've highlighted "sharing is not free" because, as I think was highlighted at the round table held this morning, funding and costing for research data management is a significant issue. In this, the funders were expressing the way in which their policies support research data management in the grant application process, but that's not the whole story, and so there is quite a big issue in terms of sustaining research data management.
Then, in terms of the UK landscape and some of the top-down drivers, I think this is quite an important report, or really the output of an inquiry, held by the Royal Society. I don't think "open" was in the title of the activity at first, so there was anticipation about what the Royal Society would come up with when they started to look at the way in which science and research should be undertaken, and perhaps changed, taking into account technologies in the digital environment. It was an extremely thorough piece of work. A lot of people were consulted, and it came up with quite a bold assertion about research data being made free and open access.
But there was quite a nuanced approach here: they talked about "intelligent openness", which means understanding that you have to do things to make data accessible and usable. And, as you'll see here, there was this expression about making data usable and accessible for other specialists in the same or linked fields. That's not to say that there isn't an aspiration to make data available and accessible to all, but it recognises that many different things need to be undertaken to make data shareable and reusable, so first and foremost your primary audience is probably your peers and others within your discipline or linked disciplines, and you need to make sure that they have access to that data. So it was quite a nuanced approach, but it was quite bold. Professor Geoffrey Boulton, who oversaw this, says on occasion that if you do not make your data openly available then it's tantamount to scientific malpractice, and he's quoted quite a lot as saying that.
And then, to go a little bit wider afield, and to outline something which I think is quite aspirational and very ambitious and bold: the European Commission is now taking forward something they have termed the European Open Science Cloud. The definition of that Open Science Cloud is up for discussion at the moment, and there are particular groups working on what it really means and giving further advice to the European Commission. "Cloud" is perhaps a bit confusing, because it doesn't necessarily mean cloud technology. It really is about a more transparent and joined-up research infrastructure, and about sharing data across borders, so it's part and parcel of their Digital Single Market initiative. Overall, as highlighted here, it's looking at removing the technical and legal barriers to the sharing of data, and it's ambitious in that it aspires to look at private and public actors within that space.
And also to try and coordinate the member state research infrastructures with the European infrastructure, because clearly there are shared infrastructures across Europe as well as the infrastructures that member states provide themselves. It's also looking at disciplinary infrastructures, so a big issue is going to be governance around this. It's early days: we expect a communication to come out in March, and that communication will really only be the start of development.
However, just to point to some of the policy landscape that we are looking to respond to. Something that, again, was said quite a lot at the round table is that we don't really have many incentives for researchers to share research data. This piece of work, undertaken by medical research funders and social science research funders, concluded again that there is little formal recognition for data outputs. That seems to be quite a big problem: we've got all the mandates saying "let's make data open", but we don't really have the incentives in the system. In particular, within the UK environment there is the Research Excellence Framework, because we have this dual support system, and the outputs of research are evaluated via this process. Whilst data is recognised as a research output in that process, it counts towards things like an innovative research environment, so it's not weighted in the same way as a publication. The people who oversee that have to make sure that they have a policy that's in tune with the majority of people, and so at the moment recognising and promoting data within that process is perhaps over and above what can realistically be achieved, but they're consulting on that.
So, going back to why universities are seeking shared research data management services: we may not have many carrots, but we certainly have quite a big stick in the UK. This has come in the form of the policy of the EPSRC, the Engineering and Physical Sciences Research Council, which is a mandate that has changed the perspective, certainly from our research funders, on where responsibility for research data management lies: the responsibility is actually that of the university or the research organisation. So the research organisation needs to make sure that its researchers are aware of the policies, and also to provide training and access to the tools and the infrastructure so that they can manage their data and make it accessible. And then this final one at the bottom, securely preserved for a minimum of 10 years from last use, has caused quite a lot of discussion, I think on two counts: universities are not sure how to define the point of last use, and they don't necessarily have the infrastructure in place to make sure that data is preserved. So that's been quite a challenge. The policy was first announced in 2011, then universities were given a period of time to develop a roadmap for how they were going to meet it, and it was from the first of May this year that the mandate was in place. There's been a change in the mood during that time, I think: at first hands were thrown up in the air, then there was anxiety, and now I think universities are seeing this as an opportunity to steward their digital assets. In responding to the mandate, they have also looked at how they can support the curation and sharing of data across different disciplines.
The other thing that mandate has done is put research data management on the top table within universities, and that's quite a change. You do now actually have Vice-Chancellors talking about it. An example is that there's just recently been this Concordat on Open Research Data, which was spurred on by that mandate. Again, because of some of the anxiety, it was an expression between research funders and universities to say: okay, we all aspire to sharing our data, but we know it's a journey, we need to accept that it is a journey, and so we will have this shared statement. And I know that in discussions of the Concordat on Open Research Data, Vice-Chancellors have said: okay, that's fair enough, but we want the solution, so can we actually address this, and how?
So what we have is obviously a fairly small sector in comparison to your own. We have a number of organisations, let's say 150 to round it up, that need to make sure they have a research data management solution in place. They have been working on that, but there are different rates of progress and it's certainly very fragmented, and a lot of the small specialist institutions don't have the capability to provide services. When we were consulting on what to do about research data management, I think the most common thing that was said was: we really don't all want to be doing this on our own, and we need to be able to share lessons.
This shows a poll taken by the Digital Curation Centre at a workshop. It is a while ago, but I still think it's quite a good indicator of progress. On the left-hand side you've got the key elements required in order to provide a research data management service, all the way down from your policy and strategy to data management plans, and then the skills and training, access and storage, management and cataloguing, and publishing and preservation. The red is actual implementation. I'm trying to remember, but I think about 40 universities were represented in this poll, and it was 2014. I think things have progressed since then, but it shows you that things are really quite immature in terms of having a sustainable research data management infrastructure, and a lot of the blue is really just thinking about things. So down at preservation, and at business planning and sustainability, there's a lot of thinking, but there isn't much implementation at this point in time.
And then there are some surveys that have been undertaken. Really the point of this one is just to show storage capacity: it shows that in about three years' time people are anticipating quite an increase in the amount of data that they'll need to store.
All of this obviously means that people are keen to share experiences and share services, so we have tried to devise a way in which we can address that at a national level. Going back to Research at Risk: when the consultation on where JISC's funding for development should be focused was undertaken, this was how the stakeholders expressed what they wanted us to address, namely realising a robust and sustainable research data management infrastructure. And then, if you look at the success criteria there: a cost-effective national brokered infrastructure as a service, and perhaps more aspirational things such as
research outputs will be more discoverable and reusable, and fewer impediments to doing research, but then also that research data management is business as usual. So again, quite ambitious when you're thinking about how mature the landscape is.
In order to address that we have undertaken some analysis. It was quite difficult for people to identify what it was they wanted to share, so one of the first things we did was to try and define the architecture for research data management, obviously based on a lot of practice that had already been undertaken. I should credit here Stuart Lewis at the University of Edinburgh, and also another Lewis, John Lewis at the University of Sheffield, who had done quite a lot of work previously; this draws from that and other activities. It goes from your data management planning activities to active research data management, then moves into the area where you are ingesting data for publication, so storage for access if you like, and then storage for preservation. On the far right-hand side it shows that you need to link into other services, national and international. That middle top quadrant is the research information management system, which has become a critical piece of institutional infrastructure, and of course you do see a blurring between the repository infrastructure and the research information management infrastructure.
We used that as a basis to talk to institutions, to see where we should try and tackle shared services first, and this shows the areas that we will prioritise. That pink section is, if you like, the section which will enable institutions to meet their funder mandates: going from data management outside of active data management, where you are ingesting the data into a repository, and then moving into preservation and storage. But around all of that there is other activity, which is indicated here: of course the Digital Curation Centre's data management planning
service, and we also have some agreements and services around the storage end. We have a framework agreement with Arkivum, which gives three copies of data, one of which is in escrow, and also agreements with cloud providers. So the active part of our research data management service, where we are going to try and develop new services, is data publishing, archiving and preservation, but we do have other elements which we will be trying to link into and link up to.
On the process: I have gone through the fact that we have gathered requirements. What we are doing now is identifying the pilot institutions to work with, because in order for this to work we need to have some testing with different types of institutions. We are also undertaking a procurement process, mainly to commission different providers. Commercial providers, maybe community initiatives, and universities or development groups can all bid to the procurement, so there is going to be a lot of orchestration between these different groups. Then over a 24-month period we will undertake the development and hopefully launch a beta service, and all the way along we will be working on the business plan for making a fully robust and sustainable service, hopefully within that 24 months.
Some of the key features, in terms of the things that people really want the JISC shared service to start to address for them. First, the user experience. People seem to think that it's pretty poor at the moment, so by working together can we try and address the user experience? Some of the products that are available are obviously really quite strong in that area, but there are other products used around this area which are quite popular within the UK but where the user experience is not optimal.
Then one of the biggest things that has come up has been preservation. In absolutely every discussion that we have had, whether it's a workshop, a survey or a one-to-one interview, preservation has come up as the key gap, an issue that people want us to try and address, primarily because they haven't got onto it yet themselves at the local institutional level. I mentioned we've got the agreement with Arkivum, but this is really talking about preservation where you've got the tools in place to make sure that you're developing the appropriate metadata and the archival information packages, and dealing with processes of transformation and migration. We have also introduced emulation into our requirements, and that may be quite ambitious. I think our preservation requirements are quite ambitious, and we're going to have quite a lot of R&D in that space.
The other key issue that people see as something to work on through the shared service is interoperability. I mentioned the current research information management systems being a key system within institutions. Again, there are some issues around some of those actually having APIs in place to integrate with some of the other systems around research data management, so that will be one of the key areas that hopefully we can tackle more efficiently if we work collectively. And then there's also interoperability with other infrastructures and initiatives, so OpenAIRE over there on the
right-hand side, which is the European infrastructure for open access outputs and is also supporting the Horizon 2020 data pilot. So it's about making sure that whatever we develop for this research data management service supports the standards needed to talk to other infrastructures around research data management, and then obviously key things like ORCID and DataCite.
Moving on from that: those were the three main things, summarised here in terms of our vision. So, trying to address researchers' needs, making sure they don't need to think too much about research data management, and I think some of that is making sure that we have a way in which metadata can be captured easily, and just enough of it to make this happen. Then also the interoperability issue, and of course preservation. I really couldn't believe how much preservation came up. It seemed to be such an issue that at one point we thought, okay, we need to do some more exploratory work on that, but then we decided to just be bold, put it into our shared service requirements, go for it and see where we go.
Just to show how the research data management activity links to some other bits of work going on in the UK to support research data management, there are a couple of things. We have been developing a service around research data discovery that's based on CKAN, and at the moment it's harvesting data from 14 universities and five disciplinary research data centres. There is also now the UK ORCID Consortium, which the research councils also signed up to, I think two weeks ago; 50 universities have joined the UK ORCID Consortium, and around that there will also be some additional development in terms of integrating ORCID into different systems. At the moment we've also got a prototype usage statistics service. I mentioned the fact that universities are not sure what the last point of access is, so we're doing some work around usage and what usage means, and we've got I
think 15 universities working with us on that prototype. There's work around policies as well, and then the Digital Curation Centre's DMPonline. Those are other core activities that are not being taken forward as part and parcel of the shared service but run alongside it; we will work with them, and as far as possible they will be joined up.
Then there's a set of other activities which are more about advice and guidance. This issue of costing will be an initiative we need to take forward: there's been quite a lot of work in the UK on trying to understand the cost of data and how it can be supported within grants, but there's more to be done there. And there's something else we want to be able to do. We've had a small grant funding programme called Research Data Spring, which has really been very small amounts of funding and has brought together publishers, universities and other stakeholders just to work on small innovations together. Where we can pick up some of those outputs, it would be quite nice to build them into the shared service. One in particular is called Giving Researchers Credit for their Data, a collaboration between F1000 and the University of Oxford. They've developed a tool and a protocol which allow you to easily publish a data paper direct from the repository, so that's quite interesting, and it would be quite nice if we could pick up on some of those features as well.
Very briefly, just to say that clearly there is another set of services that support open access, and you can see some parallels in terms of discovery and policy analysis, such as the SHERPA services, which I think quite a lot of people use. As far as possible we'll be looking to join up what we do within the research data shared service with those as well.
So this represents the key layers of what we need, from the user interface down to the preservation layer. We have now developed the requirements in quite a lot of detail
but at the moment what has gone out is just the pre-qualification questionnaire. These are the eight lots that we have decided we need to focus on to develop the shared service that, as I say, addresses the core components needed to meet the funder mandates. Just a couple there. The research data repository speaks for itself, but then there are things like the research data exchange interface, which is looking at ingest, bulk upload and migration. We've added an R&D strand on research information and administration systems integrations. This came in quite late, after consultation with universities, and that's primarily because of the concerns that people seem to have around interoperability with those systems. So we'll see what happens there. I'm hoping that some of the vendors will come into the initiative and that through it we will support them to develop their APIs, so hopefully it's a win-win, but we shall see. If they don't come into it, then we'll have to find other ways to deal with the problem.
Then around research data preservation there is the provision of a platform, but as I said, I think we will also have to look at development. So lot six is around preservation development and tools, again trying to build in some R&D around new file formats and updating some of the open source registries that are available. And then there's research data reporting, which is to build a dashboard across the research data services that would report on performance, but also on things such as volume and perhaps compliance with policies. We want to undertake that as we go. There have already been some developments in that space, but this is really about trying to provide a national service, and I imagine we will try to have more local reporting as well as shared UK reporting across the service.
So it is quite ambitious, and there are some things that we need to address here, which is why I say, you know, is it a shared service or isn't it? Because clearly universities
have said they want a more flexible approach. We would expect, for example, that we don't just offer one type of research data repository: there should probably be two or three options, to deal with the different requirements and also, I suppose, with where universities are in terms of implementing their infrastructure. We also need to be able to deliver the services as a multi-tenanted hosting option, and to make sure as well that institutional branding is supported. So I think over the two years we'll really be finding out exactly where the sharing is, while trying to support a flexible approach so that institutions that only need part of the solution can take the bits that they want. It is going to be quite a lot of coordination.
Today was the closing date for expressions of interest from institutions. I can say that before I flew out here we had enough institutions that had expressed interest, so we're okay, and it is a real mix: we've got some medical institutions, arts institutions, research-intensive institutions and also some with more of a teaching portfolio. We'll have to work out how we select the optimal number and group, and it's really going to be about testing the different sorts of cases.
As for the sorts of things they have said they're interested in, I've mentioned most of these already. There was quite a lot again about preservation, but there was also a lot that came up around sharing and developing practice, so people really want to be able to share and see this as an opportunity. There was a bit about moving to centralised services, and as I say, that is something that would be tested. And then there were more edge cases, like automation with electronic lab notebooks. Clearly what we're doing at the moment would deal more with small and medium-sized data and not necessarily big data, so we're going to have to start looking at that further down the line.
So that is about where we are in terms of taking forward this
shared service. I'd welcome questions, and it would also be interesting to hear from you about your experiences, in particular, as you might have guessed, with preservation. There is also a real issue in terms of engaging researchers, so it would be interesting to know what people are doing in that space. Thank you.