[unintelligible]

We've got four faculties: Engineering and Physical Sciences, Medical and Human Sciences, Humanities, and Life Sciences, across which there are 20 schools and hundreds of specialist research groups. We've got 5,000-plus research staff and 3,500-plus postgraduate students, all of whom are doing research and working with data, and that's the wider problem we've got to handle. In total, £279.4 million of external research funding was brought in in 2010, which means we have a big responsibility to the people who are funding us to manage the data.

Research data management in Manchester. Prior to any of our projects, we had an institutional scholarly repository called eScholar for journals, published papers, et cetera, and theses. In 2009, we were funded by JISC for a project called MaDAM, which is Manchester Data Management. The reason we went forward with this was basically, as Mark referred to, issues in the UK such as Climategate and the Freedom of Information Act, which raised institutions' awareness of the problems in managing research data. There were also the funders' policy updates that Mark referred to. EPSRC in particular, as Mark mentioned, have stated that universities should have a path for developing a framework for research data management later next year.
And then something in place by 2015. So they're actually saying that you as an institution should have some sort of framework for research data management. So that's another reason.

OK, wasted resources. You've got scientists who are creating data at instruments, doing their research, and keeping it to themselves. Well, really, these instruments that they're using are very expensive, so we want to save money that way. But it's also public money, and that's wasting public money. And, as Mark pointed out as well (I've got to mention Mark a lot in this, as one of our funders), we've got a responsibility to the public. And also there's the risk of data loss.

So MaDAM, our initial project, was working on creating a data management system for just a few users. We didn't try to tackle the whole university at this stage. We worked with Life Sciences and Medical and Human Sciences, a couple of groups in each. We did a user requirements analysis, only to find out what we already knew, to a point. And that was that data management throughout these groups was ad hoc and inconsistent. How they handled their data depended on the user group. There were multiple copies of data all around the place: on laptops, on USB sticks, on external hard drives at home. Wherever you can think of. It was difficult to track down which was the right version. After a certain period of time, if it's not managed properly, it goes very cloudy. As I mentioned, USB disks, external hard drives, laptops, even large servers somewhere: it was a problem of where the data was actually being stored and transferred. So, fragmented and decentralized storage, which was a nightmare. And of course, just having your data on an external disk drive is not backup. It's not something that you can rely on.
As was mentioned, my background is scientific visualisation. I know how many times people's external hard drives have been fried, even when they've just plugged them into one of our visualisation systems. You need something that has a proper service behind it with regard to backup, etc.

Also, because of this approach by the groups of keeping things to themselves, there was a limited means of disseminating. A little story: when I started my PhD, I went looking for a really large data set to visualise, something that would be freely available that I'd be able to take and work with. And it was so difficult. I mean, this is ten years ago, but it was so difficult to actually get somebody to share data with you. Why should I have to go out and recreate data if I'm just trying to do an experiment on, say, how to parallelise big data? Anyway, that's a bit of a sidetrack. So, limited means of disseminating.

And, of course, there were no archiving policies to support long-term curation or the retention needs of our funders. Normally, one of the approaches is just to store the data under a desk on the PC or on hard drives. That is not complying with the funders.

So MaDAM brought about a simple software solution that allowed the researchers to handle their projects and their research data. It linked up with our research office systems, so that in the data management planning exercise we cut down the duplication of data input, such as who my collaborators are, where my data is going to be stored, who I can share this data with, and ethics issues such as whether it has been signed off, et cetera. So anything that you could think of in data management planning was involved. And it also gave us a platform for compliance. But, as I said, this was only for Life Sciences and Medical and Human Sciences.
This project was funded from 2009 to 2011, at which point, at the end of 2011, we were lucky to be funded again by JISC on a project called MiSS, which is MaDAM into Sustainable Service, and which is looking at creating a research data management infrastructure service for the University of Manchester.

In setting up this project, we got 250K from JISC to support the project team. You will notice the 750K from the main university. Of that, 245,000 was for storage. And remember, this is a transition project. It's not the service. The service will require a hell of a lot more money for storage, staff input, et cetera. The rest of the money that was put towards this project was for staff time.

In working towards a service and a research data management infrastructure, we had to involve a lot of people from across the university: IT services people, in particular storage and infrastructure services; the library, where the research librarians are going to be helping with teaching about research data management; and the research support services, which are the research offices. We've got a percentage of the time of about 27 research business managers to work with the researchers on their data management planning, et cetera. So that's just a small example. And of course there's a steering group and also a technical architecture group, which are all about delivering the service.

OK, so one of the main outputs of MiSS is the policy. As I said, it was ratified on the 16th of May. When we started looking at doing this policy, which was really a bit before the project took off, we started looking around and thinking, where do we start on writing a policy? We've got the RCUK common principles, which Mark put up. That is something we thought was really important and would be a starting point. But as well as that, we've got all the other individual funders' policies external to the RCUK councils.
We've got European councils. We've got charities, et cetera. We've also got our own University of Manchester policies. And one thing that you don't want to do in writing the policy is to have your policies contradicting each other. So an awareness of what policies actually exist in your institution is very important. And we've got thousands at Manchester. So part of the money went on buying in the time of somebody who was a policy maker.

The Digital Curation Centre, again in the UK, is a fantastic point of contact. It has a list of policies that have already been ratified. In particular, Edinburgh's policy was very useful and a point for us to start from. They were, I think, the first in the UK. Well, Kevin, they were the first in the UK, weren't they? Oh, OK. That's why I was very wary of saying they were the first to actually publish a policy. It was a good example for a lot of us universities who are involved in JISC research data management.

One thing I should advise is that if you are going to have a policy, have an academic champion who's really good at going out and getting buy-in and support for your policy. We started moving towards a policy very early on, before the project, in about September last year. The policy was ratified in May. We didn't expect it to go through that fast. So now we've got another issue on our project, in that we've got to have an interim service that we are going ahead with.

And when you are writing the policy, you want something that's simple and clear for people to read. You don't want something that's like a political agenda. You want to make it easy to understand, because it's so easy to misinterpret things that are written in legal terms.

OK, so, sneak preview. In writing the policy, what was necessary was clear ownership and responsibilities.
Now, Mark, as I mentioned before, talked about the NERC policy and the EPSRC policy and who the onus should fall upon for being responsible. What Manchester has taken is a shared approach. So the university will support its researchers. By the way, the policy is 12 points. Behind it is a lot of procedures and guidance that has to be written, and is currently being written, before we actually make people do things from September. So we want to make it sound as if it's not a stick. It's a nice, easy place to work in, and easy to put out to the researchers.

OK, so the first point is that we adopted the RCUK common principles. We've also said that we will take into account any other research data management requirements. So we're thinking about policies from funding bodies as they come about. This policy is not a fixed thing, by the way. It is something that will be reviewed continuously as the funders' policies move and grow.

What we also talk about is intellectual property rights. There was a question in the previous talk about who owns the intellectual property rights. At Manchester, our IPR policy says that for staff, the university owns the IPR. PhD students who are paying are a different kettle of fish. This is where we had our first stumbling point, and where you have to involve your university lawyers and make sure that what you're going to put in is the correct wording. This is where I talk about legal wording not being the way to go, but we need to put out simply that, again, if you're a PhD student, unless agreed otherwise, you work with the Research Data Management Service at the University. As well as that, there are PhD students funded by Research Councils. As has been said before in the talks, it's public money. So they have an onus anyway to share their data and to manage their data accordingly.

Also, one thing to be very aware of is multi-partner collaboration.
We talked before about data being stored in many different places and ownership of data being a bit muddy. For us, we said that it was the responsibility of the PI, in the data management plan, to state these details about the research data: who owns it or, if they don't own it, what you're allowed to do with it, licensing, sharing, etc.

So, going on to data management planning, we've said that every research project should have a data management plan, which must be maintained throughout the project lifecycle. We see it as a live document. Now, the thing with data management planning is that there can be hundreds of questions for a researcher to fill in. So what we're doing as part of the research data management service is integrating the data management planning with working practices. By working with the research office and the research office systems, we're taking data that's entered when a researcher fills in a research application and using it to fill out the data management plan. Again, when they fill in an ethics form, the ethics information comes from the ethics system into the data management plan. It's the building up of work processes to make the researcher's life easier, because one of the feedback messages from our consultation on the policy was, "Oh gosh, this sounds very bureaucratic. Do we have to do this?" Yes, so we have to make it easier for the PI. Again, we've put the responsibility for doing the data management plan on the PI.

Where can data be stored? OK, so we're going to create this service at Manchester, which has some central storage. However, we are very aware that research funded by councils such as NERC have their own repositories.
So what we actually say is, OK, if you've got access to another repository that you have to use because your funder says so, or that you're using because of your community, then as long as it's a recommended and approved repository in our procedures and guidelines, that's fine, just as long as you state where the data actually is in the data management plan. And, as Mark said, it would be very expensive to hold everything. It's very expensive for universities, especially when funding charities like the Wellcome Trust say you have to keep data for 20 years from the last time it was touched. How can we afford that? One of the concerns of the university is overheads. If we are thinking of costing in the retention of data for a 20-year period, our competitiveness goes out the window.

Right, so going back to the policy, the next point, seven, is about metadata. "All relevant data": by the way, notice we've got specific terms, because what would be good would be a checklist that we could give to our scientists, which Mark's NERC is looking at, that tells them what they have to store. Another question that came out of our consultation was, "What data do I have to store? Everything? Or can you give us some guidelines?" That's going to be in our procedures and guidelines anyway. So it's going to be very interesting to hear more about the checklist that Mark was referring to.

So, yes, we require the researchers to create metadata that describes the data, so that in reuse people can understand what this data is, how it could be reused, what variables were applied to the data in producing it, et cetera. We also see several levels of metadata. It's not just about how the data is created or acquired, but how it can be discovered more easily, to make reuse relevant.

OK, so, retention periods. That doesn't need much saying.
We've stipulated that the data has to be kept according to the funder's guidelines and the University of Manchester guidelines. We have our own guidelines as well, and it's amazing how many researchers don't actually know what their university requires them to do, but they know what the funder requires them to do.

Openness and publishing. Again, you'll notice that a lot of the keywords we're using are the RCUK keywords. So the aim is to make data openly available to other researchers, while also protecting our own research. Of course you're going to want to have the impact from your research, so there'll be a limited period of privileged access to ensure that we get the research done that we want to. Also, if data is made openly available, we have to ensure that compliance with ethical approvals, the rights of the data subjects, anonymisation of data, the Data Protection Act in the UK, and all the IPR issues, et cetera, is maintained. We also protect our researchers in the medical and human sciences, those who cannot make their research data available. But we also protect ourselves as an institution by saying, well, you might want to share with some specified users who may be needed to verify the integrity of the research, and in doing so you must think of the confidentiality. We also realise that published research may only require some of this information. So we're kind of trying to say, tread carefully.

And of course, we haven't got a stick anymore. We did have a stick at point 12, which said that this could be a case of research misconduct if you don't follow this policy, but that was complained about. So we've got this softly, softly gentle reminder saying, remember, we've got a university code of good research practice here, and it's an overarching framework for this policy.
And hint, hint, that means if you don't really abide by this policy, it is research misconduct. If they read that into it, I suppose.

So, as I said, it was ratified on the 16th of May. The ratification path was quieter than I thought it would be. It went out to several university groups: first of all, the university research group and the research support group, who were the main people who decide whether it's a good thing to do and who made most of the comments. Then it went out to a consultation in March of this year. One thing I should point out about sending it out to consultation at such a large university is that there's going to be a breakdown in the communication path, and you must have some sort of backup plan so that, if it does break down, you can still keep your policy going through the system. Otherwise, if we hadn't got to the planning resource committee, the Senate and the Board of Governors, it would have been next year, because over the summer it goes really quiet at UK universities and we wouldn't have been able to get all these people together to approve our policy. So you have to have a backup plan for the communication.

The consultation period, as I mentioned, brought up the bureaucracy of it all. It made us aware of what researchers were actually doing and how difficult this could make it for them. It was all good advice that came through, and we're working with specific users now to ensure that it's fed back in.

And that was the main gist of the policy. It's still a work in progress, because we've got to write the procedures and guidance, so it's not public yet, but it should be by September. Okay, thank you.