When you're doing science, you're usually building on existing ideas, existing knowledge, existing results and analyses, existing data. So if you're building on existing data and existing results, doesn't it make sense to build on the existing software that produced those results? Well, we computer science people would probably say yes, but this turns out to be something that's extremely rare in science. And if you ask scientists whether they do that, the answer is often "no", or "it's too difficult", or "how on earth would I do it?"
And the reason for that is really to do with the pressures on scientists, the practical day-to-day pressures, and the way that science is funded. So typically what happens in big-ish groups is that a relatively junior person is required, as part of their work, to produce some results using software somewhere along the line.
So they write a bit of software, or they use a spreadsheet, or do something with software in some folder on their laptop, called something like "My research, July, on a Saturday/this might work one day", that sort of thing. And they do the work, they run the analysis, and that gets published, because that's the dissemination part which is so important to science. And then that junior person is almost certainly on a short contract. Either they're a student doing a PhD, so they're only around for three or four years, or a little bit longer if they're part-time, or they're a postdoc funded by a grant, and the grant runs out and the person moves on.
And then if that software is to be built upon, the next postdoc or the next PhD student has to first of all find the work that's already been done, which may be squirrelled away on someone's laptop in a bizarre place. They need to find it. They need to build it, if it's the sort of software that needs to be built and compiled and what have you. They need to run it. And then they need to understand it so that they can modify it in some way. These are all steps that are often very difficult for scientists, because what they're missing in that process is all the things that we software people and hardware people have come to rely on as part of open source processes and communities, and just the way that people do things.
So scientists are usually missing things that we would consider critical to our day-to-day work, like version control, or some kind of repeatable way of gathering together all the libraries and the different bits of software that we need to run a build, or to create some sort of infrastructure for the software to run on. They often don't document what they're doing, and it often wouldn't necessarily be obvious to them which bits are important to document.
They often don't test what they're doing in the way that we software people think of testing, so automated testing. And they often don't think about automating things that we would automate as a matter of course, because software people are taught to be very lazy in a very particular way. If you do something once, you just do it, but if you do it twice, you automate it, because why wouldn't you? The computer's there to serve us, isn't it, not the other way around. But this often isn't obvious to people who haven't had much software development training. So they'll do things again and again and again, and write out really long things on the command line, or do something in a spreadsheet that would be so much quicker in a for loop, because they don't necessarily know. So these sorts of engineering disciplines are very rare in science but obviously very widespread in open source communities.
So if you're a scientist and you're in the unenviable position of starting a new job or starting a studentship, and you inherit some random pile of files from the last person who just left, then first of all you want to be able to find the software you need. Then you need to be able to build it. Then you want to be able to run it. Then you want to be able to check that it produces the results that were actually published in the last paper, and that it does what you think it's going to do. And then, and only then, can you extend that work to produce some new results and some new science. And those steps are often all extremely difficult and sometimes impossible.
And there are really three big reasons for this, which are not particularly exciting reasons. They're really very pragmatic reasons. So one is that scientists, and this sometimes includes computer scientists, are not taught these very day-to-day skills, like how to use Git and how to use the sort of tools that we would use all the time.
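As an illustration of two of the habits described above, running the same analysis in a loop rather than by hand, and checking that inherited software still reproduces the published numbers, here is a minimal Python sketch. Everything in it is hypothetical: the sample names, the measurements, the "published" values, and the placeholder analysis (a simple mean) are invented for the example and do not come from any real study.

```python
import math
from statistics import mean

# Hypothetical measurements, standing in for data files inherited from a predecessor.
SAMPLES = {
    "sample_a": [1.0, 2.0, 3.0],
    "sample_b": [2.0, 4.0, 6.0],
    "sample_c": [10.0, 20.0],
}

# Hypothetical values "as published", used as the reference we try to reproduce.
PUBLISHED_MEANS = {"sample_a": 2.0, "sample_b": 4.0, "sample_c": 15.0}


def analyse(values):
    """The analysis step; here just a mean, as a placeholder."""
    return mean(values)


def run_all(samples):
    """Run the same analysis over every sample in a loop instead of by hand."""
    return {name: analyse(values) for name, values in samples.items()}


def mismatches(results, published, rel_tol=1e-9):
    """Return the names of samples whose result differs from the published value."""
    return [
        name
        for name, expected in published.items()
        if not math.isclose(results[name], expected, rel_tol=rel_tol)
    ]


if __name__ == "__main__":
    results = run_all(SAMPLES)
    bad = mismatches(results, PUBLISHED_MEANS)
    print("all published values reproduced" if not bad else f"mismatches: {bad}")
```

A check like `mismatches` can be kept alongside the code and rerun after every change, so that whoever inherits the work can tell straight away whether it still matches the paper before extending it.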
There isn't a lot of time in a degree to teach absolutely everything you want to teach. So if you're a genomics lecturer and someone says, "I want to make sure that all your genomics graduates know how to use version control, because it's a really key skill for scientists", you would say, "Well, there's an awful lot of genomics for these people to learn. What should I take out of my syllabus so that I can teach them version control?" And it's very difficult to persuade people that a core bit of their science is worth dropping so that their students can have these important skills. So that's the first.
The second issue is that there isn't really a way to credit people who do this kind of development work. If you produce a new piece of science and you publish it, your name will be on a paper that describes that piece of work. But if you're just the person who wrote all the scripts and maintained them and made sure that the server didn't crash just before a deadline, and you don't get your name on a paper, then all of that work, which may be months of work and pretty hard work for a lot of people, is not something you can put on your CV. It's not going to help you get your next job, and your contract may be running out pretty soon. And it's not something that anybody is going to know about or notice in terms of your career. So that's a barrier for people.
But the last thing is that science has a set of incentives. It's important to get your work disseminated and published so that people know about it, so that you can get the next grant to fund all the people that you already employ. You will lose those staff if you don't get the next grant, and without getting new staff in or renewing the contracts of the staff you've got, you won't be able to publish the next paper. You won't be able to do the next bit of work. So this cycle of publishing and grant writing is quite a tough cycle. It's a merry-go-round.
And scientists often have a perception that if I say to a junior member of staff, a postdoc or a PhD student, "By all means spend a month, spend two months, sorting out your million files of silly scripts and making them readable and maintainable and testable and all these different things", that's two months that they're not spending doing the next bit of work to publish the next paper to get the next grant. So people think of this sort of software development practice as subtracting time. Whereas those of us who've been trained in software development will think, "Well, I'd really better check this into version control, because I know I'm going to wake up tomorrow and I'll have forgotten what I did today, and I'll have forgotten which file I put that work in and where it went and how to run it and all the rest of it." So we who've been trained this way would tend to think that all these important practices are not subtracting time. They're actually adding time, because in the future they're going to make my life easier.
So this sounds a bit bleak and a bit depressing, but it doesn't have to be. The good news is that there are a lot of people now working to combat all of these issues: both the issue of credit, finding ways to credit people for their time if they're building research software, and also training. So these are two institutions that do both these things. On the right is the Software Sustainability Institute, which is a body that's funded by several of the major funding councils in the UK, and it exists to help scientists combat all these problems. They have a group of people who will go to scientific groups around the UK and help them to take their morass of very complicated code that nobody can understand and turn it into something that's usable and can be built upon.
And the Software Carpentry Institute is a global group who train people up to run good workshops and do bits of teaching for people who are working scientists, to replace the teaching that you might hope these people had had in their undergraduate degrees. So very often this sort of work is helping people who maybe have only ever used spreadsheets but would be better off writing a script, because it would cut their workload down by a half perhaps. Or helping people who have some scripts or MATLAB files but they're all a mess, and helping them use version control so they actually know what they've done every day, and so forth.
So that's software, and the issues with it, and what could possibly be done to help scientists do better with their software and be more productive with it. But this also raises a question, I think. If you hold the value that it makes sense for scientific software to be open source, because it helps scientists build upon the existing knowledge and the existing practice that's there, what else should be open source? Should it just be the software? Or should it also be the data that people have gathered? Or should it be the data and the software, because with the data and the software you can run the analysis and check whether somebody really came to the conclusion that they ought to have come to, or whether a published result is correct? Or should it be the data, the software, and all the lab equipment that produced the data as well? Or should it be the data, the software, the lab equipment, and the papers that disseminated that work?
Papers are actually quite a big and controversial issue in science, and if you want to see a scientist really upset and ranting for a very long time, just ask them about the state of publication in their particular area.
So it's often the case that the authors have worked on a paper for a long time, and they may have been publicly funded, and when they publish the paper they get no extra money from the publishing company. Somebody edits the paper, and they're not paid either; they're a volunteer too. But the people who read the paper have to pay, which is a slightly odd model. It means that academic publishers have a profit margin that's larger than almost any other industry's, which is nice for them, because they can subsist on a very small number of staff to produce these pieces of work that have really been produced by volunteers, and then sell them at a very large profit. That of course causes a disincentive, because it means that you can only read those papers if you, or your university, or whoever is employing you, has a lot of money. Only so many universities have a lot of money, and many universities around the world do not. So there's a sort of power imbalance there. So if you want to upset a scientist, ask them about publishers.
So you might also think, well, if you believe all of that should be open source, and you know that most scientists are of course funded by you and me through our taxes, maybe everything should be transparent. Maybe scientists should put their expenses on the web and be very transparent about what they're doing. In fact, the research group that I work for now has all of its expenses on the web, so you can see exactly what we're doing. Our expenses are not terribly exciting, I have to say, because we computer scientists don't buy synchrotrons and things like that. We just take flights and buy coffee occasionally. So you can read them, but it's not thrilling, is what I'm saying.
So there is a move now in science to increase the amount of openness, partly for some of the reasons that Peter talked about previously: partly for sharing, partly for building upon one another's work, partly for driving down costs, because if you can drive down the cost you create more efficiency for the taxpayer, if we're thinking about relatively rich universities in the UK. But you also open up science to all sorts of other people who may not have our resources and our finances behind them, which is great.
So this is just a couple of examples. This is a paper from a group in Cambridge who produced a 3D printed microscope. There are several parts to a microscope. There's the computer part and the software, which in this case is a Raspberry Pi. There's the lens part, and in this particular case that's the Raspberry Pi camera module. Then there are all the other mechanical bits that keep the sample stable at the bottom of the microscope and move the lens around and all this sort of thing. And that's what these people have produced in this paper: a 3D printable microscope that anybody can download and reproduce themselves, if they have a 3D printer, for a very cheap price. The dominant cost of this particular microscope is the Raspberry Pi, and when you think how much a Raspberry Pi costs and how much a microscope would normally cost, I think that's quite an impressive feat.
So this idea is gaining momentum now. There have been some conferences, which is what this Nature article is talking about, on creating whole laboratories full of equipment that's 3D printed, that anybody with some 3D printers and some skill can replicate. Which is great, but it also poses some really interesting challenges. If you don't have your 3D printed microscope, and instead you buy a very expensive piece of equipment from a company, that company will come out and calibrate your microscope and your very complicated equipment.
So open source hardware is a great idea, but it presents a number of challenges, like calibration. How do you know that your microscope and the microscope in the university down the road are really reading the same things? We don't yet.
So this talk has been a very general overview of the problems in science and how open source software and hardware might address some of those problems. But since you're the BCS, you can maybe help out with some of these things. If you ever feel like doing something about these problems, the easiest way to get involved is to help out with Software Carpentry. If you have some software skills, then almost certainly you can help out at a Software Carpentry workshop. You do not have to be a particularly amazing developer to help out, because the problems that people face in science are often really very basic problems, with solutions that anybody could be taught if they had the opportunity to learn. So Software Carpentry is really about giving people the opportunity to learn these things.
So this talk is really a sort of big advert. If you're interested in these things, if you want to help out, come and talk to me over beer and I will tell you more about how to get in touch with these people and how to get involved, and maybe you can come to a Software Carpentry workshop down the road. So thank you very much.