 Hi, I don't know what people are expecting. This is kind of going to be a rant and a bit personal and a bit opinionated and I've tried to put some slides together but I sort of struggled to illustrate some of the things that I want to try and say, so if you bear with me that would be great. I just want to start with a little bit about my background because obviously I'm aware that scientists come in lots of different flavours, imposters come in lots of different flavours and I might be talking about something a bit specific. I'm a biologist, I didn't spend my teenage years programming, I spent my childhood playing around with frogs and bugs and things like that and I did a biology degree, I liked maths and computers at school but I didn't spend a lot of my free time on it and I did a biology degree and then went on to do a PhD in an area of computational biology and for various reasons ended up kind of feeling like I was struggling in on the back foot with the sort of programming computer side of things, mostly because I didn't have any training in that area at all. So that's the background that I'm going to be talking about and maybe some of the problems I think there might be in academia and education and some things that I think we could do to maybe sort those things out and some people that have helped me that other people might find useful. So there we go. Oh, I see it. Okay, good. Okay, so being a scientist I tried to illustrate this in graphs. So this is the amount of coding that I do and the amount of training that I had so I did have a little bit of training in programming when I was in primary school and thought it was fun but it was something we did for a couple of weeks that went away again where we just typed instructions out and made some basic games. That was kind of fun. 
I started a PhD and the first thing my supervisor said was, right, implement this evolutionary model in Mathematica, and then he left for three weeks. That was a fairly daunting task, and it made me feel like maybe I shouldn't be doing a PhD, because if I was expected to know how to do all of these things then maybe I was in the wrong place. And so that's where the imposter part of this talk comes from, because I've kind of gone on feeling a little bit like that. I felt like I was sort of struggling to learn a lot of things, struggling to catch up, not sure where to even start a lot of the time, and from talking to other people that I work with there are quite a few of us that feel the same way, which was my motivation here. So I again had a sort of two-week Perl course that I sat in on, which was part of somebody's computer science masters course, when I was doing my PhD, and unfortunately that was very basic. I could do the sort of basic concepts, but that wasn't very helpful with the sort of day-to-day volume of stuff that I needed to do. And now I'm doing a postdoc, and again there hasn't really been any training in anything computer related, even though basically I'm in front of a computer eight hours a day, a lot of the time doing coding. So this is again sort of how much of an imposter I felt. At times I feel like maybe I know how to do something, and then, wah, it's all a bit much and I feel like I probably shouldn't be there, because everyone seems to know things and I don't, and people make comments like, you're really smart so you should know how to do this. And those sort of things make those feelings of being an imposter worse: somebody goes, you're really smart, you should know how to do this thing that you've never done before. And I might try and share a few tricks I've learned for dealing with those people because I think they're wrong.
So yeah, and again trying to illustrate this, what it feels like is I can do very little and everyone else can do an awful lot. And I don't know if you can see it, but on the right is a sinking narrow boat. So I feel like a sinking boat. I feel like I'm gradually going under, and all the time I feel like there are more things I should know how to do. And it's like a constant race. Like the metaphor of a duck, where it looks like it's all right but under the water it's kind of doing this weird swimming thing. So just trying to illustrate this in a couple of different ways there. But this is how I want to feel and how I want my students to feel. As I move on with being a scientist I'm getting to supervise some students. I don't want them to go through the same sort of feelings that I did. I want my students to learn things and feel confident about learning things. I want them to know that maybe they can do something and other people can do something else, but there's a common ground and we can all learn from each other. And in the process of doing science, what we're trying to do ultimately as scientists in the system that we work within is produce papers and produce research. And supervisors will want that to happen as quickly as possible. But if you don't have the skills to do that, well, it might happen, but it might not be very good. It might not happen as quickly as they would like. And it seems like it's this constant race to get to producing papers, but what I want to argue is that if you actually spend time learning the skills that you need to do the research, whether those are skills on the computer or skills in the lab, you'll actually get to the paper-writing stage of things a lot faster and produce something a lot better. So I want to argue that taking time to build skills and to teach your students skills is really important in this whole process of doing science.
So I'm going to be bold and say I think we face a crisis in biology education. My background is biology, so I can't speak for other sciences, but my degree was very non-computational and very non-quantitative, and I think this is a massive problem. I did my degree round about the time that the first human genome sequence was published. Since then thousands of genome sequences of different organisms have been published and we have an absolute ton of data. We need people to analyse that data, and how do you do that? You do that with big computers. You need to know how to use them. So I don't know whether this applies to other sciences. I can only speak for the background that I work within. So I've got a graph on this. Again, this is trying to show the development of DNA sequencing technologies, which has massively jumped up in the last couple of years, with vast amounts of data being produced, and we need people to handle that. But it seems like at the moment that hasn't fully been appreciated. I went back and looked at the content of the degree that I did, ten years on, and it's exactly the same. There is still no programming course for biologists. There's a very basic maths course that you didn't need to do if you had got a C at, I think, Higher or A level, and that hasn't changed. Just to verify that, I talked to people at the university that I went to, and it hadn't changed. I looked at the biology degree at Oxford, which is where I work now, and it looks pretty similar. People I know who are teaching say that maths and computing are a major bottleneck for the undergraduates that they're teaching. They're really concerned that they don't necessarily have those skills themselves to teach the students, but they think the students need them if they want to go into research. So at the moment I'm doing a postdoc. I work in a lab with a bunch of people ranging from medics who are actually doing clinical stuff to statisticians.
Even the medics come into my office and go, can you show me some basic Python, because this will help me do the research that I want to do. So people really appreciate that there's a need to learn these skills, but we don't necessarily always know how, and some people are very afraid of the idea of sitting down in front of a computer and playing around, because it's something that they haven't done. If they're a medic they haven't done that. They don't know where to start. But we have vast amounts of data in biology that we want to work with. So again I tried to look up what was in undergraduate degree courses, and doing a little bit of research, about an hour's worth, I could find two bioinformatics BSc courses and one mention of a course for undergraduate computational biology that was in the process of being planned. My degree had a two-week intro to bioinformatics that didn't involve any programming, just using some basic software tools. Somebody's published a paper actually looking into this. They interviewed a whopping 937 ecology researchers, and 75% of those people felt that their degrees were lacking in maths or computing content. Interestingly, if you look at these blue bars showing whether they were satisfied with their maths understanding (in the paper they include computers in that as well), it's only the people who actually trained as mathematicians who say, yes, I'm happy with how much maths I know, which I thought was kind of interesting. And here's a quote from one of the people that they interviewed, where they're saying that given the nature of our field, even though we work with specialists who implement maths and computing stuff for us, we actually want to understand what's going on, and it's taken us months of focusing on statistics, mathematics and models just to get up to speed with the fundamentals I wish I'd been given during undergrad. So I think there's a need to improve this.
I'm not really sure necessarily how to do that but I just kind of wanted to get up and rant about it in case anyone had better ideas than I did. So I'm going to ask what can we maybe do about this within the constraints of the system that we're in because there are things about the whole science academia system that I would really, really like to change. And I think one of the things that can lead to feeling like you are an impostor is kind of inbuilt in the academic system. There's this pressure to be smart. People say you're smart. You should be able to do this. And maybe you are smart but maybe at times you don't feel like you're smart. Maybe you are somebody who finds it really hard to speak up in a room full of people. Or you might find it really hard to ask for help when you're struggling. I think it's kind of on supervisors to work on building a better dialogue with students to try and identify those needs sometimes because people may not speak up about things. If you're feeling like you're on the back foot, if you're feeling like an impostor, those feelings can be really hard to get over. So again, what I want is for everyone to feel like they have some skills, other people may have skills but that's actually not really a problem that you don't know. Those are things that you can learn. At the moment there seem to be lots of really short courses that are on offer sometimes to people who are further along their degree. And again, I'm going back to my neighbour's boat with this metaphor. So this boat that was sinking in the first picture I showed actually sunk completely to the bottom of the river. And the owner's response to this was as follows. They pumped it out. It sunk again. They pumped it out again. And eventually worked out that by standing there continually pumping the water out of the boat they could just about keep it floating. 
At no point does it seem to have occurred to this person that maybe they should try and find the hole that is letting the water in and fix that. Perhaps they don't have the money or the resources to do so, but they seem to be reluctant to try different solutions. And because there's a boat polluting the river with water lots of people are trying to give them advice. And this sort of struck me as a useful metaphor: maybe there are things in this system that we could try and change, or that I would like to try and change. So here's an illustration based on talking to people in my group about some of their attitudes to computers, and some of the things that people have said to them that have been helpful or less helpful. So on the left is the supervisor who maybe had no training in computational stuff themselves but has learned stuff because they've been doing it for 15 or 20 years. Note the propeller scars on the manatee. This is the key feature of this diagram. This manatee has learned through pain. What they tend to say to their student is, go figure this out on your own. It should only take you a day or so. I found that that translates to several different things. Often with my supervisors it is, I have no idea how to do this thing. Please figure it out for me. Sometimes it is, I figured out how to do this thing through pain and you should do the same thing, because that makes you actually learn. You'll learn if you have to do it through pain. On the other hand, some supervisors will go, yeah, that's really hard. It took me ages to learn. Do you want to look at what I've done? How can I help you out? Let's talk about this. Oh look, you can do this bit. If you can do this bit, then you can figure out this other thing. I think the technical term for this in teaching is scaffolding. So giving somebody something to do that is challenging, but not so challenging that they don't even know where to begin with it.
I think that if we want to train people as researchers, that's the approach that we need in building all kinds of skills: build them up gradually, start small, work up to bigger tasks, building on what people already know. Work out what people know and start with that. With the medics that I work with who are interested in programming, some of them have tried to do courses and they just go, it doesn't stick in my head. What does actually seem to stick with them is figuring out a problem that they want to solve and working with them to solve that particular problem. Find the thing that they're interested in, and then they actually enjoy learning the skills that they need to do it. This is probably really obvious, right? I don't know. When I looked into this, I found this idea of a fixed or growth mindset, and I think often in academia there's a tendency to have this sort of fixed mindset of viewing people as being smart, at least in some of the cultures that I've been around. People go, you're smart, and if you're told that you're smart and then suddenly you find something that you can't do, this is kind of a weird cognitive dissonance, because this almost destroys part of your conception of yourself. If you go, I'm smart, but I can't do this thing that I've been told is simple, suddenly you feel very stupid, and your ways of learning might not necessarily be the right ways of learning to get over that. If you're not used to failure because you're used to feeling smart and you're used to finding things easy, then when you try to do something like programming, which is actually about failure in quite a fundamental way, you may end up struggling.
At least in my experience, my degree was a lot of learning stuff from books and writing essays and doing various tasks that weren't of such a problem-solving nature, and that's one thing that I've personally found that I struggled with: doing a task where I'm going to fail. I think if somebody had told me that before I started doing stuff on the computer, that would have probably helped. But in fact there's the idea that your ability isn't fixed, and that you can fail and you can learn from failing, and then you can fail again and you can do it better, and then you can learn. That's the one on the right, the growth mindset. I think that's actually a much more useful mindset to have when you're learning any complex task, and I think if you're supervising students that is the one that you really want to encourage. I also think that scientists may not be software developers. We have very different backgrounds, and so we don't need to write code that is beautiful. However, a lot of the code that biologists write and that comes into published papers is of the nature: this was analysed with an in-house Perl script, and there's no further description of that. I want to put up a representative in-house R script from somebody in my group, and an example of how I think learning some skills can do it better. So I was trying to prepare something for a presentation, and my boss sent me this: here's the script you asked for. I ran it once last year. I think it worked. I don't remember what it all does. It starts with a comment and I go, great, this is going to tell me what it does. Actually the comment is probably from something else and completely irrelevant. It says, read a tree. It tries to read a file that's actually on this person's desktop machine in order to do the next bit. And when you run this thing, it doesn't ask for a tree as any kind of argument, so you don't realise that until you look at the code.
It then goes through a bunch of things where every variable that they've given is TP or MTP and it can take a little while to work out what they were talking about. It then finishes and produced some output and looked completely confusing. And they said, oh yeah, it's finished. Yeah, I don't know what that is either. And they couldn't find the original script that they'd taken it out of and put into another one. And they said, yeah, I did this last year and I don't remember it. So I want to argue that you can do better than that. So scientific code may be shitty to a software developer. But this example, I think, illustrates how it may be shitty. It may make a software engineer weep, but it will get the science done and it will probably get the science done again in six months. So I think we need a low level of skill but all the stuff that I learn actually speeds up the science, makes it reproducible and I think that's really important. So if you're looking at this example, I can't see on the screen. But basically what's different about this example to the previous one is there are comments, the comments tell you what's happening, the variable names which describe what is in them. And if you go back to that one six months later, you would probably be able to work out relatively easily what it does. So I think there are definitely things that scientists can aim for, but it's not aiming to be like a software engineer because I think these are very different things. So somebody produced what I thought was a really helpful ladder of what you could be aiming for with scientific code. This is slightly counterintuitive because it actually starts at the top, so described in the publication. So what is it, what does it do? 
They reference their implementation with some way you can download it, then you can install it, you can run it on your own data, then you could maybe replicate their results, run it usefully on your own data, include it in a pipeline, modify it and redistribute it. And these arrows that they've designated, I don't necessarily completely agree with them, because they say this is nominally required for publication. Actually many publications aren't on this scale at all, so that isn't required for publication. That first arrow should be up the top there. But the sustainability stuff I think is important. This is from a guy called C. Titus Brown who does bioinformatics, and if you're interested in this stuff at all, I would check him out. I put a star on it based on where I think I'm at with what I'm doing. I feel it's a continuous learning process. So here are some of the things I think are useful for scientists. Focus on handling large data sets: at least in my field, that's something that we're doing and that a lot of people have absolutely no clue how to do. So we're getting given computing power that we don't necessarily know how to use. And I think this is problematic. So we have one shared server in my work, and we have people who will just load it up with their stuff that would run for six months. We have a cluster and people will run jobs on the head node. And I think with those things, an hour's worth of basic training would probably get rid of issues that happen on a recurring basis. Version control and code sharing are things that do not happen at all in the group that I work in, and the group that I'm in at the moment is definitely the best I've been in for software stuff. We have issues where people will move a piece of software that other people are using, delete it, or change it to a different version, and everything that 20 people are doing breaks. Version control is something that the scientists I work with seem to be really, really resistant to.
If anyone has ideas about how to get over this barrier to the idea of using version control for the scripts that you write and want to share with other people, please talk to me, because my boss just will not do it. And people go, I don't understand how to use this SVN thing that somebody set up, and they just will not do it. Keeping logs of stuff as well, I find really helpful. Writing modular code that I can reuse: I've kind of learned to write stuff as sort of reasonable functions. Finding and using existing libraries: one of my co-workers was trying to completely write from scratch something in Python to open tab-delimited text files and open certain formats of DNA sequencing data. There's a library called Biopython, which does everything that she wanted to do. She just didn't know about it. So this is the sort of level of what I'm talking about. And this person is a postdoc who has their own funding to set up their own research group. So they're at quite a senior level and doing computational biology. Aim for reproducibility, start small and keep learning. I keep being frustrated because there are a million things I don't know, and feeling like I'm a total failure because I don't know a lot of things. I'm kind of realising that as long as I keep learning stuff and keep implementing the things that I learn, I'm doing okay. Biology educators should introduce computing early and teach it repeatedly. It's like my dad's old saying, throw enough shit at a wall and some of it will stick. I think learning is a lot like that, probably for most people. I know it's like that for me. I need repeated exposure to things in order for them to actually sink in and for me to feel confident with using them. I think problem-solving based learning is probably the way to go with stuff. Lectures, especially about computing stuff, don't stick for me. They may do for other people. But I think that's hard. Oh, I've said start small and keep learning again. Sorry.
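As a rough illustration of the habits listed above (a comment that says what the code does, descriptive variable names, a reusable function that takes its input as an argument, a log of what was run, and an existing library instead of hand-rolled parsing), here is a minimal Python sketch. It is not from the talk: the function name `count_rows_per_species`, the `species` column and the `samples.tsv` file are all made up for the example, and it uses only the standard library `csv` module rather than Biopython so that it runs anywhere.

```python
import csv
import logging
import tempfile
from collections import Counter
from pathlib import Path

# Log what was run and when, so the analysis can be retraced months later.
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")


def count_rows_per_species(table_path):
    """Count rows per species in a tab-delimited file.

    The file is assumed (for this example only) to have a header row
    containing a column called 'species'.
    """
    species_counts = Counter()
    with open(table_path, newline="") as table_file:
        # csv.DictReader does the tab-delimited parsing for us.
        for row in csv.DictReader(table_file, delimiter="\t"):
            species_counts[row["species"]] += 1
    logging.info("Read %d rows from %s", sum(species_counts.values()), table_path)
    return species_counts


# Small demonstration on a throwaway file with invented contents.
with tempfile.TemporaryDirectory() as scratch_dir:
    example_path = Path(scratch_dir) / "samples.tsv"
    example_path.write_text("sample_id\tspecies\ns1\tfrog\ns2\tbeetle\ns3\tfrog\n")
    example_counts = count_rows_per_species(example_path)

print(dict(example_counts))
```

Because the input path is an argument rather than a file on one person's desktop machine, the same function can be pointed at anyone's data, and the log line records which file each result came from.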
And lastly, this is the one thing I wish somebody had taught me about programming before I started. I think my feelings about programming are a lot like as if I was fighting a large snail with a small stick. I'm failing all the time. Maybe 80% of what I do is wrong and failing. And for the longest time, that made me feel like I was the one that was failing. And then I talked to some people who are software engineers, and I talked to some people who are some of the cleverest computer science people that I know. And they're like, yeah, it's like that for me too. And that kind of popped the balloon of feeling like I was a total failure. And I want to spread the word about that. Probably if you're failing, it's because you're doing something hard. And talk to people, ask questions, because you're not alone in feeling that way. And finally, I just wanted to share a couple of resources that I've found really helpful. This is a group called the Software Sustainability Institute. I've been to a couple of their conferences and talked to people. Their aim is to improve research software. Their slogan is better software, better research. And they run conferences and do stuff like bringing together software engineers and scientists, talking about issues in research software and training. I found them really helpful. And they put me in touch with a group called Software Carpentry to try and run a two-day training bootcamp thing for my group, which I'm in the process of doing. That's it. Thanks. Hi. I work in academia in engineering. And we don't release the code behind our simulations. How do you think we could go about pushing out code that leads to a lot of vital research? I'm kind of in the process of figuring that out myself because I haven't done that before. Well, I have, but in a very, very poor way. There are some journals where you can write purely code stuff. I'm not totally familiar with how to do that yet because I'm in the process of learning.
I'm trying to release a small pipeline that I wrote, probably as a virtual machine to make it easy for people to use and to capture all of the dependencies. And we were looking at a journal called GigaScience, which seems to be one way of letting you do that. I'm afraid I'm not an expert on how to do that yet. I want to learn, but if you talk to these guys at the Software Sustainability Institute, they're the people I'm talking to about that kind of thing. Sorry. It's not as much a question as a promotion for the... So along with the Software Sustainability Institute, we've created a research software engineers community in the UK. I don't know if you're aware of it at all. We're trying to get a group of people who are doing this kind of stuff, both from the computing side and from the biologist-trying-to-figure-out-how-to-do-it-better side, to really push to get academia to appreciate the work that goes into software development, because at the moment it's really difficult and it's hard to justify spending time writing decent software, because you don't get grants on that basis. You get grants on publications, and if you're spending six weeks making your code reusable then as far as money goes, you're wasting your time. So, yeah, if anybody is doing this kind of stuff and is interested in trying to push to improve the situation in academia, www.rse.ac.uk, we've got an AGM coming up soon and a Hack Day in London. So, yeah. Thank you. No, thanks. That's awesome. Thank you. I'd like to give a round of applause to Jane.