Please welcome Magdalena.
Thanks, Richard. Hi everyone. We are all just after lunch, so let's wake up a bit and do some hand-raising exercises. Please raise your hand if you are working in science. Cool. Please raise your hand if you ever went to university. And please raise your hand if you are programming in Python. Very nice.
Now a more serious question: what does successful software in science mean? Reproducible, documented, tested, installable, reusable, published. Nice. Exciting? Yeah? Okay. Beautiful. Nice. Useful. Great. So we all want to get there, and after the next 20 minutes you will know three tools that support software development in science. This talk is targeted at people who want to move from writing just scripts to writing software in science. We will go through a case study of the ModeRNA software. It was written in Python and has over 23,000 lines of code. It was developed over four years by four people.
Now let's get to ModeRNA's roots. Seven years ago a Polish researcher, Professor Bujnicki, looked at me over his square glasses and said, "Lena, I want to write a program that predicts the 3D structure of RNA. Nobody has done it before." And I thought, great, I will win the Nobel Prize. But how do you start such a big project? So he explained: you want to obtain a 3D model of an RNA molecule, so you need to find the X, Y and Z coordinates of each atom in this molecule. As input you will have the sequence of your target molecule, which is technically a string. This string tells you about the residues, the pieces you have in your model. And you will also have a second structure that is similar but has a different sequence than your target.
I could understand Professor Bujnicki's idea because of my biological background. I thought, what a nice idea. I was excited. I was motivated. I just started to code. And after six months, we had the first prototype.
It was three very, very, very long scripts, a set of structures that we could use for modeling, and a hundred experimental structures that we could use for testing. At this moment I started to realize that if I continued this way, I would be in deep trouble. First, we were not really sure which features we already had and which features still needed to be implemented to get the project finished. I was not sure how reliable my program was, or how to detect and remove bugs effectively. And changing something started to be very difficult. It took long. After adding one feature, I had to test manually whether all previous features still worked. So it became kind of stressful. The project wasn't so much fun anymore.
Fortunately, I had quite a nice postdoc around, the ever-optimistic Christian. He just took out a set of paper cards and said, "Lena, let's have fun. Let's write software."
The first important change we made in our development process was defining small, clear, achievable goals. Before, I was going at full speed in pursuit of a program that models the 3D structure of RNA, which was defined as one goal: a huge and unachievable task. So we started to define small steps, each clearly defined in one short sentence, and we wrote them down as single sentences on task cards. Afterwards, we defined how difficult each task would be to implement, which priority it had, and finally, who would do the implementation. For example, here we wanted to change names in a PDB file. There were some atoms that had wrong names and we wanted to correct them.
Defining such a clear task helped us to write clear code. Answering the question "I want to change atom names" is much easier than answering the question "I want to model RNA". The small question is much more concrete and the answer can be found more easily. As a result, we were able to write one function that does exactly one thing. The name of the function is self-explanatory.
The documentation states clearly what the code does, and the code has just seven lines.
So what else were we doing with task cards? We were keeping them visible. We used a pinboard with three columns: to do, in progress, and done. You can imagine how nice a feeling it is to move a card to the done column. We also experimented with setting limits on the in-progress column. I noticed that I work most effectively when I am allowed only three tasks in progress. So when I want to place a fourth card there, I first need to get one thing done.
As soon as we had all our tasks in the done column, we sat together again and did a retrospective. We checked whether our estimates of priority and difficulty had been correct. Thanks to that, we could estimate better in the next iteration. During the last year of the project, we also started to combine our iterations of cards with releases. This meant that we were getting constant feedback from our users and that we kept our project in constant motion.
So what are task cards good for? They help you to divide the big picture, your big goal, into smaller achievable pieces. With task cards, you can make the progress of your project visible. You can iterate: define a set of cards, then when they are done another set and another set, and do a retrospective after each iteration to improve your working process.
But we ran into another problem. After each iteration, we had to check whether our new features worked and, moreover, whether our old features from previous iterations still worked. What did we do? We started to use test-driven development and unit tests. So how does a unit test work? First, you collect example input, then you define which output you should get from it, and then in the test function you run the function that you are testing on your input and check whether the data you got is the same as the data you expected. Here is a live example.
It tests our previous function for fixing atom names. First, we import the unittest module, which is a standard Python module. Then we define the class for the tests, and there we have our test functions. In these functions we have a structure with dirty names, that is, with some problems. We run our function for fixing atom names on it, and afterwards we check whether the names are really correct now.
When looking for nasty test cases, Tomek Puton, the first ModeRNA user, was irreplaceable. We spent four years sitting back to back in one room, and whenever I heard Tomek sigh and turn his wide, muscular shoulders to me, I knew that new tests were coming.
To give you some numbers: in ModeRNA we had 702 test functions in comparison to 761 code functions, and whenever we developed a new feature or did refactoring, we ran these 702 test functions. And this is what could happen.
Everything went wrong. Even though this one looks a bit dramatic, it's actually an easy case. It probably means that there is some trivial bug, like a wrong variable name in some basic function. So it was easy to fix.
We could also have this case: everything worked. Boring. The developer had nothing to do except think, oh, how great I am.
And this one. This one is actually the most interesting. Sometimes it happened that a new feature we were adding affected other places in the code that we wouldn't have thought about before. Without automated testing, it would be impossible to detect that. So this is actually the situation where tests are the most useful.
So once again, why should we test? Tests prove that the code does what it is supposed to do. Automated tests ensure that your program is reproducible, and automated tests also help to define requirements and document the code. Designing tests was also a lot of fun for me. And that brings me to the last point. Having fun and being excited about what I do helps me to keep going, achieve things, and get things done.
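The unit test described above can be sketched roughly as follows. This is a minimal illustration and not ModeRNA's actual code: the function `fix_atom_names`, the `RENAMES` table, and the representation of a structure as a plain list of atom-name strings are all assumptions made for the example. Only `unittest` itself is the standard library module the talk refers to.

```python
import unittest

# Hypothetical rename table: outdated atom names mapped to current ones.
# The entries are illustrative and are not taken from ModeRNA.
RENAMES = {"O1P": "OP1", "O2P": "OP2", "C5*": "C5'"}


def fix_atom_names(atom_names):
    """Return a new list in which outdated atom names are replaced."""
    return [RENAMES.get(name, name) for name in atom_names]


class FixAtomNamesTests(unittest.TestCase):
    """Each test: example input, expected output, then compare."""

    def test_dirty_names_are_corrected(self):
        dirty = ["P", "O1P", "O2P", "C5*"]          # input with problems
        expected = ["P", "OP1", "OP2", "C5'"]       # output we should get
        self.assertEqual(fix_atom_names(dirty), expected)

    def test_clean_names_are_unchanged(self):
        clean = ["P", "OP1", "OP2"]
        self.assertEqual(fix_atom_names(clean), clean)
```

Saved under a hypothetical file name such as `test_atom_names.py`, the tests could be run with `python -m unittest test_atom_names`. Running the whole suite after every change is what reveals the third case from the talk, where a new feature breaks something in an unexpected place.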
While developing ModeRNA, we had a lot of fun working together, defining small tasks and implementing tests. Though we also had some troubles and challenges. For example, I didn't really enjoy documenting and making my code pretty. It was a bit boring. So we found a solution: we started to use Pylint. Pylint gives your code a score. It checks whether the code is documented and whether it sticks to the PEP 8 standard. It also does other useful things. For example, it points to places in your code that are too complicated and should be refactored. I have seen files that were ranked minus 100. But I am an achiever, so it was very motivating for me to boost my score. And the quality of my code improved after I started using Pylint.
So that's it. During the development of ModeRNA, we had small goals, we had tests, and we had a lot of fun. So was ModeRNA successful? We collected some ideas at the beginning, so let's check. The project was published in nine articles and it has been cited in 28 articles. It is still maintained five years after the first release. And most importantly, all developers would like to work in the same way again. So what helped us to get there? Defining small achievable goals, testing our code, and optimizing our working process so that we had fun. I thank the ModeRNA heroes, Christian, Janusz and Tomek. Thank you for your attention, and I will be happy to take questions.
Any questions, please speak up.
I was wondering if you have heard of Greg Wilson and Software Carpentry, the organization that he runs. He's active as a scientist. I wonder if you've taken that course and whether it influenced this?
The question is whether I took the idea from Software Carpentry. No, actually no, but we were optimizing our process step by step. But yes, these are things that we did not invent entirely ourselves. This is taken mostly from agile techniques.
And for our task cards you can, for example, find more official names like user stories. In general, we are experimenting with agile development.
You said that the lessons you learned through software development can rather be a guide for a standard research project, a non-software project. Can you write that in a paper?
Okay, so the question is whether I could apply the methods we learned while developing ModeRNA in a more general way in science? Okay, yeah. Of course.
Do you feel that some of the things you learned from software development are things you are currently applying to other research projects?
Okay, so whether we could use these methods to produce not only software but also other results. Yes, I think agile methods could also be used in labs, in wet labs, and in other fields of science, not only in software development, definitely. Yes?
I'd be curious to know how long it took to introduce these things and get used to the techniques.
So the question is how long it took to get this process optimized. As I said, the first six months were with no real framework for working, and after that we kept improving step by step. It took us, I think, two years, and the last year of the project, the fourth year, was the most organized, and the project was moving the fastest then. Yes?
You mentioned an approach where you only work on at most three in-progress tasks. If you have to do something else, you have to complete one of them. But do you think that in a real case, in software, three in progress is realistic? Because sometimes you have hundreds of tasks that have no priority and you just need to put them on hold, maybe for later. I understand that this will help with focusing on...
I understand. So the question is whether having just three tasks in progress is really possible. I will say yes.
It's very difficult to be so strict with yourself and allow just three tasks, but it's possible. We tried it with software and it simply helped us to get faster. At first we also had a long list of tasks, but when we reduced the number of tasks, the development went faster. We also tried it at home, with the tasks we need to do to manage our household. We limit our paperwork and the other things we need to do at home with a pinboard, we also have a limit there, and it also helps. Yes?
The story sounds a bit like a fairy tale: you start from an unstructured process of just writing code, and then you end up with your structure. My experience is that people are often reluctant to change.
So the question is that it's difficult to get people from an unstructured way of working to a structured way of working, and how to motivate them and make it happen. My answer is to use small steps. As I said, we didn't do everything at one time. We were improving our process in small steps, and we were also giving each other very positive feedback. The thing you shouldn't do is work in such a structured way that it becomes demotivating. Whenever you try a small change and you see someone adjusting to it just a bit, telling them, "Wow, what an incredible result!" is very motivating, and it helps people to want to change.