Thank you very much. You know, I cannot think of any box that is not ticked by this award. First of all, it's an award, and it's always nice to receive an award. But it's not any award: it's from my peers, and from some of the peers I most like and respect. Being recognized by people is one thing; being recognized by your family is harder to achieve and so much more worthy, so it is really something I'm very happy about.

And I guess I'm talking for many of us here: six months ago, having an in-person conference seemed unthinkable. I will not say it's normality, I see all the masks and all the blue faces, but it's so close to normality that it's really something. I'm tremendously happy to be here.

But a little bit more about Nextflow. I'm assuming everybody knows everything about Nextflow, but of course that's not true. Many of you here may not even have heard about Nextflow, so I should really start by explaining what Nextflow is. Nextflow is a pipeline language. You can write your pipelines with Nextflow, your genomics pipelines typically, but it could be anything, and you can run them anywhere. Nextflow will organize your computation, whether that's on the cloud, on an HPC or on a laptop. Nextflow will take care of parallelization in a relatively straightforward way, which I'll explain a little bit. And, more importantly:
It supports containers and reproducibility, and therefore contributes to the FAIR concept, this idea of findability, accessibility, interoperability and reusability. Nextflow contributes to the R. Now, you've seen all of these things a lot. There are plenty of things claiming to do the same thing, so I'll try to explain a little bit why I think Nextflow is special. Obviously I think Nextflow is special, and obviously I'm biased, so you'll have to do your own research on this before trusting me.

But why is it special? Let me very briefly touch on Unix, which most of you are familiar with. In Unix we pipe all the time, and that's something computer scientists really don't like about biologists. Biologists love, you know, [unclear], and to be more precise, they love the concept of piping. So what does this mean? You have a process A that is producing bits of data, elements of identifiable data, and you have a process B that will be consuming this data, and the bits of data go through the pipe sign. The easiest example of this is when you cat a file into less: the unit of data is a line, it goes line by line, and less receives the data through the pipe and flushes it one line after another. That's a very powerful concept if you think about it. It can link... Whoops, that's where I tell my students to leave the room, please. I'll have to figure out a way to put this in silent mode. Yes, okay.

So a pipe is similar to something called reactive programming, which means that you have a program that waits and waits, it's just silent, and suddenly something arrives. It wakes up and consumes it, and then it waits again.
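The waiting-and-consuming behaviour just described can be sketched in a few lines of Python. This is only an illustration of the idea, not Nextflow's own implementation or API: the names `producer`, `consumer`, `run_pipe` and the `_DONE` sentinel are hypothetical, and a standard-library queue stands in for the pipe.

```python
import queue
import threading

# Sentinel marking the end of the stream (an assumption of this sketch).
_DONE = object()


def producer(channel, items):
    """Process A: emit items one by one, then signal the end of the stream."""
    for item in items:
        channel.put(item)
    channel.put(_DONE)


def consumer(channel, results):
    """Process B: wait until something arrives, consume it, then wait again."""
    while True:
        item = channel.get()  # blocks until an item is available
        if item is _DONE:
            break
        results.append(item.upper())  # stand-in for any real processing step


def run_pipe(items):
    """Connect A to B through a queue that plays the role of the pipe."""
    channel = queue.Queue()
    results = []
    a = threading.Thread(target=producer, args=(channel, items))
    b = threading.Thread(target=consumer, args=(channel, results))
    b.start()  # the consumer starts first and simply waits
    a.start()
    a.join()
    b.join()
    return results


if __name__ == "__main__":
    print(run_pipe(["line one", "line two", "line three"]))
```

The consumer is never told how many items will come or when; it only knows that items arrive one after another until the stream ends.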
The program doesn't need to know when things will arrive, and it does not need to know how many things will arrive. It simply needs to know that things will arrive, packet after packet. That's how it works, and you can imagine very, very complex networks connected just that way. That's exactly how Nextflow works.

For those of you familiar with the Unix makefile, it's exactly the opposite. In the makefile world, you define a graph of dependencies, and before doing anything you have to compute the entire graph; when you're done, you start the computation. So the difference is that Nextflow flows without any big-picture knowledge, while the makefile prepares. I insist on this distinction because there are really two tools in the same niche these days, Snakemake and Nextflow, and it's quite interesting to me to observe that they use exactly opposite principles. There are plenty of reasons why in some situations you want to do a makefile-like thing, and others where you want to do a Nextflow-style thing. In fact, we wanted to start with a makefile, like everybody. But if you have very tiny operations, and zillions of them, the makefile was a killer. The Snakemake people found nice tricks to go around this, but it's interesting to insist that these two approaches are based on absolutely opposite principles.

So this is what Nextflow looks like. This is probably our first-ever pipeline, and what I quite like about it is that all you have to do is take
things that were already working, bits and pieces of pipeline, and just wrap them, a bit like you would do with HTML. You say what kind of input they expect and what kind of output they will generate, and then all of these things get automatically linked to one another by the description of their output. The graph that I'm showing here gets automatically generated by Nextflow once you have defined what goes in and what goes out; then all of these things get connected automatically. And of course the computation is happening here. So this thing here will be waiting for... it's a little bit obscured, I may be missing something. This thing here will be generating a multiple alignment. And what is it? It's just a process waiting, and when sequences come in, in blue, it generates an alignment. If you deploy this on a large number of processors, it implicitly becomes parallelized. This is an implicit implementation of embarrassingly parallel problems, which are the problems we encounter most of the time in biology these days. Your typical situation is that you have a matrix to process where every row is independent from the others, and you can throw all of these things at it. That's what Nextflow does; that's really the core of Nextflow.

But there's more to it, and the reason it made it into Nature Biotech is actually here. You have all heard about the reproducibility crisis in research these days. We usually think of the wet lab, gels that are difficult to reproduce and all these things. But this also happens in computer science: computerized analyses are difficult to reproduce, and as it turned out, with Nextflow and containerization we address this problem. That's how we could make it into Nature Biotech. I've just posted a blog entry where I explain a little bit how the story went, and it's quite an interesting story. It's Evan Floden who figured this out, and Evan figured it out at the
time with Kallisto. And I'm glad that Lior never came after us for this, because you don't want to have a fight with Lior. But the beauty with Kallisto is that if you run it on Amazon Linux or on macOS, you're going to get roughly the same results on the differentially expressed genes, but with slight differences. These differences are very tiny. They will not change the interpretation of your results. But if six months later you have to rerun exactly the same pipeline to get exactly the same results, you're in trouble.

Why is that? We don't know. Think of a computer as a machine with moving parts, where every line of code in the operating system is a moving part. It's a huge machine; we've never built machines larger than this. You're talking about hundreds of millions of moving parts. It would be unexpected for two alternative operating systems to give exactly the same results, for very, very low-level reasons, and that's just what we catch here. And the solution is very simple: you just dockerize everything. I don't know if macOS or Amazon Linux is better, and I don't care. I know the differences are negligible. I simply want to have exactly the same result anywhere. I don't want to go into a hospital with my genome or my exam and be told "this is the driving gene", then go into another hospital with exactly the same data and the same software analysis and be told "this is your driving gene", with the two genes being different. You want these things to be harmonized, and that's what containerization does for you.

Nextflow also allows you to bundle amazingly complicated things with a container. This thing is not ours; it's what really got us excited. It's the Companion pipeline at the Sanger.
I think they have something like 40 packages that are all bundled together. I mean, it's just excruciatingly complicated, and they nailed it with Nextflow. We were very excited: about six months to one year after we put the code online, this thing appeared, and it was so nice. When you dockerize it, you can see here, this is a gene prediction pipeline, you get exactly the same number of genes predicted on any platform. That's what you want for reproducibility. It's very important.

Now, of course, giving bad news is only fun if you have bad news for everybody, right? And so Maria Chatzou in my group figured out that not only was this true for the Companion pipeline of the Sanger, it was also true for phylogenetic trees. These are tiny variations we have here. On most of your trees it will not even change the topology; it will negligibly change the branch lengths. But think of the next data sets, like the kinases for instance: when you have finished the Earth BioGenome, you will have about 1.5 million species, for nearly a billion kinases to align and to classify. The topology will change for sure with these tiny variations. So what can you do? Exactly what we have been proposing with Nextflow: integrating your computation with dockerized versions of your pipelines. Everything is quite smoothly integrated in Nextflow; you can automatically check out what you need from GitHub, and all these kinds of things.

Now, over my career I've had a few successful projects. But when you're successful, you always have to ask yourself:
Why is this successful? More often than not, I have to admit, I did not find the true reason for success. And that's very important, because if you know why you've been successful, you can keep pushing in the right direction. But if you don't find the true reason for your success, you may be pushing in the wrong direction, and this happens every once in a while. So I had to ask myself what made Nextflow so successful, because it's true: 500 cites. We did not expect it to be that successful, frankly.

So why is that? First of all, there is the urgency. It's easy to use, for very urgent problems we had in the lab. We didn't sit with a team of computer scientists, write the documentation, implement it in C++ and all these kinds of things. No, we were just in a hurry. I had just started the lab in Barcelona, and I had tons of pipelines that were important for my production. I had to get them into production right away, and the students were long gone. They were very good students who had better things to do than writing documentation, and therefore I had no idea how to run these pipelines. I took all the pipelines and said to Paolo, "Here are the pipelines, please do a miracle." And he did. This really was made by users for users, and if I had to say what I believe to be the main reason for the success of this project, that's it: a small-scale project by users for themselves. The alternatives were excruciatingly complicated; CWL was a killer. And the user community, the small people, immediately recognized themselves in us. They recognized that it's a lot like ours, and they developed, and I think they followed roughly the same trajectory for the same reasons.

There is another point that I want to stress because it's very important. As you can see here... no, you cannot. It's readability. And, you know, I'm sorry.
I'm French, and so I tend to think that I'm allowed to consider myself a philosopher for a few minutes every once in a while. So I'll do a little bit of French philosophy. Why is readability so important? For all of us in this room, and maybe unfortunately for some of us already, some medical pipeline will soon decide on our future. Within five to ten years maximum, you will get your exam done, or some part of your genome, and this will go into a pipeline, and the pipeline will spit out a number. If this number is below some threshold, well, the social security will deny you access to the magic treatment that heals everybody and has no side effects, like, for instance, the new hepatitis C drug or this kind of thing. If you're above the magic number, you will get the treatment.

Now, the sense of unfairness that we will unfortunately feel when this happens to us will be devastating. It's like a double punishment: you're sick, and you don't have access to what you think should be the best treatment. If you do not believe in the fairness of the pipeline that generates this output, you will not only be devastated, you will be angry against the system. There will be a lot of anger. Therefore, you must have people you trust. You will not be able to read the pipeline yourself, but you must have access to people whom you trust who are able to read it. The larger the community of people you trust who are able to read these pipelines, the more likely you, as a citizen, are to trust the system and to accept that this is not the system being unfair, it's life being unfair to you. And life being unfair to you is something we can accept and comprehend.
Okay. That's why I think we must make pipelines readable. We always talk about open source pipelines, but we have to go beyond that: they also have to be readable, and I believe that's one of the things Nextflow contributes to. And, you know, we are in the COVID era. We've all heard about fake news, and that's really a pain. You can quote me on this, although probably a lot of people have thought about it: if news were alleles, fake news would be the fittest by far. The fitness of fake news is amazing. And why do we get fake news? It's because we have black boxes, things that are not transparent enough, and the conspiracy theorists have a very easy time putting anything they want in these black boxes. The only way to fight against all of this is not the government telling you "this is true, this is not true". No, it's full transparency. This is our only defense against fake news, and I believe that readability of decision-making processes is just as important as anything else. In France we have this strange algorithm for assigning students to universities, and for some time the algorithm was not open source, and this created a lot of tension. So this has to be open source; all of these things have to be open source.

And I'm very happy to mention nf-core, which is a follow-up of Nextflow not driven by us. It's driven by SciLifeLab in Sweden, and especially Phil Ewels. These guys have generated an amazing collection of reference pipelines, very well annotated, very well documented, extremely readable and transparent by my own standards. And this is not only a collection of pipelines, it is also a standard for Nextflow, and it is having a huge influence on how these things are being used. We are big users of nf-core with something called BovReg, which is one of the H2020 projects for a better...
It's a kind of ENCODE for farm animals, and it is meant to help ensure better sustainability in farming, under the FAANG umbrella. nf-core is a big thing for us because it allows us to make a lot of things interoperable across all of these consortia.

Now, one of the things that made me the happiest with Nextflow is how successful my students have been. You run a lab, and the people going through your lab are meant to be successful when they leave it, academically or in business. In this case, four former members of my lab went on to incorporate two highly successful companies: Lifebit, by Maria Chatzou and Pablo Prieto, and Seqera Labs, by Paolo Di Tommaso and Evan Floden. These guys have been tremendously successful. They've raised money, and these two guys at Seqera Labs have just raised five million, well, five million euros, I think, and 5.5 million in dollars, and they secured two Chan Zuckerberg Initiative grants. That's how successful this project has been in such a small amount of time.

Now, like Bohr used to say, predictions are difficult, especially about the future. So I'm not going to tell you whether they will become a unicorn or not, but I secretly dream that they may become the first Catalan unicorn. It's certainly not something I could guarantee, but it's something that has suddenly become possible, and it's extremely exciting to think that such a success has come out of such a small-scale, hands-on project as Nextflow was in its early days.

And so what is next on our table?
What are we going to do? Actually, I have to thank the award. Even though it's not a huge amount of money, it's the best money you can get in academia because it's soft money. All the things that you have to justify and that are complicated to buy become easy to do with it, so this is really a very important lubricant.

I have to take a second to thank the CRG. I would never have secured any grant on Nextflow. I'm not an IT guy; my lab is not an IT place. We tried to get a little bit of money, but we never managed. The reason Nextflow exists is that we have a core budget. We have a good core budget at the CRG, and with this core budget I can do what I want. And if I fail, well, every five years I get evaluated, and the CRG has the possibility to tell me, "You're using the resources, but you're not doing things well enough. You have to leave the place for someone who would." But as long as you can achieve things, you get supported unconditionally. That's what a core budget is all about, that's why Nextflow is here, and that's why we've been successful in quite a few projects, actually.

The next thing is something called nf-benchmark, which is about the interoperability of benchmarks. If I go back in time, this was actually the original intention behind Nextflow: building blocks that are easy to connect with one another, so that you can measure a lot of things. Now, with deep learning, this is getting very exciting, and it is going to change a lot.

There is also another development which I'm very happy about: NAR Genomics and Bioinformatics. I started this journal about three years ago now, and some members of the SIB are on our board, like Philipp Bucher. We are going to start a section dedicated to pipelines. This will be application notes, but really centered on pipelines, and making sure the
pipelines are deposited the way they should be and documented the way they should be, so that they can be programmatically used and combined with one another. You know, all this IT plumbing that is so beautiful when it works well, the kind of thing you'd like never to have to know about because it works so well. Because that's what IT is really about: you want to forget it. It's like electricity. Do you know how electricity is produced? I have no idea. 150 years ago, well, 120 years ago, there was a chief electricity officer in every company. And you don't want to have a chief electricity officer in your institute, right? That's the last thing you want.

We will also deploy something called Ocean, which is a cool thing about the reproducibility of papers, with the paper coming along with all the code integrated.

So let me finish by thanking all the people who made this possible, and especially Evan Floden and Paolo Di Tommaso, who really were the leaders on this project. Jose has just joined us, and he's working on nf-benchmark. A lot of that stuff is available on nextflow.io, the code and everything, if you want to run it.

I'll mention here that the Swiss Institute of Bioinformatics is going to run a Nextflow workshop in mid-November, jointly with the CRG. I don't have the pointers, the page is not up yet, but get in touch with Diana Marek at the SIB and she will give you the details. This will be online, so it can be worldwide. There is a limit on attendance, because we provide support to the users and we have a limited number of people who can help with this.

Thank you for your attention, and thank you again to the SIB for awarding me this award.