Hello everybody. I'm Nicolas Crocfer, and I'm here to talk about Celery, a task engine used to orchestrate and combine tasks. We will also talk about Director, a tool we made at OVH to easily create these workflows. Why do we need Celery? We will see that together. To get everyone on the same page, we will first see what Celery is, and I will give a quick, very basic demonstration. Then we will see that our team had some custom needs: we needed to execute background tasks, but the features provided by vanilla Celery were not enough for us. So we created a tool, Director, which was open sourced this week, so you can try it right now. What is Celery? The official description in the Celery documentation says that Celery is an asynchronous task queue based on distributed message passing, and that it supports real-time processing but also periodic scheduling. What does that mean? The important words here are "task queue". A task queue is really simple: it's just a mechanism used to execute tasks in other processes, threads, or machines. How does it work? When we talk about a task queue, we are in fact talking about producers and consumers. In the middle of the diagram is the queue; in Celery the queue is called a broker, and the most common brokers are RabbitMQ and Redis. The idea is to pass messages from the producers on the left of the screen to the consumers on the right: a producer does not want to execute the task itself, it wants the task to be executed by another system, a consumer. So in summary, Celery is a mechanism, a Python library, used to execute tasks, which are just Python code, somewhere else. And why do that? Just a few use cases: for example, to avoid blocking the user. If you are working on a web service and a user makes a request, you don't want to block them; you want to execute the long-running task somewhere else and return a quick response right away.
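The producer/consumer mechanism described above can be sketched in plain Python. This is an illustration only: in Celery, the in-process queue is replaced by a broker such as Redis or RabbitMQ, and the thread by one or more separate worker processes.

```python
# A conceptual task queue: a producer enqueues work, a consumer executes it.
import queue
import threading

task_queue = queue.Queue()
results = []

def worker():
    # Consumer: pull (function, args) messages off the queue and run them.
    while True:
        func, args = task_queue.get()
        if func is None:              # sentinel: stop the worker
            break
        results.append(func(*args))
        task_queue.task_done()

consumer = threading.Thread(target=worker)
consumer.start()

# Producer: enqueue the work instead of executing it ourselves.
task_queue.put((pow, (2, 10)))
task_queue.put((None, ()))            # tell the worker to shut down
consumer.join()

print(results)  # [1024]
```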
Another use case: your producer may not have enough resources to execute the task, because it is CPU-bound and heavy, but your workers, running on bigger machines, can handle it. We could also talk about network access; there are lots of use cases for Celery tasks. Now I will show you how to create tasks and workflows using Celery. This is what we will use in this demonstration. Just remember there are two parts, producers and consumers. To produce a message we use the .delay() method; there are other methods to send a message into the queue, but here we will use .delay(). On the other side, the celery command provides several subcommands, and one of them is the worker subcommand. So we will produce messages with the first and consume them with the second. The first part of the demo is just creating simple tasks; then we will see how to combine these tasks with Celery primitives, specifically the chain and group primitives. Now the demo part, and I hope it will work. Remember, I need a worker running, so I already started a Redis instance. You could use RabbitMQ, or even the filesystem as a broker with no running instance at all, but I prefer Redis. So let's start. I already installed my requirements, I'm in my virtual environment, and I have a Jupyter notebook open. I already created a task file named tasks.py containing all the Python code to execute somewhere else. How does it work? The first thing to do is to import the Celery class, the Celery application.
If you've already used a web framework like Flask, you know you need to create a Flask application and use it somewhere in your code. It's the same thing with Celery: I created my application, gave it a name, and gave it the connection to the broker. I also gave it the connection to the result backend. And here I have two simple functions. They are deliberately simple; the point here is not to show complicated code, just how to send a task somewhere else. The first function returns a random number, and the second takes the results as parameters and returns their sum. The notebook will be the producer, and I launch the consumer from the command line, pointing it at my task file. As you can see, Celery started and discovered my tasks, because we transformed these Python functions into Celery tasks using the decorator. The first thing to do in the producer is, of course, to import the functions. And note that transforming a Python function into a Celery task does not prevent us from executing it as a normal Python function: I can call it directly, and a random number is returned right here in the notebook, without anything being sent in the background. Now I can execute it using the .delay() method. With .delay(), I don't execute the task in the producer; instead I send a message into the broker, and since I have a worker running, the worker will execute it. What I get in return is an AsyncResult object. This Celery object tells me the task may or may not be finished yet, and it lets me query the state of the task: is it finished, is it pending, and so on.
As you can see, my worker really executed the task; it was not my producer, it was my worker. And as I said, we can use AsyncResult methods like .get() to retrieve the actual result, and it was the right number. That was just a simple task showing how to send a task and execute it somewhere else. Now we can use Celery primitives to combine tasks and create workflows. One simple workflow is the chain, which executes tasks in order, one after the other. Let's see that. I have to import the chain canvas. Here I call the random task, and then another task. As you can see, I'm using the .si() method, not .delay(). Why? Because here I want to create a signature, a description of a task to send to the broker, without executing anything yet. I have my two tasks declared, but nothing has happened. Now I can call .delay() on the chain to actually apply it, and as you can see, my two tasks have been executed. Next, we will use the group canvas. This canvas allows me to launch tasks in parallel: here I want to execute two tasks in parallel and then aggregate their results. How? Using the .s() method. A .s() signature receives the result of the previous task, whereas .si() (immutable signature) ignores it. So here we see that the two random tasks run first, in parallel, and then we take their sum. Again we get an AsyncResult, the two tasks have been executed in parallel on two workers, the sum of the numbers has been computed by the get_sum function, and I can retrieve the result of the whole canvas. This is how to use Celery. As you can see, it's really simple, but I think it's powerful precisely because of this simple API.
This file just contains the code of the demonstration, so we can move on: here we created the chain and retrieved its result; here we created the group, executed it, and got the result. So as you can see, Celery is powerful, but it had some problems for our team. It's really difficult to see the dependencies between the tasks of a workflow. We wanted a tool that allows us to track the evolution of these tasks: maybe one task failed, but which one? And which other tasks could not be executed because of that failure? That was complicated. We also wanted to execute workflows using API calls, not just from a notebook like this one. We wanted to define workflows in YAML format: the tasks themselves are still created in Python, but in a separate YAML file we combine those tasks into workflows. We wanted to periodically execute whole workflows; Celery by default lets you periodically execute individual tasks, but not a whole workflow. And this one is still on our to-do list, not yet provided in the current version on GitHub: we wanted to retry a failed workflow from a specific task. For example, if eleven tasks succeeded and the twelfth failed, because our tasks are expensive, we want to store the results of the successful tasks and relaunch the workflow from the failed task once we have fixed the problem. So, how do we use Director? Let's see it in the demo. The installation is just pip: pip install celery-director. Once you've done that, you have a new command named director. The first thing to do is to create a workspace. Just remember, this tool allows you to easily create a new Celery project; it's a kind of framework on top of Celery. So I run director init and create a workspace named workflows, for example. Then the first thing to do, as the output says, is to set the environment variable it mentions.
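The setup steps just described might look like the following. The workspace name is just an example, and the exact environment variable hint printed by `director init` may differ between versions.

```shell
# Install Director (adds the `director` command)
pip install celery-director

# Create a new workspace ("workflows" is an example name)
director init workflows

# Point Director at the workspace, as the init output instructs
export DIRECTOR_HOME="$(pwd)/workflows"
```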
I can now go into the workspace, and when you do, you will see there is a tasks folder containing an example. Here I just have to import the task decorator, create a Python function, and decorate it, giving it a name. A simple example is provided by default: an extract-transform-load workflow that just uses print statements. And there is a YAML file showing that if you want this kind of simple workflow, you can build it with this syntax. Director needs to store the results of the tasks in its own database to make the dependencies easier to display. So the first thing to do is to create the database. By default it uses SQLite, but of course we recommend a more powerful database like PostgreSQL or MySQL. Now I can list the workflows, and right now I have just this example one, and I can execute it. Remember, I don't have to open a Python terminal or a producer and import the source code; I just use this command. I give it a default payload, here an empty one; in the next version we will remove this requirement when it's not needed, but if you want to pass something like foo/bar you can. The command is director workflow run, sorry. The tasks have now been sent into the queue, in Redis, and I can open a worker to consume them. We didn't see much because the tasks just print, but they have been executed here. How do we display them? Because this output alone is still not very useful, we can open the web UI. We are in a small screen format, so it's not really beautiful here, but we have our workflow and we can display the tasks that were executed. If I have some time, and yes I do, I can show you a failed execution. The idea is, for example, to create an errors file: from director I import the task decorator and create my task, giving it args and kwargs, the default signature of our tasks, and I make it raise an error, a division by zero. I transform this function into a task and give it a name. I think it's okay.
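The default example shipped by `director init` pairs a Python task file (functions decorated with Director's `@task(name=...)`) with a YAML declaration combining them. A sketch of the YAML side, with illustrative project and task names:

```yaml
# workflows.yml -- combine decorated Python tasks into a workflow.
# Each task name matches a @task(name="...") decorator in the tasks folder.
demo.ETL:
  tasks:
    - EXTRACT
    - TRANSFORM
    - LOAD
```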
And I can now open the YAML file. Here I just copy-paste, because I have my notebook to help me. So I created a new workflow named WILL_FAIL containing several tasks. The first one is the extract task, and then a group containing several tasks: a transform task, which will succeed, and our error task, which will fail because of the division by zero. And the final task will of course not be executed. We will see all of that in the web UI. Now I execute it; the name is WILL_FAIL. And I start the worker. Of course, in practice the idea is to have several terminals, several shells or several containers, each running one of these processes. And now in the web server we can see the workflow displayed with an error, and we can see exactly which task failed. I don't have time to launch Flower and show you the traceback, but if you are also using Flower, a well-known tool in the Celery community, you can check the complete traceback there. Today at OVH we are using this tool to manage our workflows and our tasks, and it makes things much easier because we can use web services to call our tasks. I showed you here how to do it using the CLI, but it's important to note that we can also execute a workflow using an API call, a POST request. We also want to provide a way to execute a workflow directly in the web UI. And this is the result; and here, if you are using Flower, you can display the complete traceback, which is useful for investigation. And that's it for me. The tool was open sourced this week, it's really fresh, so you can try it and give us some feedback if you want. The setup is really easy, as you can see. So thank you for your attention, and now we have some questions.
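A workflow like the failing one described above might be declared like this in Director's YAML. The workflow and task names are illustrative, and the group syntax is a sketch of the form used by the project.

```yaml
# workflows.yml -- a workflow whose group contains a failing task;
# the final task is never executed once the group fails.
demo.WILL_FAIL:
  tasks:
    - EXTRACT
    - GROUPED:
        type: group
        tasks:
          - TRANSFORM   # succeeds
          - ERROR       # raises ZeroDivisionError in its Python body
    - LOAD              # skipped because of the failure above
```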