Hello, everyone. Thank you for joining me. My name is Sever, and this presentation is about the library I'm working on, which makes it easy to model and run distributed workflows. There will be a Q&A section at the end, hopefully, if there is enough time. You can also stop me during the talk and ask questions if anything is unclear. OK, so let's get started. We'll start by discussing a bit what a workflow is, then I'll show you a quick demo and spend the next part of the presentation trying to explain what happened during the demo. The term workflow is used in many different contexts, but for our purpose a distributed workflow is some kind of complex process composed of a mix of independent and interdependent units of work called tasks. Usually workflows are modeled with DAGs, which stands for directed acyclic graphs (think dependency graphs between the tasks), described in some domain-specific language. Or they are modeled with ad hoc code: you have a job queue, but what you really try to accomplish is an entire workflow, so the tasks in the job queue do some work but also schedule the next steps that should happen in the workflow. Neither of those provides a good solution. DAGs are too rigid: you usually cannot have anything dynamic happening there. And the ad hoc job-queue approach tends to create code that is hard to maintain, because the entire workflow logic is spread across all the tasks that are part of the workflow. Another problem with the ad hoc approach is that it's usually very hard to synchronize tasks with each other: if you want a task to start only after other tasks have finished, that's pretty hard to do. Flowy takes a different approach to the workflow modeling problem: it uses single-threaded Python code and something I call gradual concurrency inference.
Here's a toy example of a video processing workflow. At the top we have some input data, in our case two URLs, for a video and a subtitle. Then there is an entire workflow that will process this data: it will try to overlay the subtitle on the video and encode the video in some target formats. It will also try to find some chapters, some cut points in the video, and extract thumbnails from there. And it will try to analyze the subtitle and target some ads for this video. The interesting thing here, and it's something you cannot easily do with DAGs, is the part where the thumbnails are extracted. This is a dynamic step: the number of thumbnail extraction tasks can differ based on the video. So this is where you need some flexibility. Next, I would like to show you how this workflow is implemented in Flowy, and then, like I said earlier, I'll try to explain what really happened there. I'll start with the activities, or rather the tasks. In this case I'm using some dummy tasks. You can see all of them have a sleep timer in there, just to simulate that they are doing something. They are regular Python functions; there's nothing special about them. They just get some input data, do some processing, and output a result. This is similar to what you would get in Celery or a regular job queue. This is the workflow code, the code that implements the workflow we saw earlier. Again, it's regular Python code, and we are just calling the tasks. But there is something funny about it: it is a closure, and we are not importing the task functions themselves. This is a kind of dependency injection, and there is a reason for it, which we'll see later. Other than that, there are just function calls and regular Python code. Actually, I'm going to demonstrate that this is nothing special by running this code.
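The dummy tasks and the workflow closure described above could be sketched roughly like this. All names and the exact shape are assumptions for illustration, since the slides are not reproduced in the transcript; this is not Flowy's actual demo code:

```python
import time

# Dummy tasks: plain Python functions that sleep to simulate work.
def embed_subtitle(video_url, subtitle_url):
    time.sleep(0.01)
    return f"embedded({video_url},{subtitle_url})"

def encode_video(video, fmt):
    time.sleep(0.01)
    return f"{video}.{fmt}"

def find_chapters(video):
    time.sleep(0.01)
    return [1.0, 2.5, 4.0]  # cut points, in seconds

def extract_thumbnail(video, at):
    time.sleep(0.01)
    return f"thumb@{at}"

def target_ads(subtitle_url):
    time.sleep(0.01)
    return ["ad1", "ad2"]

# The workflow is a closure: the tasks are injected rather than
# imported, so a runner can later substitute proxies for them.
def video_workflow(embed_subtitle, encode_video, find_chapters,
                   extract_thumbnail, target_ads):
    def run(video_url, subtitle_url):
        video = embed_subtitle(video_url, subtitle_url)
        encodings = [encode_video(video, fmt) for fmt in ("mp4", "webm")]
        chapters = find_chapters(video)
        # The dynamic step: one thumbnail task per discovered chapter.
        thumbs = [extract_thumbnail(video, at) for at in chapters]
        ads = target_ads(subtitle_url)
        return encodings, thumbs, ads
    return run
```

Called with the plain functions, the closure just runs everything sequentially, which is what the first part of the demo does.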
So what I do here: I import all the tasks and the workflow function, pass the tasks to the workflow closure, and then call the closure with the input data. This will run the workflow code sequentially. I'm also going to time this execution. It will take a while because of the timers in there forcing the tasks to sleep. And hopefully... yeah, that's what happens. Sorry about that. I'll try again. I don't know what's going on, but whatever. Yeah, something's wrong. Usually it should work; it's just regular Python code, so there is no reason for it not to work. Running that code would take about 10 seconds because of all the timers, and everything would happen in sequence. The interesting part is being able to run this as a workflow and have all that concurrency happening. So I'll try to do that. OK, so it went much faster, about two seconds. The reason is that all the tasks that could be executed in parallel were executed at the same time, as we can see in the diagram that was generated. The arrows there represent dependencies between the tasks, and we can see that many of them were executed at the same time. So I'm going to try to explain how that works and why it went so fast versus the previous version, which didn't work. In order to understand what was happening during the demo, I have to talk about workflow engines first. We begin with a simple task queue, where we have all the tasks that we want executed. The workers are pulling the tasks from the queue and running them. As I said, with an approach like this, there must be some additional code in each task that knows to schedule other tasks when it finishes. So the tasks also generate other tasks besides the usual data processing they are doing. And this is not very good, because the workflow logic gets spread around.
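The speedup the demo shows comes from overlapping independent tasks. Here is a minimal, Flowy-free illustration of that effect using the standard library `concurrent.futures` module (which the talk later mentions is used for the local backend's workers); the task and durations are made up:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(name, duration):
    """A stand-in task that just sleeps, like the demo's dummy tasks."""
    time.sleep(duration)
    return name

names = ("a", "b", "c", "d")

# Sequential: total time is roughly the sum of the durations.
start = time.monotonic()
seq = [task(n, 0.05) for n in names]
seq_time = time.monotonic() - start

# Concurrent: independent tasks overlap, so the total is roughly
# the duration of the longest single task.
start = time.monotonic()
with ThreadPoolExecutor() as pool:
    par = list(pool.map(lambda n: task(n, 0.05), names))
par_time = time.monotonic() - start
```

The results are identical either way; only the wall-clock time differs, which is the same contrast the demo's 10-second versus 2-second runs showed.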
And like I said, it's also very hard to synchronize between different tasks. Another idea would be to have the tasks generate a special type of task called a decision. Instead of doing data processing, a decision only schedules other tasks in the queue, so it acts as a kind of orchestrator. As we can see here, the arrow from the storage to the worker is reversed, because the decision, in order to orchestrate, will read from the data store to get a snapshot of the workflow history and the workflow state. Based on that state and all the tasks that have finished, it will try to come up with the tasks that must be executed next. But this solution is also not very good, because you can get concurrency problems. If two tasks finish one right after the other, two decisions get scheduled, and if those are executed in parallel by two workers, they will generate duplicate tasks in the queue. So this is not a perfect solution. To improve it further, we need the queues managed in such a way that all the decisions for a particular workflow execution happen in sequence, and for this we introduce another layer that will ensure it. Another thing we want to add is some kind of time-tracking system that knows how much time a worker has spent running a task and can declare the task timed out if a certain amount of time passes without the worker making any progress. This is not something new. This kind of workflow engine is implemented and provided by the Amazon SWF service. It's also available as an open-source alternative in the Eucalyptus project, with the same API that Amazon has. There is also a Redis-based engine similar to this in the works that I know of. And there's also the local backend that you saw earlier in the demo.
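The decision-based engine described above can be sketched as a toy, single-process loop: a decision inspects the history, schedules whatever became runnable, and decisions run strictly in sequence. This is only an illustration of the architecture; the dependency table and names are made up, and neither Flowy nor SWF works literally like this:

```python
from collections import deque

history = {}          # finished tasks: name -> result
task_queue = deque()  # tasks waiting for a worker

# A hypothetical three-step workflow: embed -> encode -> thumbs.
DEPS = {"embed": [], "encode": ["embed"], "thumbs": ["encode"]}

def decide():
    """The decision: schedule tasks whose dependencies are all done."""
    for name, needed in DEPS.items():
        if (name not in history and name not in task_queue
                and all(d in history for d in needed)):
            task_queue.append(name)

def worker_step():
    """A worker pulls one task, runs it, records the result, and a
    new decision follows; decisions stay sequential per workflow."""
    name = task_queue.popleft()
    history[name] = f"done-{name}"
    decide()

decide()              # the initial decision schedules the root tasks
while task_queue:
    worker_step()
```

In a real engine the queue, history store, deciders, and workers live on different machines, which is exactly where the duplicate-decision problem the talk mentions comes from.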
The local backend will create this entire engine and the workers on a single machine and run them only for the duration of the workflow; then everything gets destroyed. This was the workflow code in the demo. Hopefully by this point you have an understanding that this code will run multiple times: every time a decision needs to be made for this workflow to make progress, this code is executed again. So if I were to put a print statement there and run the workflow, I would see a lot of print messages. OK. I mentioned earlier dependency injection and why it's needed. The reason is that Flowy will inject some proxies instead of the real task functions. The proxies are callables and act just as a task would, but they are a bit special. When a proxy is called, the call itself is non-blocking, so it returns very fast. The return value of the proxy is a task result, and a task result can have three different types. It can be a placeholder, in case we don't have a value for that task yet, maybe because the task is currently running. It can be a success, if the task completed successfully and we do have a value for it. Or it can be an error if, for some reason, the task failed. The other thing a proxy call does: it looks at its arguments and tries to find other task results among them. If any of the arguments is a placeholder, then the current activity or task cannot be scheduled yet, because it has dependencies that are not yet satisfied. So it tracks the results of previous proxy calls through the entire workflow, as we can see here. In this case, when the code runs for the first time in a workflow, the embed-subtitle task will be scheduled and its result will be a placeholder, because we don't have a value for it yet.
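The proxy behavior described here can be modeled in a few lines. This is a toy sketch of the idea only (non-blocking calls, three result states, scheduling only when no argument is a placeholder); the class and function names are made up and are not Flowy's real API:

```python
# The three task-result states described in the talk.
class Placeholder:
    """No value yet: the task is scheduled or still running."""

class Success:
    def __init__(self, value):
        self.value = value

class Error:
    def __init__(self, reason):
        self.reason = reason

def make_proxy(name, results, scheduled):
    """Build a non-blocking callable standing in for a task function.

    `results` maps task names to finished values; `scheduled` collects
    the tasks this replay of the workflow code decides to schedule.
    """
    def proxy(*args):
        # A placeholder among the arguments means an unsatisfied
        # dependency: the task cannot be scheduled yet.
        if any(isinstance(a, Placeholder) for a in args):
            return Placeholder()
        if name in results:
            return Success(results[name])
        scheduled.append(name)   # dependencies satisfied: schedule it
        return Placeholder()
    return proxy
```

On the first replay only the root tasks get scheduled; on later replays, as results fill in, more calls become schedulable:

```python
results, scheduled = {}, []
embed = make_proxy("embed", results, scheduled)
encode = make_proxy("encode", results, scheduled)
v = embed("video.mp4")   # scheduled, returns a Placeholder
e = encode(v)            # blocked: argument is a Placeholder
```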
But the calls for the video encoding won't schedule any activities, because they have a placeholder among their arguments, meaning there are unsatisfied dependencies. In this case, the results of the proxy calls for the encode-video task will also be placeholders. What this does is actually build the DAG dynamically at runtime, by tracing all the results from proxy calls through the arguments of other proxy calls. Finally, the workflow finishes its execution when the result, the return value, contains no placeholders, meaning all the activities, all the tasks needed to compose the final result, have finished. And as you can see here, this is true even for data structures. We have here a tuple with the values inside it, and this continues to work. The thumbnails there are in a list, and those will also get picked up. So you can use any kind of data structure for the return data, as long as it can be JSON serialized; JSON is what's used for serialization. There are a couple of important things to keep in mind when writing a workflow. Basically, what you want is for all the decision executions to follow the same execution path through your code for the same workflow instance, that is, for all the decisions that belong to the same workflow instance. This usually means that you have to use pure functions in your workflow. If you want some kind of side effects, either send those values through the workflow's input data or have dedicated activities, dedicated tasks, for them. The other thing you can do with a task result is to use it as a plain Python value. As we see here, I'm squaring two numbers and then adding them together. When this happens, if any of the values involved is a placeholder, meaning there is no result for it yet, a special exception is raised that interrupts the execution of this function.
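The finished-when-no-placeholders rule can be sketched as a recursive scan over a JSON-like return value. Again, this is an illustration of the idea, not Flowy's implementation:

```python
class Placeholder:
    """Stand-in for a task result that is not available yet."""

def contains_placeholder(value):
    """Recursively check a JSON-like return value for pending results,
    the way the engine decides whether the workflow has finished."""
    if isinstance(value, Placeholder):
        return True
    if isinstance(value, (list, tuple)):
        return any(contains_placeholder(v) for v in value)
    if isinstance(value, dict):
        return any(contains_placeholder(v) for v in value.values())
    return False
```

A tuple holding a list of thumbnails with one pending entry still counts as unfinished; once every nested value is concrete, the workflow can complete and the result can be JSON-serialized.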
So in effect this acts as a barrier in your workflow, and it won't be passed until you have the values for the results involved. This also means that if you have code after this point that could be concurrent, it won't be detected. So you have to make sure that you access the values as late as possible to get the greatest concurrency. A similar thing happens in the original code of the example, where we iterate over the chapters found in the video. Here too this acts as a barrier, but being at the bottom, it didn't affect the rest of the code, so you may not have noticed it. Another example is when you have a situation like this one. Here I'm squaring two numbers, and then I may want to do some optional additional computation. It's not clear in what order the if conditions should be written, because if the squaring of b is the first one to finish, the workflow will still have to wait until the result for a is available before it can progress, since the first conditional is on the value of a. No matter how I write the code, there will always be a case where the workflow cannot make progress until the other value is available. And this is kind of a problem, but it can be solved with something called a sub-workflow. Here I refactored the code that processes each number separately into a sub-workflow, and then in the main workflow I'm using the sub-workflows as I would use a regular task. This way they can all happen in parallel, and when both are finished, I can sum them and return the result. Sub-workflows are a great way to do more complex things that you couldn't do without them. Another thing to notice here: in the main workflow I didn't have to do anything special to use the sub-workflows; they are used just like regular tasks. As for error handling, you might expect it to look something like this.
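The sub-workflow refactoring described above looks roughly like this in plain Python. All names are made up; in Flowy the two sub-workflow calls would be proxied so each branch could run in parallel, whereas here they run sequentially just to show the structure:

```python
def square(n):
    """A stand-in task; in Flowy this would be a proxied activity."""
    return n * n

def process_number(n):
    """Sub-workflow: square n, then an optional extra step. Accessing
    the squared value is a barrier, but now it only blocks this
    number's own branch, not the other number's."""
    v = square(n)
    if v > 10:       # the conditional that forced the ordering problem
        v += 1
    return v

def main_workflow(a, b):
    # Sub-workflows are called exactly like regular tasks; summing
    # their results is the only synchronization point left.
    return process_number(a) + process_number(b)
```

Because each conditional now lives inside its own sub-workflow, neither branch has to wait for the other number's result before making progress.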
This is how normal Python code would look if you had exceptions in a function. But this is not possible, because, as I said earlier, the proxy call is non-blocking, so you cannot get the exception at that point. This is actually the place where you have to write your try/except clause. The reason is that only at this point can we force the evaluation of the result, and only at this point do we know for sure whether the computation was successful or not. This looks a bit strange, and I don't like it too much. There is a better way of doing it, using the wait function that comes with Flowy. What it does is try to dereference the task result, similar to performing an operation on it. The name is a reminder that this acts as a barrier: nothing will pass this point until the value is available, and anything below it won't even be detected as parallelizable. But maybe you don't want to use the value in the workflow itself; you just want to pass the value from one task to another. In that case, how do you pick up errors? What would happen here if the result for b is an error? When you pass an error in the arguments of another proxy call, that proxy call will also return an error. So errors propagate from one task to the next. And if the result value that you try to return from the workflow contains errors, the workflow itself will fail. So you cannot dodge errors; you have to deal with them. Or you can ignore them by not making them part of the final result, in which case you will get a warning message that some errors were not picked up or handled by your code. The workflows can also scale by using some of the other backends I mentioned earlier, the Amazon one or Eucalyptus. And when you want to scale, basically nothing changes in the workflow.
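The error-propagation semantics can be modeled with a toy proxy call and wait function. The names echo the talk, but this is a sketch of the behavior, not Flowy's API:

```python
class Error:
    """An error task result, carrying the failure reason."""
    def __init__(self, reason):
        self.reason = reason

class TaskError(Exception):
    """Raised only when an error result is dereferenced."""

def proxy_call(fn, *args):
    """Non-blocking in spirit: never raises; errors among the
    arguments flow straight through to the result."""
    errors = [a for a in args if isinstance(a, Error)]
    if errors:
        return errors[0]
    try:
        return fn(*args)
    except Exception as exc:
        return Error(str(exc))

def wait(result):
    """Dereference a result. This is the barrier: the try/except goes
    around this call, not around the proxy call itself."""
    if isinstance(result, Error):
        raise TaskError(result.reason)
    return result
```

A failed task's error rides through any number of downstream calls untouched, and only surfaces as an exception when something finally dereferences the value.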
So you would still use the code that you saw earlier. There is some additional configuration that you have to do, but it happens outside of the code, so it is not part of it. Because when you scale and run the workflow on multiple machines, in a distributed system, there can be all kinds of failures. There are some execution timers that you can set, and those will help you with fault tolerance. There is another type of error that you can get when you scale, which is a timeout error, a subclass of the task error that we saw earlier, so you can have special handling for timeouts. There are automatic retry mechanisms in place, including for the timeouts, and you can configure them as you wish. There is also the notion of heartbeats. Heartbeats are callables that a task can call. When a heartbeat is called, it sends a message to the backend telling it that the current task is still making progress. But it also returns a boolean value to the task, and that value can be used to find out whether the task has timed out, in which case you can abandon its execution, because even if it finished successfully, its result would be rejected by the backend. Another thing to keep in mind: you should aim to write tasks in such a way that they can run multiple times, because of the failures that can happen and the retries. The tasks, or the activities (I'm using the two terms to mean mostly the same thing), can be implemented in other languages, so you can use Flowy only for the orchestration and workflow modeling, that is, the engine and the logic that runs the activities. There are some restrictions on the size of the data that can be passed as input, and on the result size. When you are scaling and running on multiple machines, you would have workers that run continuously, not like the local backend, where they ran only for the duration of the workflow.
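The heartbeat pattern could look something like the following. Only the general behavior (a keep-alive call that also reports whether the task has timed out) comes from the talk; the names and the clock-based timeout are assumptions for illustration, not Flowy's API:

```python
def make_heartbeat(deadline, clock):
    """Build a heartbeat callable. A real one would send a keep-alive
    message to the backend; here a fake clock decides the timeout."""
    def heartbeat():
        # (real version: notify the backend the task is still alive)
        return clock() < deadline   # False once the task has timed out
    return heartbeat

def chunked_task(chunks, heartbeat):
    """A long-running task that heartbeats between units of work and
    abandons execution on timeout, since the backend would reject a
    late result anyway."""
    done = []
    for chunk in chunks:
        if not heartbeat():
            return None              # give up: our result is rejected
        done.append(chunk * 2)
    return done
```

Writing tasks this way also fits the idempotency advice above: a task abandoned mid-way will simply be retried from scratch.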
And those workers are single-threaded, single-process. If you want more of them on a single machine, you have to use your own process manager to start them and make sure they stay alive. A decision must use the workflow execution history and the workflow state to make decisions, and if the history gets too large, the data transferred because of the history grows very fast. You can reduce that by using sub-workflows: a sub-workflow appears as only a single entity in the parent's history, so by using sub-workflows in a smart way you can drastically reduce the data transfer. And because of the fault tolerance built in, you can scale down. For example, all the workers could die at some point in time and come back online after a while, and the workflow progress won't be lost. You may still lose the progress on specific tasks, but the workflow progress itself won't be lost. This is very useful for workflows that take a very long time to run; I think the maximum duration for Amazon is about one year for a workflow. So this can be very useful in some situations. And you can also scale up very easily: just start new machines, and they will connect to the queues and start pulling tasks that need to be executed. Thank you, that was all. If you have questions, I think now is a good time.

How does this compare to Celery? With Celery you can create tasks and it will run them. How would you compare the two?

Yeah, so Celery is a distributed task queue, or job queue. This is a bit different, because here you have the orchestration of the tasks. If you have many tasks and you want them to operate in a certain way, with dependencies between them and data passed between them, you can do that by writing single-threaded code.
And from that single-threaded code, the dependency graph will be inferred for you, and it will make sure that the tasks are scheduled in the correct order and get the data they need passed in. So I would use Celery for one-off jobs, sending an email or something, but not for hundreds of jobs that are somehow interdependent. Celery also has Canvas, which is more like a DAG, where you define your workflow topology up front, not in the dynamic way you can with single-threaded Python code, where you can have conditions and for loops and all that. Thank you.

What asynchronous library do you use at the bottom of Flowy? Sorry, what was the name? An asynchronous library: asyncio, or maybe gevent?

I don't think I'm using any asynchronous library. For the local backend, I'm using the futures module to implement the workers, but there is no asynchronous library involved. Thanks.

Yeah, in the example workflow you showed, one of the tasks returns a list, the list of chapter points, which then gets fed into something that builds thumbnails for the chapters. Does that task essentially block until every single chapter has been found? Or would it be possible, maybe with code changes, to support, say, a generator function, so that you could start building the thumbnails for the first chapter while the task is still finding the later chapters?

So here it will block. Any code under the thumbnails line won't be executed until we have the chapters, because find-chapters returns a list, which is a single result, and we cannot get partial results from a task. So we have to wait until the entire result is available; anything below that will be blocked until then. This usually isn't such a big problem, because there are ways to write the code so it doesn't become one. Or, if it is a problem, you can create a sub-workflow.
So I could have a sub-workflow that does only the find-chapters and the thumbnail generation, then call that sub-workflow from here and have it running in parallel with the other code.

Sorry, just to follow up, then: does that mean that in this example, the ad-targeting step, which you could start processing immediately, won't be executed immediately because you're waiting for the video encoding to finish?

No. In this example, all the tasks that can be executed in parallel will be executed in parallel. The actual execution topology will look exactly like this one; this is how it will get executed. That's why the workflow duration was about two seconds instead of 11 or something.

Time for a last question? It's kind of a repeat of the previous one. He made a good point, not about the thumbnail line but about the line above, where it's finding the chapters and returning the list. It won't return from find-chapters until it has found all three of the chapters. But if you could convert find-chapters to be a generator, or get it to return the next chapter, then you could do the thumbnail for the first chapter while find-chapters is still finding the second chapter.

So yeah, you could have a task that finds only the first chapter and returns that, then call the task again and it will resume from that point. You can actually pass in the last chapter and have it find the next one. This way you can solve the problem if you want to. It really depends on how you write your code. The only rule you have to remember is that when you try to access a value in the workflow, it will block until the value is available. That's basically the only thing you need to know. Anything below that point won't be detected and cannot be concurrent, and that can be solved through sub-workflows.

Sever, thank you very much for your talk. Thank you. Thank you.