Anyway, I want to be respectful toward both of these projects. My name is JP Zivalich, CTO and co-founder of Pipekit. We do Argo expert services and provide a control plane for Argo Workflows, so I'm going to be a little bit biased, but I want to give Airflow a fair shake.

What is Argo Workflows? Argo Workflows is an open source, Kubernetes-native tool for managing containerized workflows. The key feature is the really tight Kubernetes integration. It's YAML-based originally, but there's a great Python SDK available in the community called Hera that I would very much recommend you check out; I'll show a quick example in a minute. And why do people choose Argo Workflows? It's perfect for Kubernetes shops, and it's extremely generalizable: you can use it for CI, ML, a bunch of great stuff.

Great. Airflow. Quick show of hands: who here knows Airflow or has used Airflow before? Good amount of hands, okay. So it's an open source workflow automation tool, Python-based, and it comes out of Airbnb. It's not Kubernetes-native, not a cloud-native tool, but it is the de facto standard for workflow orchestration, especially for data workflows. People choose it because it's popular, well-known, and has historically worked very well.

So let's compare the architectures really quickly. Argo Workflows uses custom resource definitions (CRDs) to define and manage workflows, so if you're used to CRDs, Kubernetes Deployments, things like that, you'll be familiar with how to interact with it. There are two main components, the workflow controller and the Argo server, and then some optional components you can tack on: a SQL database for offloading and archiving workflows, S3 for artifacts and logs, and really any other Kubernetes primitive. If it's Kubernetes, it works well with Argo Workflows. And it has a very lightweight footprint.

Airflow has several main components: the scheduler; the executor, which is optionally part of the scheduler; a web server; a DAG folder where you store a bunch of Python DAGs; and a metadata database. There are some optional components you can tack on as well, and if you use the KubernetesExecutor, you can offload the workers from Airflow into pods on Kubernetes.

This is a side-by-side comparison of the architectures. You can see Argo Workflows has a pretty lightweight footprint: the Argo server, the workflow controller, and the Workflow CRD, which just spins up a bunch of dynamically generated pods. On the right is a diagram that I shamelessly stole from the official Airflow documentation, so don't at me if you think it's inaccurate. We can see that on the Kubernetes cluster itself we have the actual worker pods, but everything else lives outside the cluster. That's the main difference here.

Talking about the integration with Kubernetes: again, Argo Workflows gives you that seamless integration, where you can interact directly with kubectl (kube-control, kube-C-T-L, however you pronounce it), run something like kubectl get workflows -n <your-namespace>, and have access to all the primitives. With Airflow, it's not going to be quite as tight: you can query for the worker pods that get offloaded onto the cluster, but that's about it. Additionally, if you want to define your own worker pod, you have to do that in YAML, which was news to me given that the promise of Airflow is that it's purely Python-based. You can override it a little bit from Python, but it is challenging.
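To make that concrete, here's roughly what the override path looks like with the KubernetesExecutor, using Airflow's documented pod_override key in executor_config. Treat this as a sketch: the task name is made up for illustration, and the exact fields depend on your Airflow and kubernetes-client versions.

```python
# Sketch: overriding the KubernetesExecutor worker pod from Python.
# The container must be named "base": that's the name Airflow gives its
# worker container, and this spec is merged over the default pod template.
from kubernetes.client import models as k8s
from airflow.decorators import task

@task(
    executor_config={
        "pod_override": k8s.V1Pod(
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name="base",
                        resources=k8s.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "1Gi"},
                        ),
                    )
                ]
            )
        )
    }
)
def crunch_numbers():  # hypothetical task, for illustration only
    ...
```

So you can tweak resources or add sidecars from Python, but anything beyond merging fields onto the default pod template tends to push you back into a YAML pod_template_file, which is the part that surprised me.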
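And to put the earlier Hera recommendation on screen, so to speak: here's roughly what a hello-world workflow looks like with Hera's v5-style API. Again, just a sketch; check the Hera docs for the current syntax, and note that submitting with create() assumes you've already pointed Hera at your Argo server.

```python
# Sketch: a hello-world Argo Workflow authored in Python with Hera.
from hera.workflows import Steps, Workflow, script

@script()
def hello(name: str):
    print(f"Hello, {name}!")

with Workflow(generate_name="hello-", entrypoint="steps") as w:
    with Steps(name="steps"):
        hello(arguments={"name": "KubeCon"})

# to_yaml() renders the Workflow CRD manifest so you can kubectl apply it
# yourself; w.create() would submit it through the Argo server instead.
print(w.to_yaml())
```

Under the hood this is still just the Workflow CRD, so everything you can do with kubectl still applies.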
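For comparison, here's the same hello-world as an Airflow DAG using the TaskFlow API; also a sketch, written against Airflow 2.x.

```python
# Sketch: the equivalent hello-world as an Airflow DAG (TaskFlow API).
import pendulum
from airflow.decorators import dag, task

@dag(
    schedule=None,  # trigger manually; no schedule
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
)
def hello_world():
    @task
    def hello(name: str):
        print(f"Hello, {name}!")

    hello("KubeCon")

hello_world()
```

The authoring experience is similar; the difference is what runs it. The Hera version renders to a CRD that the workflow controller executes as dynamically created pods, while this file gets parsed by the Airflow scheduler and dispatched through an executor, which is exactly the architectural split from the diagram.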
Scalability: this is the one that people care about. We did some benchmarking studies and found that with Argo Workflows, the core limit is really the Kubernetes API rather than any of the internal components of Argo Workflows itself. So if you're thinking about scale, Kubernetes is scale, and Argo Workflows really is just Kubernetes. There you go. With Airflow, it really depends on how you configure it. The main bottlenecks you're going to see are the scheduler and the metadata database. If you've configured those properly, you can hit comparatively high levels of scale, but it's not going to give you quite the same throughput as Argo Workflows.

Now, ease of use. Historically Airflow was easier to use if you're a Python shop, because it is Python-based. That has changed as of late with the Hera SDK for Argo Workflows, which has picked up a lot of popularity within the community, and I recommend that everyone check it out. Monitoring and observability: I'm running long on time, so I'm going to skip this slide.

Cool, quick conclusion. Argo Workflows, some pros and cons. Pros: if you're looking for that tight Kubernetes integration, great. Cons: there isn't a great third-party connector library, which is something a lot of companies are really looking for, and Airflow does shine there. Airflow pros: Python-based, great connector library. Cons: lots of components to manage, and it's not Kubernetes-native.

And I am good on time. Cool. So, right at the buzzer: again, my name is JP Zivalich. Thank you so much for coming out to the talk. If you want to talk a little bit more about Airflow, Argo Workflows, and the trade-offs between the two, here's a link to put some time on my calendar for some Pipekit office hours. We'll be at booth D28, and we're happy to walk through things like performance benchmarking between the two. We don't have time for Q&A on this talk, unfortunately, but if you want, I'll hang out for a little bit after. Thank you all so much for coming out, and enjoy KubeCon.