Welcome to our presentation. I hope you all had a very nice morning, and that was a very interesting keynote. We're going to talk about large-scale batch processing with Argo Workflows and Argo Events. I'm Rakesh, and I work as a backend engineer in Intuit's data processing organization. With me is my co-presenter, Bala. He's the MVP: a core contributor and developer of Argo Workflows, so give him a good shout out, and feel free to bug him with lots of questions right after the presentation. In this presentation, we'll cover Intuit's background, an overview of the batch processing platform, the use case and why we built it, the architecture, and how it integrates with Argo Workflows and Argo Events. Following that, we'll talk about the scaling challenges, the scalability journey we went through, and the effort that went into getting it to production scale. We'll wrap up with a quick demo and a look at what's next, what's cooking in Argo Events and Argo Workflows. Before we get into it, a quick shout out to what Intuit does. As you're probably aware, it's a global technology platform that helps customers achieve financial confidence. We have over 100 million customers across the world and 14,000 employees working really hard to serve those customers. And we are a strong open source contributor, both in promoting open source projects and in building them and getting them out there. We'll start with a batch processing platform overview, beginning with some personas: the target audience we built the platform for. Before we deep dive, I wanted to briefly touch on the personas the processing platform helps. Data engineers who want to focus on transforming data rapidly so they can aid their customers. Data scientists who want to build ML models faster and make their applications more intelligent.
And platform engineers who want to build data engineering applications, so they can provide those applications to the other data personas, who can then build better data processing models. In this slide, we'll look at Intuit's data processing flow at a very high level. Intuit has several customer-facing products with millions of users, whether for small businesses, tax filing, or verifying your financial health. With the data we get from those products, there are a lot of use cases we serve: data storage, real-time recommendations to users, data enrichment, data curation, fraud detection. From the product lineup, before the data arrives in Intuit's data lake, real-time transformations are done using our stream processing platform. Once the data arrives in the data lake, batch platforms perform post-processing and analysis, from which we get reporting, model training, feature creation, and other data that can be fed back to the products in real time and near real time. With that in mind, let's talk about Intuit's batch processing requirements. From a platform standpoint, we wanted to build a company-wide solution for scheduling and orchestration. A lot of teams want data pipelines or applications running on a schedule, and data is usually interconnected, so we wanted to provide dependency management for data pipelines and applications. We also wanted to provide a standardized runtime compute layer for data and other processing use cases. Intuit's batch processing supports Spark and containerized application runtimes across the multiple Kubernetes clusters we manage. We provide a seamless experience for users to build data pipelines, build containerized applications, deploy them, and schedule them based on different dependencies. Intuit manages 60,000 pipelines, and 10,000 concurrent pipelines run at any moment.
The platform also caters to solving a lot of engineering problems. Its impact spans over 1,500 engineers, including the personas we just talked about as well as other engineering personas using this platform. Before we talk about how it integrates with Argo, I wanted to quickly cover the platform's architecture. This is a very high-level view, and from there we'll talk about how we integrate with Argo Workflows and Argo Events for the orchestration, which is what this presentation specifically focuses on. The platform has multiple services; I've highlighted a few here. We have the pipeline management service, which helps with repository creation, build environment deployments, artifact registry integration, and observability. What we expect is that any data or engineering persona can get their environment up and running out of the box. We then provide the capability, through a job orchestration service built on the Argo Workflows and Argo Events layer, to schedule jobs and create dependencies. We have a namespace service to orchestrate runs on the compute layer across the multiple Kubernetes clusters we manage, and notification services for alerting through the various mediums we support. On the right is a representation of what we call a pipeline. A pipeline is an abstraction we use for scheduling on the compute layer. Basically, a pipeline is a stack of processors, with data being ingested and data being output. Each processor is a code artifact that does transformations on the incoming data, and the pipeline manages the processing of this entire stack and its dependencies. We'll get more clarity on these abstractions in the coming slides.
In this slide, I put a quick representation of how the DAG on the left translates to Argo and how we manage the dependencies. In the previous slide, we saw Intuit's processing flow with both the real-time stream transformations and the batch transformations. In the batch transformation, you can see a DAG that is translated into pipeline dependencies through workflows in Argo. The root nodes here are, in this example, CronWorkflows running on a schedule. Once a workflow fires, there are downstream dependencies. Downstream dependencies can be between data or between application layers; it depends on your use case, but usually we support resource-based upstream dependencies between multiple pipelines. These are independent pipelines managed by different organizations and different teams at Intuit, with downstreams depending on their data. Maybe data enrichment is done in step one; then we want to process and write the data to Hive in the next step, and do post-processing, reporting, and model analysis after that. We support two types of pipeline dependencies: time-based dependencies, which we support using CronWorkflows, and trigger-condition-based dependencies, which we support using an Argo Events EventSource and Sensor. In this example, the upstream and downstream workflows are each, as we discussed earlier, a pipeline with multiple steps. The dependency management is done using a Kafka EventSource we have built and a Sensor, which takes care of the downstream triggers. In this example it's one-to-one, but we support one-to-many; that's also supported natively in Argo Events. This shows that by combining Argo Workflows and Argo Events, you can build a complete orchestration and scheduling infrastructure for all your data pipelines.
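To make the EventSource/Sensor wiring concrete, here is a minimal sketch of what such a setup could look like as Argo Events manifests. All names, the broker address, the topic, and the filter path are illustrative assumptions, not our production configuration:

```yaml
# Hypothetical sketch: a Kafka EventSource that captures upstream
# workflow-completion events, and a Sensor that submits the
# downstream pipeline's Workflow when the condition is met.
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: pipeline-events            # illustrative name
spec:
  kafka:
    pipeline-complete:
      url: kafka-broker:9092       # assumed broker address
      topic: workflow-status       # assumed topic carrying workflow state changes
      partition: "1"
      jsonBody: true               # parse the event payload as JSON
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: downstream-trigger
spec:
  dependencies:
    - name: upstream-done
      eventSourceName: pipeline-events
      eventName: pipeline-complete
      filters:
        data:                      # fire only for the relevant upstream pipeline
          - path: body.workflowName   # assumed payload field
            type: string
            value:
              - upstream-pipeline
  triggers:
    - template:
        name: run-downstream
        argoWorkflow:
          operation: submit
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: downstream-pipeline-
              spec:
                workflowTemplateRef:
                  name: downstream-pipeline   # hypothetical WorkflowTemplate
```

One-to-many fan-out follows the same shape: multiple Sensors (or multiple triggers on one Sensor) can depend on the same event.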
Next, I wanted to talk about the scalability journey we went through in scaling both Argo Events and Argo Workflows. We'll take a quick look at what we have built internally at Intuit on top of what Argo offers, and then talk about the next steps. I've laid out the different iterations we went through; this slide specifically targets how we scaled Argo Events, covering both the stability and the scalability aspects of this infrastructure. From the previous slide, you can see that the dependencies between two workflows were managed through an EventSource and a Sensor backed by an internal NATS streaming event bus. Here's the initial issue we ran into with this architecture. Argo Events supports different types of event sources, one of which is workflow events. The way we were orchestrating it, the EventSource would listen for the upstream workflow completion and emit an event to the NATS stream, which the Sensors would then use to do the downstream triggers. But what happens during a critical failure? What if an entire node goes down, or an EventSource goes out completely? In those scenarios, we faced fault-tolerance issues where an upstream workflow completes but the event is not captured because the EventSource went down. So early in the journey, we migrated all of our EventSources to Kafka. This is a natively supported feature in Argo Events, and it meant that even during critical failures, when the EventSource pods are unavailable, the events are still persisted in an intermediate queue. That was the first step we took to stabilize the system. But there was still one big scalability issue. For each pipeline's dependency management, we deploy workflows, which take up pods in the Kubernetes clusters, and we deploy EventSources and Sensors, which take up more pods.
We were just adding more pods for every pipeline's dependency management, so we looked into solving the problem of creating an independent EventSource per workflow. The good news is that we were able to collaborate with the Argo Events team, and we came up with the solution of migrating to the JetStream event bus and a shared EventSource. This is supported in the most recent versions of Argo Events, starting from 1.7.1, where you can utilize a single Kafka EventSource for all the upstream workflows. With this, we reduced the resource utilization of the EventSources by 99%. We also migrated to the new JetStream bus, which lets all the Sensors rely on a single shared EventSource and filter on the data coming in from the workflows. That is our current iteration. In our target iteration, we are working on moving the Sensors to a shared-sensor model. The target state is that for each Argo Events namespace, which is the orchestration layer, we will have a single compute stack for the EventSource and the Sensor, managing the pipeline dependencies for the tens of thousands of pipelines we run every single day, with the downstream triggers handled by that shared compute layer. There is an internal effort to migrate the sensors and decouple the manifests from the compute so we can scale to that structure, and we'll be posting more about it in upcoming Argo release notes. Now I'll hand off to Bala to talk about scaling Argo Workflows. Thank you, Rakesh. Are you able to hear me? Based on the batch processing platform requirements, we need to run 10,000 concurrent workflows. Each pipeline has about three pods, which means we need to support about 30,000 pods. To support this massive workload, we cannot do it with a single cluster, so we needed to move to a multi-cluster architecture.
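For reference, the shared JetStream bus Rakesh just described is declared as an Argo Events EventBus resource. A minimal sketch, where the NATS version string and replica count are assumptions:

```yaml
# JetStream-backed EventBus shared by EventSources and Sensors in a namespace.
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default          # Sensors/EventSources in the namespace use this bus
spec:
  jetstream:
    version: "2.8.1"     # assumed NATS JetStream version
    replicas: 3          # assumed; HA stream replication
```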
When we were thinking about the multi-cluster architecture, the main requirement we got was that the distribution needed to be dynamic and able to scale in the future. Based on that, we started designing our multi-cluster orchestration service, which distributes and load-balances workflows across the multiple clusters, and we leverage Argo Events to support the pipeline-to-pipeline dependencies. In each of the clusters, you can see a workflow controller, a workflow API server, and Argo Events. Every API server is registered with the orchestration service, so the orchestration service knows which clusters are registered and can schedule pods into each cluster dynamically. In each cluster, when a workflow completes, it notifies its status to Argo Events, and based on the dependency configuration, Argo Events notifies the orchestration service to schedule the downstream workflow. With this architecture, we are able to achieve one-to-many, many-to-one, and many-to-many pipeline dependencies. Through this journey, we also open-sourced some scaling features in Argo Workflows. One feature is HA for the workflow controller. In heavily loaded environments, scheduling a pod can take time. In the BPP environment, most of the workflows are CronWorkflows, and if the workflow controller takes a bit too long, cron schedules are missed. So we enabled HA using the Kubernetes leader-election APIs: at any given time there is only one leader, and the rest of the pods are followers. Whenever there is a problem with the leader, a follower immediately takes over and continues the work.
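A minimal sketch of what enabling controller HA can look like, assuming a standard argo-workflows install in the `argo` namespace (the image tag is illustrative). Running more than one replica makes the controllers elect a single leader; only the leader reconciles workflows, and the followers are hot standbys:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workflow-controller
  namespace: argo
spec:
  replicas: 2                 # >1 replica: one leader, one standby
  selector:
    matchLabels:
      app: workflow-controller
  template:
    metadata:
      labels:
        app: workflow-controller
    spec:
      containers:
        - name: workflow-controller
          image: quay.io/argoproj/workflow-controller:v3.4.4   # assumed version
          env:
            - name: LEADER_ELECTION_IDENTITY   # identity used for the election
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
```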
The next feature we enabled is PDB support for workflow pods. Whenever you have frequently auto-scaling clusters, scale-down triggers node drains, which evict workflow pods, causing workflow step failures and forcing you to rerun the entire step. To avoid this, we enabled the pod disruption budget as a first-class citizen for workflows: every workflow creates a PDB for its pods, and once the workflow completes, the PDB object is deleted. This is how we prevent pod eviction from node drains. The third feature we implemented is rate limiting of concurrent pods, because every cluster has some limit on concurrent pods. We enabled this rate limiting at multiple levels. Users can configure it at the cluster level, so if they want to restrict the whole cluster to 10,000 pods, they can. They can also configure it at each namespace level for multi-tenancy: some namespaces get a higher limit, and some get a stricter one. Then there is another special use case: integrating your workflow with an external system, like a database or some other API server, which needs rate limiting of its own. For that, we implemented semaphores and mutexes to control concurrency at the container level. This is how we achieved batch processing scalability, and we used all of these features in our platform to meet the batch processing platform requirement of supporting 10,000 concurrent workflows in our clusters. Let me hand back to Rakesh to demo the platform. Thank you. In the next slide we'll quickly look at the visual side of what we were talking about; I think that will paint a better picture. We talked about the two abstractions the platform provides to engineering.
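The three features just described (per-workflow PDBs, multi-level concurrency limits, and semaphores for external systems) map to Argo Workflows configuration along these lines. The numbers, the `rate-limits` ConfigMap, and all names are illustrative assumptions:

```yaml
# Per-workflow PDB and a template-level semaphore.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: rate-limited-
spec:
  entrypoint: main
  # A PodDisruptionBudget is created for this workflow's pods and deleted
  # when it finishes, so node drains cannot evict running steps.
  podDisruptionBudget:
    minAvailable: 9999        # effectively blocks voluntary eviction
  templates:
    - name: main
      # Limit concurrency against an external system via a semaphore
      # whose size is read from a ConfigMap.
      synchronization:
        semaphore:
          configMapKeyRef:
            name: rate-limits     # hypothetical ConfigMap
            key: db-connections
      container:
        image: alpine:3.18
        command: [sh, -c, "echo processing"]
---
# Cluster- and namespace-level pod/workflow caps live in the
# controller's ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo
data:
  parallelism: "10000"          # max concurrent workflows cluster-wide
  namespaceParallelism: "500"   # per-namespace cap (assumed value)
```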
We have an internal workflow in the developer portal where users can build a data processor. A data processor, as I mentioned earlier, is an Intuit artifact: code that can be run on the compute layer of your choice. These are reusable artifacts that can be scheduled in multiple different pipelines; they're shareable processors. For example, in this slide you can see a processor being built with a Spark runtime; the user has chosen Python as the language, and right out of the box the pipeline management layer sets up the repositories for the user to get started, the CI/CD, and the artifact registry integration. Platform users can focus primarily on writing code and not the other aspects. In the next step, once a user registers a processor, they can create a data pipeline, which is the compute and orchestration layer, where they specify the artifact. They can stack multiple processor artifacts and set the pipeline up, and they can specify the scheduling. In the scheduling, as we discussed earlier, using Argo Events we are able to support both resource-based and time-based scheduling. In this example, the pipeline has to run every day at 3 a.m., on completion of this upstream data pipeline, and on completion of this resource. When those conditions are met, the pipeline is scheduled automatically. This is how we have decoupled dependency management between organization-wide data pipelines. Once the pipeline is created, we provide dashboards for managing your clusters, observability, Splunk dashboards, cost management, and, in this example, the Spark history server, because it's a Spark pipeline for data processing. We also provide the execution history. All of this is a layer built on top of the Kubernetes clusters we manage for compute and the clusters we manage for the Argo orchestration.
In the next slide, I just wanted to give a quick view of this capability and what it enables engineering platforms to do. This is an example of a cluster created for one specific customer within Intuit using the data platform. You can see multiple DAGs being constructed, with many complicated, interconnected pipelines running through the dependency management and the compute orchestration we provide. We'll quickly wrap up the presentation by talking about what's next. We are currently working on the next-gen Argo Events sensor, as we mentioned in a previous slide. This is the next scalability journey, currently ongoing, where we are decoupling the sensor compute and the manifest management so a single sensor can be reused for managing all your dependencies. We are also working on next-gen multi-cluster workflows, to support cluster load balancing through the workflow controller, which Bala will say a few words about. Hello. This is one of the top demands in open source: supporting multi-cluster workflows. We are still evaluating how we can natively support multi-cluster load balancing in the Argo workflow controller. I think in yesterday's maintainer session I mentioned that we are targeting this as a 3.6 feature, but we are still looking for the community to share their use cases so we can better understand the common requirements for multi-cluster workflows. Do you have any questions? For the orchestration service, how does it distribute different workflows to different clusters? Are you storing the kubeconfigs within the orchestration service, or is it event-based?
Basically, all of the Argo API servers are registered with the orchestration service, so it's not going through kubeconfigs; it's going through the Argo server endpoints. It keeps in memory how many workflows are running on each cluster, and based on that it load-balances according to the configuration. So does it dynamically determine which cluster to schedule the workflow on based on the number of workflows in each cluster, or can it get more dynamic? We are still working on that. Currently the orchestration service has all the registered clusters, and based on those registrations it schedules across them in a round-robin fashion. It also keeps track of how many workflows are running on each, because there is a feedback loop through Argo Workflows and Argo Events that tells it a workflow is done and the downstream workflow can start. Okay, but I think the question is: how do you determine which workflow should run in which cluster? It's done in a round-robin fashion. Currently we don't have any affinity saying a specific workflow needs to run on a specific cluster; any workflow can run on any of the clusters. When it comes to the Kubernetes clusters we manage, one of the key challenges with running Argo workflows at scale is pod management. We run all our Kubernetes clusters in AWS accounts, and there are IP limitations on how many pods can come up with low latency on a specific cluster. The need for load balancing across multiple clusters comes from that. Currently BPP's orchestration service performs this load balancing through the feedback loop with Argo Events and Kafka, but we are looking to support this capability natively in the workflow controller so it can be managed more cleanly. Hey, a question about cluster size and the number of clusters you're running.
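The round-robin distribution with a completion feedback loop described above can be sketched in a few lines. This is a hypothetical illustration: the class and method names are invented for the sketch, not Intuit's actual orchestration service:

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    running: int = 0       # workflows currently running (updated by feedback)
    capacity: int = 3000   # max concurrent workflows per cluster

class OrchestrationService:
    """Round-robin scheduler over registered clusters, with a feedback
    loop that tracks how many workflows each cluster is running."""

    def __init__(self, clusters):
        self.clusters = clusters
        self._next = 0

    def schedule(self):
        # Walk round-robin, skipping clusters that are at capacity.
        for _ in range(len(self.clusters)):
            cluster = self.clusters[self._next]
            self._next = (self._next + 1) % len(self.clusters)
            if cluster.running < cluster.capacity:
                cluster.running += 1
                return cluster.name
        raise RuntimeError("all clusters at capacity")

    def on_workflow_complete(self, name):
        # Feedback (in our case via Argo Events): free a slot on that cluster.
        for cluster in self.clusters:
            if cluster.name == name:
                cluster.running -= 1
                return
```

The per-cluster `capacity` cap models the AWS IP limits mentioned above: a full cluster is simply skipped in the rotation.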
How did you tune or determine the maximum cluster size, and how many clusters are you running these workflows on? Currently, per cluster, we target around 10,000 active workflows. Our pod usage is around two to three pods per workflow, so that's around 30,000 IPs, and the concurrent workflows we support per cluster is around 3,000. So to support 10,000 concurrent workflows, we run three Kubernetes clusters dedicated to the orchestration. We have separate compute clusters for running our Spark processing jobs and the container applications. And we are running out of time; I have a couple of questions from the virtual audience. In your experience, for large batch jobs, is it preferable to have many small workflows or a small number of very large workflows? A small number of workflows means a high amount of resource utilization per workflow. If you are able to optimize the pod usage per workflow, you can run as many workflows as you want, but we usually optimize toward larger workflows so that resources are shared more optimally. Okay, another question quickly: does your UI allow you to define relationships between workflows? If so, how does it know that the output of one is compatible with the input of another? The compatibility in our case is specifically tied to data, because we are talking about data pipelines. The workflow integration is done through the events infrastructure: Argo Events tracks the workflow state changes, and using those state-change events emitted to Kafka, we do downstream invocations of other data pipelines. On completion of a workflow, which in our case is a data pipeline, we assume the availability of its data. The downstreams relying on that data get triggered immediately based on the conditions, but we also support complex conditioning using Argo Events, because Argo Events supports complex dependency logic.
So if one pipeline depends on multiple pipelines, we can configure that on the Argo Events side. Okay, we need to wrap up. There are a couple more questions on here; speakers, would you be able to go into the platform later and answer those questions? Sure. Okay, great. We will be at the Intuit booth, so come meet us and shoot us your questions; we'll talk more in depth about the architecture and any other help you need. Thank you, everyone.