Hi, everyone. Good morning. I'm going to talk to you about the Walmart cloud-native platform and how we have extended Kubernetes to build a control plane and deploy applications out to thousands of clusters.

When any of us think of Walmart, we think of the world's largest retailer. We think of thousands of stores, some of them truly massive. We also think of new-age commerce: online and omnichannel commerce, and next-generation in-store retail. A majority of these applications are built and run on the Walmart cloud-native platform, which is what I'm going to talk to you about today. First, we'll go over an overview of the platform to get an idea of what it is and what it does, and then dive into how we've extended Kubernetes and how we manage application orchestration across the platform.

Walmart began its cloud-native journey with Kubernetes clusters being adopted at Walmart stores. This is what we call our edge locations. After that, our cluster footprint gradually grew and moved into the cloud as well, meaning Walmart's private data centers as well as the public clouds Walmart uses. That is when we actually started designing and building our control plane, which we did over multiple phases. Today, our platform is the go-to platform for new applications being built at Walmart.

The platform itself, at a very high level, is a GitOps-based platform. It allows application developers to focus on their application code, and from then on the platform manages pipelines for testing, integration, and deployment of applications to production. It's a batteries-included platform that integrates with the entire Walmart developer ecosystem and provides a secure environment for running our applications in production. In short, developers focus on their application code and the platform takes care of everything from that point on.

So what does operating this platform at Walmart scale mean? Walmart scale is pretty big. We have thousands of stores, with clusters deployed out to these stores, and this forms one digital geography, which we call edge. The second digital geography for us is cloud, where we have clusters deployed in multiple regions and multiple countries. This gives us a very diverse environment in which we need to run and operate our controllers and our control plane. Over the years we initially grew a lot in edge, then we started growing in cloud, and we continue to grow a lot in both.

Now, extending Kubernetes. Generally, when we think of extending Kubernetes, we think of custom resource definitions, and that is the route we have taken as well. Combined with our custom controllers, this allows us to implement our entire platform's business logic. The custom resources are used to store the platform state, and they fall into three major categories: the first is inventory, the second is networking, and the third is namespaces and applications. Namespaces give us an abstraction to manage multi-tenancy in the platform, and applications provide a mechanism to deploy applications out to multiple clusters.
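To make that resource model a little more concrete, here is a minimal sketch of what custom resources in the inventory and namespace categories might look like as Go API types. The group, kind names, and fields (Cluster, TenantNamespace, and so on) are assumptions made for this sketch, not the platform's actual CRDs.

```go
// Illustrative Go API types for the kinds of custom resources described above.
// Names and fields are assumptions for this sketch, not the platform's actual CRDs.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Cluster is an inventory record for a single workload cluster (edge or cloud).
type Cluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec ClusterSpec `json:"spec"`
}

// ClusterSpec captures where the cluster lives; labels on the object metadata
// (store number, country, hardware class, ...) are what fleet selectors could
// match against.
type ClusterSpec struct {
	// Geography distinguishes the two digital geographies, e.g. "edge" or "cloud".
	Geography string `json:"geography"`
	Region    string `json:"region,omitempty"`
}

// TenantNamespace is the multi-tenancy abstraction: a logical namespace that
// the control plane materializes on every cluster a tenant's applications target.
type TenantNamespace struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec TenantNamespaceSpec `json:"spec"`
}

// TenantNamespaceSpec identifies the owning team and any per-namespace settings.
type TenantNamespaceSpec struct {
	Tenant string            `json:"tenant"`
	Quotas map[string]string `json:"quotas,omitempty"`
}
```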
Here's a high-level overview of the control-plane architecture itself. You have your groups of custom resources storing the platform state, and controllers that reconcile them. Some of these controllers interact with resources owned by other controllers. For example, the inventory controller generates monitoring resources to monitor your inventory, and it generates networking resources to manage DNS records, so you can load-balance requests coming into a cluster or have an A record to reach an application deployed to a cluster. On the right side are our workload clusters, and we have a controller that runs in each of those workload clusters as well. On the left side is our control plane cluster. There is one control plane cluster, but several thousand workload clusters.

Now, application orchestration. When we peek under the hood, there are three main facets of application orchestration.

The first facet is the hub-and-spoke design, which is what I just spoke about: you have a central controller, which is your hub, and an endpoint controller, which is your spoke. The way we have implemented this is with a multi-cluster application specification. This is generated by the GitOps pipeline: a developer writes code and commits it to Git, a pipeline is executed, and it generates a multi-cluster application specification. That specification is then reconciled by the central controller, which generates a cluster application specification representing a single deployment to a cluster. When I say deployment here, I don't mean a deployment in the Kubernetes sense, I mean a Helm release; we treat Helm as the atomic unit for managing releases out to our workload clusters. The endpoint controller synchronizes the cluster application specification and then triggers a job, which actually manages the deployment and all the hooks necessary before and after a deployment.

The second facet is fleet orchestration. A multi-cluster application specification provides a Helm chart, a set of values, and a set of targets that the application needs to be deployed to. We allow developers to provide a static list of targets, which is pretty limited and generally used in our lower environments. In production, we allow developers to target a fleet of clusters. For about 80% of use cases, we have platform-defined fleets, but for about 20% we need to let application teams provide their own cluster fleet specs. A cluster fleet specification can be defined in two ways: either a static list of clusters or a set of label selectors. We also have a controller that, based on a cluster fleet spec, generates the list of clusters, because inventory is always changing: clusters are added to and removed from inventory, and we need to keep our fleets up to date.
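As a rough illustration of that fleet resolution step, here is a minimal, dependency-free Go sketch of how a fleet spec, whether a static cluster list or a set of label selectors, could be resolved against current inventory. The type names and the selector semantics (labels ANDed within a selector, selectors ORed together) are assumptions made for the sketch.

```go
// Sketch of resolving a cluster fleet spec into a concrete target list.
package main

import "fmt"

// Cluster is a simplified inventory entry.
type Cluster struct {
	Name   string
	Labels map[string]string
}

// FleetSpec is either a static cluster list or a set of label selectors.
type FleetSpec struct {
	Clusters  []string            // static list, mostly used in lower environments
	Selectors []map[string]string // each selector ANDs its label pairs; selectors OR together
}

// Resolve recomputes the fleet's membership from current inventory. A real
// controller would rerun this whenever a cluster is added, removed, or relabeled.
func Resolve(spec FleetSpec, inventory []Cluster) []string {
	if len(spec.Clusters) > 0 {
		return spec.Clusters
	}
	var targets []string
	for _, c := range inventory {
		for _, sel := range spec.Selectors {
			if matches(c.Labels, sel) {
				targets = append(targets, c.Name)
				break
			}
		}
	}
	return targets
}

// matches reports whether every key/value pair in selector is present in labels.
func matches(labels, selector map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	inventory := []Cluster{
		{Name: "store-0042", Labels: map[string]string{"geo": "edge", "country": "us"}},
		{Name: "uswest-prod-1", Labels: map[string]string{"geo": "cloud", "region": "us-west"}},
	}
	fleet := FleetSpec{Selectors: []map[string]string{{"geo": "edge", "country": "us"}}}
	fmt.Println(Resolve(fleet, inventory)) // [store-0042]
}
```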
The third facet of our application orchestration is how we manage to scale this whole thing up. Again, on the left side you have the cluster application specifications, which are reconciled, so there is a specification for each individual cluster, and these are synchronized to the endpoints. But our endpoint clusters are of two types: one runs in an edge location, where CPU, memory, and network are very restricted and we need to conserve those resources; the other runs in cloud, where all of these resources are available in plenty and we can easily synchronize a large number of resources instantly.

To scale up and handle the thousands of edge locations we have, we also maintain an index for every cluster, which is updated whenever any single application deployed to that cluster changes. This single index, normally less than a KB in size, is what is synchronized to an endpoint, so it's pretty light on the network. The second aspect of scaling up is that our deployments are actually executed by Kubernetes jobs. These jobs are triggered by the endpoint controller, and the jobs themselves are where most of the resources get used during a deployment. This lets us keep our endpoint controller pretty lightweight on CPU, memory, and network, which has really helped us in scaling out to edge, where these resources are very restricted. Generally, deployments in edge also happen at night when stores are closed, because at that time we have a few more resources available to deploy new versions of an application.

So why did we do all this? The control plane design and its architecture began in 2019. At that time there was a community called SIG Multicluster with a lot of projects under it, and we still went and designed and implemented a custom solution. The main reason was that our edge environment was very different from what most of these projects were targeting; we needed to focus on WCNP edge, and that really required a custom solution. The second reason was that we had a unique set of challenges in the sense that our control plane came after the platform itself: we built the control plane after we already had a large chunk of the platform running in production. The last reason was that one of the design goals in our control plane architecture was to offload as much as we could to the Kubernetes control plane itself. We did not want to reinvent the wheel by creating, say, a multi-cluster deployment, a multi-cluster stateful set, or a multi-cluster daemon set. We wanted to rely on Helm and on the Kubernetes controllers to manage deployments, replica sets, and stateful sets, because Kubernetes does that really well, and we wanted to leverage that. But we still wanted to be able to represent our namespaces and applications in an abstract form in a control plane, and the easiest solution for that was to build something of our own, because the community projects out there did not fit exactly what we were looking for.

Now for some of the lessons we learned along the way and some of the challenges we faced. One was building performant controllers, especially when talking to external networks and making external API calls. Performance of a controller is important in two key respects: the first is ensuring that your reconcile loop finishes really quickly, and the second is ensuring that, say, if your controller restarts, you do not bombard and DDoS an external service. We did face challenges around that, and we have been able to fix all of them. Another challenge was scaling the Kubernetes API server itself, because as the platform scaled, there were a lot of requests coming into the API server. We fixed, or worked around, that by caching almost all our platform state in memory; most consumers of platform state use APIs and get the data they need from a cache instead of directly from the Kubernetes cluster.
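As a minimal sketch of that caching pattern, assuming a simple keyed store fed by the controllers' watch events, something like the following keeps reads off the API server. The StateCache type and its method names are hypothetical, invented for illustration.

```go
// Sketch of an in-memory platform-state cache: controllers keep it current from
// their watch events, and platform APIs read from it instead of the API server.
package platformcache

import "sync"

// StateCache holds a snapshot of platform objects keyed by "namespace/name".
type StateCache struct {
	mu      sync.RWMutex
	objects map[string]any
}

// New returns an empty cache.
func New() *StateCache {
	return &StateCache{objects: make(map[string]any)}
}

// Upsert is called from the controller's watch/informer add and update handlers.
func (c *StateCache) Upsert(key string, obj any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.objects[key] = obj
}

// Delete is called from the controller's delete handlers.
func (c *StateCache) Delete(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.objects, key)
}

// Get serves reads for platform APIs; consumers never query the API server
// directly for this state.
func (c *StateCache) Get(key string) (any, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	obj, ok := c.objects[key]
	return obj, ok
}
```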
One other challenge we faced was local development and testing. We solved this using kind, and as kind became more stable, it really solved the problem for us. kind is about running Kubernetes in Docker, which allowed us to run everything locally on our laptops, and that really sped up development for us. The other problems we faced were around synchronization of specifications and status from the control plane hub to an endpoint spoke. And the last was cross-cluster finalization. This is not completely solved for us, but we do not see any operational issues from it on a day-to-day basis, and it's something we are still looking to solve. If anyone has any ideas, please email me; we would love to hear them.

Some of the things that worked well for us: one was extending Kubernetes. There was a lot of familiarity in the team with Kubernetes, and extending it was really helpful. The second major thing that worked really well was using Helm as the atomic unit and not using any of the Kubernetes core resources like deployments, stateful sets, or daemon sets directly.

Some of the things that didn't work well for us: one was scaling etcd. We have a pretty large etcd cluster at this point. etcd, by default, comes with a 2 GB limit; as of now we've increased that limit to 8 GB, and things are very sluggish when reading from and writing to etcd. This is a problem we are looking to solve going forward. The last one I want to tell you about is having monolithic CRD specs. This is something you need to be very careful about when designing a specification initially, because an interface, once in place, can lead to a codebase that becomes unmanageable down the line.

Some key takeaways for us. The first was to partition early: if we had partitioned early, we would not have run into the problems with etcd that we have now. Another, which is related, is building for failure and outages and implementing a robust and resilient feedback loop. Our system is distributed across edge locations where the network is patchy; there could be a storm, there could be an outage of one or two days. We realized that as we scaled up and hit more and more natural disasters or other issues at times, we saw operational issues arise from feedback loops that were not resilient enough for this environment. And sometimes we've had outages of the control plane itself, which has also been an issue for us in the past.

The last thing I want to leave you with is that Kubernetes is a really great platform for building other platforms. It provides a lot of mechanisms to extend it, and it's a great place to start because your team is already familiar with Kubernetes. But it's not always the end goal. The end goal is to build a platform, and sometimes APIs and databases and the ways of doing things we've always had work better. We hope to be able to open source things in the future. We have shared with the community in the past at a previous KubeCon, we are sharing with you all again today, and we hope to keep doing that. We would love to hear from any of you, so if you want to reach out to me, please do. And thank you for your time.