Welcome to this talk on operationalizing Kubernetes sidecars in production at Salesforce. My name is Mayank and I'm a software architect at Salesforce, on a team that provides a managed Kubernetes and platform-as-a-service offering for Salesforce engineering teams.

So who is this talk for? This talk is for everyone who operates Kubernetes clusters, and especially for those brave souls who also operate mutating admission controllers. More about mutating admission controllers in a bit. We will dive into how mutating admission controllers became popular at Salesforce and are now one of the most important pieces powering foundational infrastructure services in our public cloud Hyperforce architecture. We'll also look at our open-source framework that makes it easy to add new sidecar use cases across teams. Mutating admission controllers are not without their own challenges, and we will look at what those challenges are, what considerations we need to keep in mind when operating so many admission controllers, and how our continuous monitoring framework makes it easier to operate these admission controllers in production. We'll also show some alternative ways of managing sidecars which should be considered, and lay out a recipe for how to choose between those and admission controllers. By the end, I hope you will appreciate what it takes to operate sidecars using admission controllers and will be able to take some of these learnings back home. So let's dive in.

So what is an admission controller, and what does mutating mean? An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request has been authenticated and authorized. Admission controllers may be validating, mutating, or both. Mutating controllers may modify the objects they admit. So why talk about mutating admission controllers in a talk on sidecars? Because mutating admission controllers are a very popular mechanism for injecting sidecars in Kubernetes, and that's what we are using for our sidecars.

So let's look at the journey of a typical API request to the Kubernetes API server. In the picture, you can see the internal layout of the API server. The request is handled by an HTTP handler, then passes through the authentication and authorization modules, which validate whether the request is authenticated and whether the user is allowed to perform that operation. It then passes through a set of mutating webhooks. The mutating webhooks are invoked in order, alphabetically by name, and each of them gets a chance to modify the API resource to meet a certain goal or use case. After all the mutations are performed, the request passes through an object schema validation module, which verifies that the object still adheres to the schema for the given version. If everything looks good, the request then passes through a set of validating admission controllers; these are again invoked in order, and each webhook performs some validation for certain scenarios. If all of the validations pass, the object is finally persisted in etcd, so that other components of Kubernetes can start acting on it.

Let's briefly look at when admission controllers were introduced. They are not a new feature and have been part of core Kubernetes since January 2015, when the basic admission controller framework was introduced along with the AlwaysAdmit plug-in. Since then, around 30 core admission plug-ins have been added to the kube-apiserver binary.
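To make that chain of mutating webhooks concrete, each one is registered with the API server through a MutatingWebhookConfiguration object. Here is a minimal sketch of what such a registration looks like; the webhook name, service name, and namespace are illustrative, not our actual configuration:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: sidecar-injector          # webhooks are invoked alphabetically by name
webhooks:
  - name: inject.sidecars.example.com
    clientConfig:
      service:
        name: sidecar-injector    # in-cluster Service fronting the webhook pods
        namespace: injector-system
        path: /mutate
      # caBundle: <base64-encoded CA cert used to verify the webhook's TLS cert>
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    failurePolicy: Fail           # reject the request if the webhook is unreachable
    sideEffects: None
    admissionReviewVersions: ["v1"]
    timeoutSeconds: 5
```

Note the failurePolicy field: with Fail, pod creation is rejected whenever the webhook is down, a trade-off that will come up again later in this talk.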
In 1.9, the Kubernetes authors added the concept of mutating admission webhooks and allowed service owners to write webhook services that can send arbitrary patches to modify API resources before they are persisted. Since then, mutating admission controllers have been used by cluster owners to enforce various behaviors, or to add extra functionality by default to all resources running in the cluster. One of those use cases was injecting sidecars.

Let's take a little detour. We'll use the word Hyperforce in the next few slides. So what is Hyperforce? Hyperforce is a complete re-architecture of Salesforce, designed to deliver an even more powerful and scalable platform to support the growth and success of Salesforce's global customer base. Hyperforce empowers Salesforce customers to securely deploy Salesforce applications and services from anywhere, while using the scale and agility of the major public cloud providers. To learn more about it, please go to the URL on this slide. The Salesforce Hyperforce architecture is all about public cloud and uses Kubernetes as the container orchestration layer. Using the mutating admission controller concept, services running on Hyperforce get Salesforce infrastructure integration through dynamically injected sidecars. Currently, there are more than 10 sidecars which serve various infrastructure integrations, such as decrypting secrets from Vault, refreshing PKI certificates, mTLS communication using Istio, service-to-service authorization using OPA and Istio, Docker container signature validation, log and metric shipping, and more. Every Kubernetes cluster managed by our team comes with these admission controllers by default, and they can be expected to be available to provide these services.

So what is the generic sidecar injector framework? When Kubernetes announced support for mutating admission webhooks in the 1.9 release, the sidecar pattern really became a first-class citizen of Kubernetes. Many of the infrastructure teams at Salesforce independently chose the Kubernetes mutating admission webhook model to dynamically inject sidecars into Kubernetes workloads. This worked well until we realized that each of the teams was writing the exact same code, writing the same set of unit tests and integration tests, producing the same Docker image, writing the same Helm chart, and debugging the same problems in Kubernetes clusters. At that point, we realized we should take a step back and see if we could derive a common pattern. We discovered that each team was using an annotation on newly created pods to trigger the injection of one or more sidecars. Each team had its own annotation namespace, sometimes more than one, and an annotation trigger: if the annotation was present on the pod, the pod was a target for injection. We looked around in the open-source world to see if someone had already solved this generically. We did find at least one open-source project, but it didn't fit our situation particularly well or meet all of our needs. At that point, we wrote a spec for what an ideal sidecar injector would look like, dropped some code into a new repo, and the generic sidecar injector was born.

So what is it? At a high level, the generic sidecar injector is a mutating admission controller that allows injection of additional containers, init containers, or volumes at the time of pod creation. How do you use it? The injector is driven by configuration that consists of two parts: sidecar configurations, which define what needs to be injected, and mutation configurations, which define what triggers those injections.
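To give you a feel for the two-part split, here is a sketch of the two configuration files. This is modeled on the framework's design as described in this talk; the field names and images are illustrative and may not match the open-source project's exact schema:

```yaml
# sidecarconfig.yaml: what can be injected (a pod-template-like fragment)
containers:
  - name: logging-sidecar
    image: example.com/logging-agent:1.4     # illustrative image
    volumeMounts:
      - name: logs
        mountPath: /var/log/app
  - name: monitoring-sidecar
    image: example.com/metrics-agent:2.1     # illustrative image
volumes:
  - name: logs
    emptyDir: {}
---
# mutationconfig.yaml: which annotation triggers which injections
mutationConfigs:
  - name: logging
    annotationNamespace: logging.example.com   # pods carrying logging.example.com/inject
    annotationTrigger: inject                  # get the logging sidecar and its volume
    containers: ["logging-sidecar"]
    volumes: ["logs"]
  - name: monitoring
    annotationNamespace: monitoring.example.com
    annotationTrigger: inject
    containers: ["monitoring-sidecar"]
```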
Separating out these configurations allows teams to specify multiple sidecars and multiple mutations, and to independently choose which mutation injects which sidecars. This loose coupling supports different team structures, such as one team supporting multiple sidecars, or each team supporting just one. You can see in the picture a sidecar configuration which specifies two sidecars, sidecar one and sidecar two, and a mutation configuration which specifies a mapping between the annotation that triggers the injection and what it will inject. In this particular case, the mutation configuration will inject sidecar one when it sees the logging annotation and sidecar two when it sees the monitoring annotation. The framework supports injection of containers, init containers, and volumes in response to annotations. It also supports multiple mutation configs, meaning you can specify many annotations and the injections they will trigger. This allows you to independently choose which mutation will trigger which injections from the sidecar config. It also supports configuration of the injected sidecar via annotations on the pod, as well as sidecar configurations written as Go templates, where certain parts of the sidecar being injected can come from the pod being created, in addition to the native way, which is using environment variables.

So what are the advantages of the generic sidecar injector? There is no need to write code for injecting a new sidecar, and it's easy to sometimes get this wrong. In one of our earlier KubeCon talks on war stories, I talked about how a small invalid patch from a mutating admission controller brought down a whole cluster; you can learn more about that from the talk. Seven teams within Salesforce are using the same code to solve multiple critical infrastructure sidecar needs like monitoring, logging, certificate rotation, image signing, etc. Inner sourcing avoids duplicate work, avoids reinventing the wheel, avoids repeating the same mistakes, and allows these teams to collaborate more closely on the code, configuration, design, and any other problems they discover.

Let's now switch gears and see what it actually takes to operationalize sidecars, especially when they are being dynamically injected using mutating admission controllers. At Salesforce, we built a continuous monitoring framework exclusively for mutating admission controllers. Let's dig in to see how it works. In the picture, at the top right corner, you can see a simplified Spinnaker pipeline, which is used across Hyperforce for all service deployments. It shows three stages, where the test and prod stages deploy some mutating admission controller, or a combination of them, depending on team structure and dependencies. The middle stage, called fit, is basically a change promotion stage which runs some integration tests to decide whether the webhook deployment looks good enough to promote to the prod environment. The fit stage is part of the framework and can be used by any webhook to gate its change promotion by querying a configurable set of standard metrics.
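Just to give you an intuition for what such a gate checks, it might look something like the following. This is a hypothetical shape, not our actual pipeline configuration, and all the metric names are made up for illustration:

```yaml
# Hypothetical promotion gate: promote the webhook to prod only if these
# Prometheus queries hold for the whole soak window.
promotionGate:
  soakDuration: 30m
  prometheusChecks:
    # no failed injections observed for the new webhook version
    - expr: sum(rate(sidecar_injection_failures_total{webhook="logging"}[10m])) == 0
    # functional tests inside the synthetic pods keep passing
    - expr: min_over_time(synthetic_sidecar_test_success[10m]) == 1
```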
Every managed Kubernetes cluster also ships with what is called a synthetic deployment, which is configured to receive all the important sidecars we care to monitor, and a scaler component, shown at the bottom left, which is configured to continuously scale the synthetic deployment up and down so that it keeps receiving new injections, continuously, in exactly the same way a customer pod would. The synthetic deployment also has a test sidecar which continuously tests the functionality provided by the injected sidecars and exposes Prometheus metrics. There are also additional tests running outside the synthetic pod which expose additional Prometheus metrics; these exercise sidecar functionality which cannot be exercised by the test sidecar running inside the synthetic pod. The fit Spinnaker stage, which queries for metrics, evaluates a mathematical function over all the metrics it cares about, not only from the tests running on synthetic pods but also from the metrics emitted by the sidecars themselves in actual customer pods, to decide whether the new webhook deployment is working as expected. The scaler component ensures high availability of the webhook and also ensures that we don't need to rely on an actual service deployment to know if the new webhook is broken or unavailable. The metrics coming out of the synthetic pod and the sidecars are not only used at change promotion time but also feed always-on triggers that alert the operations on-call about malfunctioning webhooks or degraded functionality.

Multi-tenant clusters are the norm across Salesforce. Testing starts with the sidecar owner testing their changes in a standalone test cluster they own, against the current version. The Kubernetes Reliability Engineering team, or Kali for short, takes stable versions of all sidecars published by the sidecar owners and tests them in their own test clusters. After they are done with their testing, they start rolling out a tested bundle of all sidecars in dev clusters. They use the framework discussed on the previous slide for continuous monitoring of any alerts while the rollout is going on in the dev clusters. The Kali team waits for the new changes to sync and bake in the dev clusters for some days before expanding the rollout to new clusters and new environments, until all of production is done. All sidecars are expected to make backward-compatible changes when rolling out new sidecar versions. There is always a mix of old and new sidecar versions running together in the same cluster; that is expected, and so far it has not caused issues in the scenarios we are using them for. Again, these processes are not perfect, and we are still refining and automating them, and you will hear more as we perfect them.

So what are the challenges of operating these webhooks? There are many. Sidecars look easy and magical, in that the extra infrastructure functionality the sidecar provides now seems like someone else's problem, but it does not come free of cost. If the service owner's pod now depends on the other sidecars, it basically means the creation of the service owner's pod is dependent on the availability of eight other webhooks, or some combination of those webhooks, depending on how the sidecars are implemented. Also, service owners are sometimes surprised while debugging about where these new injections are coming from.
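One way to take the surprise out of it is to compare the pod spec the service owner submitted with what the API server actually persisted. A pod created with a single app container might come back looking roughly like this; all names here are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-service-7d9f
  annotations:
    logging.example.com/inject: enabled   # the annotation that triggered injection
spec:
  containers:
    - name: app                           # written by the service owner
      image: example.com/my-service:3.2
    - name: logging-sidecar               # added by the mutating webhook
      image: example.com/logging-agent:1.4
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
  volumes:
    - name: logs                          # also added by the webhook
      emptyDir: {}
```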
Another issue is when a sidecar upgrade should be triggered, and who decides that. In a multi-tenant cluster, should the cluster owner choose to upgrade all sidecars for that cluster, or the webhook owner as and when new sidecar versions are introduced, or should it be the service owner? Currently, we let the cluster owner roll out new versions, and the service owner picks them up on their next deployment.

As a webhook owner, there are many responsibilities other than the availability of the webhook itself: upgrading the webhook for newer versions of Kubernetes, making sure new versions of sidecars are backward compatible, deciding on an opt-in or opt-out model for the sidecar, coordinating with other sidecar owners about potential interdependencies, and making sure these webhooks don't cause deadlocks. One very good example we have seen in the past: a service owner's pod is not coming up because the API server is not able to reach the webhook; the API server is not able to reach the webhook because CoreDNS is not up; and CoreDNS is not up because the CoreDNS pods are themselves waiting on the webhook. These kinds of scenarios should be avoided.

So what are some alternatives to webhooks? Webhooks are not the only way to inject sidecars. Before mutating admission controllers, service owners injected sidecars inline in their Helm charts or Kubernetes API configuration. Another popular pattern we have used in our physical data centers is the per-cluster controller/operator model, which fetches the Kubernetes YAMLs, injects the sidecars statically, and then applies the final manifest to the kube-apiserver. This is an easier and often preferred model, but it requires coordination between sidecar owners and cluster operators when configuration changes are needed or a new version needs to be rolled out. You can also do this in Spinnaker at bake time.

So when should people write a webhook? Our answer is, surprisingly: when they don't have any other option, or when the static inline injection model doesn't work for some reason. As we have seen from the previous slides, it's not easy, and operating a set of admission controllers has its own set of challenges. In evaluating your sidecar needs, if you think the sidecar is very broadly applicable to all clusters and all services within your organization, it's probably a good candidate for a mutating admission controller. Another reason is when you want to inject centrally and automatically, without giving control to users, or when your users don't even care how that functionality works and have nothing to configure in the sidecar itself based on their service requirements. Sometimes experimental features, which we are not sure we should bet on long term, are also good candidates, because you don't need to involve the service owner and can transparently roll out a new sidecar, measure its functionality and performance, and then decide, without needing coordination with service owners.

What objects should the webhook act on? Admission controllers, as you all know, can be configured to act on any Kubernetes resource. They can modify workload APIs like Deployments or Pods, as well as other resources like Services, Ingresses, Secrets, etc. Configuring your admission controllers to act on Deployments, StatefulSets, and Jobs is generally better. This is because if, for any reason, your pod restarts or gets evicted, it doesn't need the webhook to be up to get the sidecars; the sidecars are already injected into your workload API, and by workload API I mean a Deployment, StatefulSet, Job, or CronJob. In registration terms, the difference between the two approaches is just the webhook's rules stanza, as shown in the sketch below.
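Here are both variants side by side; the API groups shown are for current Kubernetes versions:

```yaml
# Variant A: act on workload APIs. The sidecar lands in the pod template, so
# later pod restarts and evictions do not depend on the webhook being up.
rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments", "statefulsets"]
  - apiGroups: ["batch"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["jobs", "cronjobs"]
---
# Variant B: act on pods. One injection path covers every workload type, but
# every pod (re)creation now depends on the webhook's availability.
rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
```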
Acting on workload APIs has its drawbacks, though, in that every mutating webhook has to understand all the workload API types and inject accordingly. Configuring your admission controller on pod creation or update is the more common pattern we see, and in our opinion it is also the more dangerous one, since pods can get evicted or restarted for many reasons, and your webhook's availability is of paramount importance in those scenarios. It's also easier for mutating webhooks, since the pod is the least common denominator: injecting into pods works no matter what top-level workload API you are using, so the complexity of the admission controller code acting on pods is minimal.

Another common scenario is injecting certificates into your containers. One pattern is how cert-manager gives you certificates using Secrets. This model does not require injecting a sidecar for requesting certificates and rotating them. Using Secrets instead of a sidecar is better in this scenario; that is, getting certificates from a Secret is better than injecting a sidecar which will fetch and refresh certificates.

Another thing to think about is how broadly applicable the webhook you are writing is. Is it applicable to cluster-scoped objects? Is it applicable to all namespaces? Do you want to limit it to only a few namespaces, or to specific objects using an object selector?

What dependencies should we take on the pod restart path? This is related to the previous slide. Injecting sidecars on pod creation, as we have already seen, is dangerous: a successful pod creation in this case depends on the availability of eight other webhooks. You want to minimize the number of dependencies you take on the pod creation path to maintain high availability of your services. Pods can be evicted due to zone failures, node failures, or patching, and in those cases you want the evicted pods to come back up as quickly as possible, with minimal interdependencies.

There are many best practices listed on the Kubernetes documentation pages. We are not going to cover all of them here; there is a link you can follow to the official documentation for mutating admission controllers. Here I have listed some more guidelines and recommendations which are good to keep in mind when operating mutating webhooks. First, you should run multiple replicas of your webhook for high availability. This ensures that your service stays up: if one of the replicas of the mutating admission controller goes down, there are others which can serve, which of course assumes there is a load balancer in front of your admission controller pods. Use an opt-in model in the beginning, when you are testing out new sidecars, and as they become more stable, you can slowly switch to an opt-out model. You should also spread across three availability zones by specifying pod anti-affinity with the zone topology key; if you are using a newer version of Kubernetes, you can take a look at the pod topology spread constraints feature to ensure your webhook pods are spread across failure zones to improve availability. Webhooks should be idempotent, so that re-invocation does not cause side effects, and your webhook should not take a lot of time to process a request. You should also exclude kube-system and other important namespaces, especially the namespace in which the webhook itself is running; we have actually been burned by this a couple of times.
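One way to implement that exclusion is a namespaceSelector on the webhook itself, so the API server never even calls it for opted-out namespaces. A sketch, with an illustrative label key; the idea is to label kube-system and the injector's own namespace accordingly:

```yaml
webhooks:
  - name: inject.sidecars.example.com
    namespaceSelector:
      matchExpressions:
        - key: sidecar-injection.example.com/enabled
          operator: NotIn
          values: ["false"]   # unlabeled namespaces still receive injection
```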
You also need to order webhooks correctly to avoid dependency deadlock situations. For example, two webhooks cannot depend on each other, and sometimes you may have to inject the sidecars statically into those webhooks' own deployments to avoid dependency issues. Make sure your operations team knows how to recover from webhook failures, how to disable webhooks, and how to temporarily limit their scope using label selectors in the MutatingWebhookConfiguration, in order to recover from failures or deadlock situations.

To summarize: sidecars are a ubiquitous tool that helps decouple your common infrastructure requirements from application code. Mutating admission controllers have helped sidecars become more popular and easier to operate, from the sidecar owner's point of view. Operating infrastructure services as mutating admission controllers is no easy task and requires continuous monitoring and high availability of the admission controllers. Not all sidecars need to be deployed using mutating admission controllers. With a common shared framework, adopted best practices, and continuous monitoring, you can manage admission controllers more effectively too. In the future, rolling updates of sidecars in a cluster in a controlled manner, and giving our service owners more control over which versions they want to run, for example in break-glass scenarios, with a deadline for how long those versions will be supported, would also be useful to have.

This is all I had. Hopefully, you have all learned something from this presentation. I am now available to take any questions you might have related to this topic. Thank you.