Good morning. Welcome, everyone. Thanks for coming out. I know we overlap with a keynote, so I appreciate you making time to come out and see our talk. My name is Michael Elder, and I'm here today with Shikha. We're going to talk to you about the seven missing factors for production applications: thinking about what it is, beyond 12-factor applications, that you need in order to run Kubernetes in production.

When you think about what's required for a production application, there are a number of key factors beyond just "does my code function correctly?" You need to think about how secure the application is, so managing TLS certificates becomes a really important aspect. You have to think about how resilient the application is: does it respond correctly to failure? And as you manage the application, how does it tie into observability? How do I actually measure its health? How do I ensure that it's running correctly on an ongoing basis? We'll talk more about these factors and provide some concrete examples through this talk.

So how many of you are already familiar with this notion of a 12-factor application? A few hands? OK. 12 factor is a methodology. It says there are 12 specific practices that you should follow when you're building scalable microservices, and we'll talk about those in just a moment. It gives you a way of defining, step by step, the best practices that you want to use as you create applications. And it turns out that Kubernetes really has taken a lot of these factors into account in designing the Kubernetes API and things like deployments and stateful sets. We talked about this at KubeCon Barcelona; you can watch that recorded session where we explain the relationship of 12-factor applications to Kubernetes, and the link there is an article that walks through how that lays out.

But today, we're going to go a little bit further. When you look at the 12 factors, there are, well, a dozen of them, and they focus on many different aspects of the application. Sometimes, if you're not familiar with them, they might seem a bit overwhelming: why do I need to do all of these different things? So let's talk about these just briefly, and then we're going to go beyond these basic factors and look at what's missing, at what some of the additional best practices are. For context, Shikha and I work for IBM, and we build products that use Kubernetes as part of the foundation. As we've been doing this over the last three or four years, we've identified several common practices that we think are required to run applications in production. That's what we're going to go through in this conversation. But let's look at the basics first.

With a 12-factor app, what you're really thinking about are three parts of the lifecycle: how do I write the application and manage its code, how do I deploy the application, and then how do I operate it? If you're not familiar with 12 factor as a general set of principles, this is just a way to think about it. We'll talk through it here and break it down, and ultimately it may give you a simple way to get an introduction to what you need to do with 12-factor apps.

So, code base. When we think about the code base, we're really thinking about how I define the application and how I manage it. The great thing about the Kubernetes design model is that it allows you to represent everything as text. All parts of the application are code.
Therefore, I can manage them with revision control, with source control, and I can make them part of an automated build process. For developers, this means that I can get out of the details of explaining how to scale the application; instead, I can declare things like "I want three replicas," and Kubernetes will add more replicas or take them away based on that desired state. Now, this is true from the container image and how I represent that using a Dockerfile, all the way through to the Kubernetes API and representing things like a deployment, or networking services, or how the application talks to persistent storage.

So, code base. And really, what you'll notice is that we represent the factors with this sort of "F" plus a Roman numeral: I, II, and so on. We'll use that so that we can tie things back to the 12 factors through the conversation. Ultimately, if you look at code base, the build, release, and run cycle, and development and production parity, which are three of those 12 factors, you can think about them really as how you manage your code.

The next group of factors is how I deploy code. What's neat about this is that, within the various factors, one of the key principles is to separate the configuration from the code base. Kubernetes introduces specific types called ConfigMap and Secret, which help you manage that configuration in a way that's still declarative and can still be source controlled. The picture here represents what's called a Kubernetes pod. Kubernetes runs one or more containers in the pod and then mounts volumes, that storage, into the pod itself. I can use a ConfigMap and a Secret to manage configuration outside of the pod, outside of the code base, and so my ConfigMaps or Secrets for production environments versus development environments can be different.

Services also become an important aspect: exposing everything in my system, every service, through port binding. What's the way to think about this? A port binding gives me a very simple way of exposing any service in the system on a port. And as I wrap that service in a container, it means I can actually have multiple versions of that service running side by side, each believing that it's using the same port. What we're doing is using network namespaces in the container to isolate its networking layer, and that becomes a very powerful principle if you want to test multiple versions side by side in the same environment. It also allows us to represent the pods and containers using the Linux process model, which has a lot of efficiencies for scheduling, and allows us to treat pods as things which are disposable, meaning that if they fail, it's OK: I can recover them and start new pods.

So the operate factors then become the next grouping, right? We've talked about code factors, how I write code and how I manage code. We've talked about deploy factors, how I actually get that code into a running environment. And now the third group of the basics is how I operate. The key thing here, what you're seeing in this picture, is a change from one state to another. It's demonstrating a deployment, which is a Kubernetes controller and API object, using a replica set to add or remove pods.
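As a rough sketch of what "everything as text" looks like, here is a ConfigMap holding configuration outside the code and a Deployment that declares three replicas and pulls its settings from that ConfigMap. This is not a manifest from the talk; the names, image, and keys are hypothetical, just to illustrate the pattern.

```yaml
# Hypothetical example: the application is declared as text that can
# live in source control alongside the code.
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-config                   # hypothetical name
data:
  GREETING: "hello"                  # configuration kept outside the image
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                        # declare the desired scale; Kubernetes reconciles it
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # hypothetical image
        ports:
        - containerPort: 8080        # the port binding exposed by this container
        envFrom:
        - configMapRef:
            name: web-config         # config injected at deploy time, not baked into the image
```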
And what's powerful is that I can simply change the text of my deployment and apply that text, maybe as a developer from my command line, or from a continuous delivery pipeline as part of an automated ongoing process. Concurrency also becomes a really powerful way to take on more capacity: more subscribers, more users, more transactions. I can use the built-in autoscalers, also part of the foundation in Kubernetes, to add or remove pods dynamically. And again, I can describe in a declarative way the policies and the descriptions that I want. The horizontal autoscaler will add additional pods; the vertical autoscaler simply makes those pods bigger. It adds more memory, more CPU, and allows them to use more of the compute infrastructure.

So what I want to do now is hand it over to Shikha, and she's going to take you through the next part of this journey. We've laid down the foundation around 12-factor apps; let's talk about what you need beyond that.

Thanks, Michael. So as Michael mentioned, we work at IBM, and we are creating these services to run on Kubernetes. As we were looking at all our services, we wanted to standardize some principles, some design principles, that each of the services follows. So the 12 factors were very handy; they map to Kubernetes, and we were following them. But then we figured out that there are these other seven factors. Am I echoing? Am I echoing? I'm echoing. OK. These are additional factors which, unconsciously, we were applying to every service, because we need to get these services running in a production environment, and they are very critical for running any service in production, in an enterprise organization. So that's where these originated. As a matter of fact, we were writing down all the principles, all the guidelines, for the services, and these stood out; they did not map to the existing 12 factors. So Michael and I got together and categorized these as seven missing factors that we need to take care of in any service that we create at IBM. And of course, it's good to share them with the rest of you too. I won't go through them all here in detail, because there's a slide for each one of them.

Observability. In a distributed environment, it's really hard to manage everything and make sure every piece is running properly. Because it's distributed, there are multiple moving parts, all moving at the same time, and anything can go wrong and impact another service. Think about 100 microservices: one goes wrong, and it impacts 99, or 10, other microservices. So you really need something built in to make sure your services or your apps are resilient. Kubernetes has some features, and it's up to us to take advantage of them.

The readiness probe is a great one to have in each of your microservices, in each of your pods, and it really helps make your system resilient. As you can see in the animated GIF, in each pod, in each microservice, you can check for the readiness of all of your dependencies, and if all your dependencies are ready, then your pod is ready. The way Kubernetes uses it is that it only sends traffic to your pod if it finds that your pod is responding with success to the readiness probe. So think about a front end, dependent on a database and a BizLogic microservice. You want to check in your front end microservice that your BizLogic is responding and your database is responding and ready as well. So that's the kind of checking you do.
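As a rough illustration of that pattern, here is a minimal pod with a readiness probe. The image, port, and /readiness path are hypothetical; the idea is that the handler behind that path only returns success once the database and BizLogic dependencies answer.

```yaml
# Hypothetical front-end pod: Kubernetes only routes traffic to it
# while the readiness probe returns success.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: frontend
    image: example.com/frontend:1.0   # hypothetical image
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /readiness              # hypothetical path; its handler checks the DB and BizLogic dependencies
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 3
```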
One other thing that we're using the readiness probe heavily for now is API toleration: as your dependencies change their versions, their API versions, we want to make sure that my microservice, which depends on other services, can tolerate them. That's something I check in the readiness probe as well, to make sure I have the right APIs of my dependencies in the system before I respond to traffic.

The liveness probe is there too: Kubernetes checks the liveness probe, and if the pod is not live, it just replaces the pod. So it's very important to make sure that initialDelaySeconds is set right, because if your initial delay is less than your application's boot time, the pod gets into a restart cycle and your application will have a hard time coming up. So that's something to keep in mind there.

Well, is this enough? My application, for example the front end application that I described, is accepting traffic and doing some transactions, so I might be interested in transactions per second. That's not something that Kubernetes gives you out of the box as part of the health check. So I can introduce my own metrics: I can have my custom metric, I can have an exporter for it, collect it in Prometheus, and have a Grafana dashboard. That's what you can use if you have Prometheus and Grafana as the monitoring system in your Kubernetes environment. In the cloud provider that we have at IBM, we do have Grafana and Prometheus as our out-of-the-box monitoring solution. And to help with this last use case, what we have done is create CRDs for the Grafana dashboard. So in production, if you're an operator and you find that you really want a few of these metrics to be monitored for these three mission-critical applications, no problem: you can introduce your own Grafana dashboard and set all the thresholds and everything that you need.

All right. Okay, schedulability. This is also not covered as part of the 12 factors that we could see, and here's a good example to consider when you are thinking about this. Your organization wants to venture into Kubernetes and sets up the Kubernetes environment, and you're the first team to pioneer an application that runs in Kubernetes. It comes up well, coded well; it's up and running, everything is awesome, performance is great. Then another team comes in and starts to experiment. They also come up with their application, application B. It's working well, but slowly your application's performance starts going down. The first thing to check is: do your pods, do your containers, have the compute resources they need? That's the first place to check, because what Kubernetes does is, if your containers do not have requests and limits defined for CPU and memory, it just gives those compute resources to somebody else and your application will get starved. So that's where defining the compute resources for your pods comes in.

Another neat thing is, if you're an administrator setting up your Kubernetes environment with different namespaces for dev, staging, and production, you might want to allocate a resource quota for each. And it's not only CPU and memory limits you can define in a resource quota; you can also define the number of pods, persistent volume claims, and so on and so forth.
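A hedged sketch of both ideas follows: per-container requests and limits so the scheduler knows what the pod needs, and a namespace-level ResourceQuota an administrator might set for a team. All of the names and numbers here are made up for illustration.

```yaml
# Hypothetical pod with explicit compute requests and limits,
# so it lands on a node that can actually honor them.
apiVersion: v1
kind: Pod
metadata:
  name: app-b
  namespace: team-b                  # hypothetical namespace
spec:
  containers:
  - name: app-b
    image: example.com/app-b:1.0     # hypothetical image
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
---
# Hypothetical quota for the whole namespace: caps CPU, memory,
# pod count, and persistent volume claims for that team.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-b-quota
  namespace: team-b
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
    persistentvolumeclaims: "10"
```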
So that really helps make sure your environment is set up right, and it's really useful in a production environment if you're an operator and you're planning to set up that environment and have multiple teams work in the same cluster.

All right, let's go to upgradability. When applications are running in production, there are always security enhancements and feature enhancements that we have to put in, and they are required to be backwards compatible. It's not just a new version of the application; it's the fixes that you apply to the application. That's where rolling upgrades are very handy. As shown in the animation here, Kubernetes provides a really great way to introduce version two of your deployment: your new pod comes up, and gradually, once the whole application is up, your service will point to the new version two of your application. We find this very handy for applications that need fixes on a regular basis, for security, for feature enhancements, for any other fixes that you can think of that go into production. A couple of things to keep in mind. I did not find this in the Kubernetes docs, but it's important that maxSurge is defined. maxSurge is the number of additional pods that you want to be available beyond the number of replicas you have set up. It's an important one to define; otherwise Kubernetes might take pods away and make fewer pods available to you. So that's an important parameter to set. maxUnavailable is the number of pods you allow to be unavailable at any point in time while the rolling upgrade is happening.

All right, least privilege. Michael touched on how you create your containers: a trusted image from a good source, et cetera. These are some additional things to keep in mind when you have the containers up and running. You want your containers to have as few privileges as possible, because any additional privilege is a source of attack. There is a pod security policy in Kubernetes that you can use for limiting or restricting the actions that you allow your containers to take in your environment: access to the host, access to the file system, the types of volumes allowed, et cetera. So that's another thing we find useful, and there's a small sketch of what this looks like at the pod level just below. We also have, in the cloud provider that we have, the ability when you create a namespace to tell Kubernetes what kind of pod security you want to apply, and that gets applied to all the pods running in that namespace.

Auditable. When you are working in a production environment, an enterprise where maybe you are allowing credit transactions to happen, you definitely want each and every critical create, update, and delete operation to be logged, because you never know when you'll have to go back and do a post-mortem using all the audit records you've taken account of, or you may have to figure out who's violating policy or who's not doing the right actions. So that's important from the production, enterprise-application point of view. We follow the CADF open source format, which helps log every detail: the initiator ID, who is starting the operation; the target URI, which is the specific target getting operated on; and the action, along with the resource type that's being acted on. So that's another important one to have as a factor when you're creating your application.
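To make the least-privilege point concrete (the sketch promised above), here is one way a pod can drop privileges in its own spec, the kind of settings a pod security policy would then enforce cluster-wide. The names and image are hypothetical.

```yaml
# Hypothetical pod running with as few privileges as possible.
apiVersion: v1
kind: Pod
metadata:
  name: least-privilege-demo
spec:
  securityContext:
    runAsNonRoot: true               # refuse to start if the image wants to run as root
    runAsUser: 1000                  # arbitrary non-root UID, for illustration
  containers:
  - name: app
    image: example.com/app:1.0       # hypothetical image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true   # the container cannot write to its own filesystem
      capabilities:
        drop:
        - ALL                        # drop all Linux capabilities
```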
Access control. This whole topic could be a session by itself, or a session for a whole day. There are several aspects to making sure your application is secure when it's running at enterprise scale. It has the right authentication and authorization when a user accesses the application. It has the right level of network policies and security policies applied. It's running in an environment that has the right level of policies; for example, a policy could govern who's changing the configuration of my application. You might want to put a policy in place to allow only operators and administrators to change the configuration of an application. That's an important one to consider when you are putting your application in production. I won't go through this in detail.

Certificate management. This is part of the previous factor. Any communication between pods should be over TLS, and if you are letting your front end be accessed from the browser, then there should be a certificate established there as well. In the cloud provider that we have, we use cert-manager, the open source project, as our certificate manager. We use it to make sure that we can handle certificate rotation, renew certificates, and everything that goes with making sure any communication between the pods, between the services, or from the browser to the application is all done right.

And that's the last factor, I believe. Ultimately, whatever you set up in your Kubernetes environment, somebody has to pay for it. So make sure that your application has a way, based on whatever mechanism your cloud provider offers, to be measurable, so you know how much of the resources you are consuming from the Kubernetes environment. So that's the last factor, I believe, that we have. The next slide.

So what really makes an app production-ready? There are three things that we touched on, and our 12-plus-seven factors are covered across each one of the categories here. How are you building the containers? Trusted, small images. How are you configuring Kubernetes to host your containers? That's the second one. And then finally, the cloud provider you're using: is it providing all the capabilities you need to run your application in production?

All right, so do we do the demo first, or do you want to talk about the life cycle first? Okay, let's do the demo first. We're using Michael's laptop here for the demo. So, there are a lot of topics we touched on, and I'm not sure we can demo all aspects of them. What we thought we could demo is creating a resource. We've already created the resource here with three replicas, so let me bring that up, and let's see if it's running. Okay, we still have a problem, it's not coming up yet, but we have the NGINX demo resource created. What I wanted to show here is, as we created the resource, the limits that we set; kubectl get... let me take a step back and tell you exactly what we are trying to demo here first. And it was in the default namespace, I think. Okay, I'll try one more time; I think I was missing the default namespace. What we are trying to demo here, maybe if I look here I can see it better, my eyesight is not that great, is this: we create the resource, and I, as a production environment operator, want to create new dashboards.
What we have provided in our environment, in the cloud provider, is Grafana CRDs, so you can create Grafana dashboards on the fly. And that's what we wanted to show here. We also have the JetStack open source project to establish mutual TLS, or TLS, between the services, and we were going to show that as part of this as well.

So, maybe let's talk through the demo. Okay, let's talk through the demo first. In the demo file, we're creating the NGINX resource. The important things that we talked about: the number of replicas, we have three; then the strategy type is rolling update. The two parameters that are important are maxSurge and maxUnavailable. Never set both of those to zero, because that would tell Kubernetes it can have neither an additional pod nor an unavailable pod during the rollout. And then the liveness and readiness probes that we talked about. There are three probe mechanisms, HTTP, TCP, and command line (exec), and you can apply them to check for the readiness and the liveness of your pod. The path is important; that's the endpoint where you put all your health checks, where you check whether your dependencies are ready or not. The path here is /readiness.

And then here is the important one: when you are doing your TLS setup and making sure all your communication goes through a certificate, you want to make sure that you have your certificate authority in place, you have your issuer in place, you've registered your certificate with the certificate manager, and you have the certificate mounted so you get the secrets, and you use those secrets in your custom resource. So right here at the bottom is the secret name; that's where I'm getting the cert from, demo1-nginx-cert, that's the secret where the certificate is available. I get that certificate and then I can do my secure communication to the service. Okay, so that's the one: the certificate is defined in the secret, and I access it through the secret. Do you want to add anything more?

Oh, yes. Just like the Grafana dashboards are CRDs, as we talked about, any custom resources we are bringing in, any new resources that we are bringing in so that applications can take advantage of them, all come in through CRDs. So the Grafana dashboard is a CRD, certificates come through a CRD, and you can introduce your own CR for that CRD; the certificate is in the same boat. We have alerts and alert rules as CRDs as well. So any new resource that we are introducing is a CRD.

Yes, so that's the dashboard, created 42 minutes ago; that's the dashboard we created right before we started the session. We created it through the command line as a resource, and that's the dashboard that's coming up; it's been collecting, for the last few minutes, the memory for the application, the NGINX resource that I just created.

I think the key thing we wanted to really walk through is just showing that all of these different characteristics, from the 12 factors to the seven missing factors, are things that typically can be represented as source code, leveraging this powerful Kubernetes programming model to address all of these application needs.
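For readers following along without the demo file, here is an approximate sketch of the certificate piece: a cert-manager Certificate that asks an issuer to sign a cert and write it into a secret, and a pod that mounts that secret. This uses the current cert-manager.io/v1 API and hypothetical names (including demo1-nginx-cert and demo-issuer), so it is an approximation of what was shown, not the exact demo manifest.

```yaml
# Hypothetical cert-manager Certificate: the issuer signs a cert and
# stores the resulting key pair in the named secret.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo1-nginx
spec:
  secretName: demo1-nginx-cert       # where the signed cert and key land
  issuerRef:
    name: demo-issuer                # hypothetical Issuer registered in the namespace
    kind: Issuer
  dnsNames:
  - nginx.default.svc
---
# Hypothetical pod mounting that secret so NGINX can serve TLS.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo
spec:
  containers:
  - name: nginx
    image: nginx:1.17
    ports:
    - containerPort: 443
    volumeMounts:
    - name: tls
      mountPath: /etc/nginx/tls      # cert and key appear as files here
      readOnly: true
  volumes:
  - name: tls
    secret:
      secretName: demo1-nginx-cert
```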
In terms of some of the things I'll leave you with to think about, just a quick highlight: as we think about managing one cluster, the next problem really becomes how do I manage many clusters? So I'd invite you to take a look at some of the things we're doing with our multi-cloud manager. It brings a language to deploy and manage applications using the Application SIG CRD, and it also brings a policy language to manage security, while providing visibility into the health of applications and the readiness of these clusters for compliance.

And then a few additional resources. The three titles that you see here are articles on Medium, and they go into more detail: the 12-factor base example; the seven missing factors, which has really been the core of this talk; and some additional guidance on readiness probes. Myself and a few other folks from IBM also co-authored a book around Kubernetes, just to highlight some of the introductions, how to run applications, how to run clusters, and you can download a free copy from that link if you'd like.

And with that, we're, I think, almost at time, so maybe we'll take questions up here if there are any, if you want to hang out. And if I can gently request, if you're open to it, I'd love to get a quick selfie before we close. Is that okay with everyone? All right. So Shikha, you want to jump in with me real quick? Yeah. And maybe on the count of three, if we can say Kubernetes, all right? All right. One, two, three. Kubernetes. I don't see it. You're okay. Awesome. Thank you so much for obliging. We'd love to talk to you more. Please feel free to reach out to us either on Twitter or come talk to us after the talk. Thank you.