So, first of all, thank you all very much for coming; this is way more people than we would ever have expected. We know it's late in the conference, the last couple of days were pretty hard, and we really appreciate your stamina in holding on until the very end. One thing before we dive in: in the first-day keynote, Chris mentioned that 58% of the participants of this KubeCon were first-time attendees, so can I quickly check who in here is at KubeCon for the first time? All right, that's quite a few. Cool, thanks. And how many of you are currently in the process of moving towards Kubernetes and considering it as a production platform for your workloads? Still a few, okay, that's perfect. This is, as described, an intro-level talk, and I hope we can share some of our experience and the things we've learned so far with you. So this is the title: How to Make Kubernetes Rhyme with Prod Readiness. My name is Matthias; I come from a small-to-medium-sized company in Germany focused on software engineering: modern, distributed, cloud-native, microservices-based software. And I'm here with my colleague. — Hi, I'm Tiffany Jernigan, and I'm way shorter, so this is kind of fun. I'm a developer advocate at VMware. Actually, I'm curious: who here has heard the title "developer advocate" before? Okay, not that many, but some. We do things like this talk as part of the job. VMware is kind of the opposite case: we don't do the consulting, but we have a bunch of our own products that are used in this whole cloud-native, Kubernetes, and beyond space. If you still use Twitter, our handles are at the bottom there, and I think my Mastodon link is on my Twitter too, so there's that. All right, thanks.
So before we dive into the details, we first wanted to share with you how we came to think of doing this talk, which also has a bit to do with our working backgrounds, as we just said. Tiffany works for a company that provides a product for running Kubernetes production-ready, with all the things that should contain, whereas I come more from the technology-consultancy side, where we help people get their workloads onto the cloud, but also build their stacks and maintain them. So with that, we want to combine the things we've both seen so far into advice we can give back to you. In general we're not fans of bulleted lists, so you're going to see mostly visualizations in these slides; I hope they make sense to most of you. In the end, this is what it's all about: the users and the happiness of the users. The users are the lovely people here at the bottom left, and we want to make them happy, of course; I think most of you want that too. Then there's the software running in the cloud, based on Kubernetes: containerized, modern software. And we want to talk about how to make that service good enough that all the users are happy. That brings us to the point: what actually is production readiness? I don't think there is an official definition, but we tried to summarize it at a high level: it's the state of a system which is fully prepared for and capable of running production workloads, and which provides the level of service and performance required by its users. So again, that goes back to that picture.
Of course, this definition is not bound or scoped to what Kubernetes does; it's way more than that. But with this talk, as we are at KubeCon, we certainly want to focus on cloud-native and CNCF-based technologies. There's a bit more to it, and I said I don't like bulleted lists, but there's another one, and that's going to be it. I'm not going to read through all of it, but these are a couple of more detailed things we would take into account to call something prod-ready. I think the second bullet summarizes it pretty well: ensure it's reliable, stable, and secure. That also includes that it performs well under expected and unexpected conditions, that it's adaptive to changes, and basically whatever it takes to be up, healthy, and providing service to the end users. Of course that includes being monitored and observable, giving whoever administers it the ability to say what's going on right now, or to make some predictive analysis of what might happen in the future, in order to take the necessary steps. Right, so with that I'm handing over to Tiffany. — All right. So let's first talk a little bit about what vanilla Kubernetes actually provides. The concept of prod readiness in general is bigger than just the space of Kubernetes, but since this is KubeCon, we're narrowing the scope down to Kubernetes and the cloud-native, CNCF-landscape type of thing. As mentioned a little earlier, this is more of an entry-to-intermediate-level talk. Since it's the last day of the conference, I'm assuming you've learned at least something this week related to Kubernetes; if not, the videos are online, so it's okay. So, at a super high level, for anyone who is... make sure you're all awake! Sorry, I'm going to put that away.
I don't know what that was; go away. At a very high level, Kubernetes runs and manages your container workloads. It's like: hey, I have this application, and what if I want a bunch of copies of it running? What if I want high availability? How do I handle all that? If you're still pretty unfamiliar with Kubernetes, there's a QR code up here for a blog I wrote on getting started with Kubernetes, which might be useful for some people. In this diagram you can basically see a bunch of users and a bunch of pods the users are creating. You can see these cute little log things there, because, you know, people definitely still print out paper for their logs nowadays. But by default in Kubernetes you can use kubectl ("kube-cuddle") and do kubectl logs, and if you have access to the control plane, you can SSH to a node and find your kubelet logs; those logs are all there by default. So at a high level, Kubernetes is a distributed platform for running, connecting, and logging your containerized workloads. You can scale at the level of infrastructure: it's not like you create a Kubernetes cluster, say you'll only ever have four nodes, and then later, when you need 20, have to kill your cluster or be scared; you can actually just add more nodes, and that is totally fine. As a result you can also scale it down. And if you're using some of the managed Kubernetes offerings, they make this type of thing a lot easier for you as well. In addition to scaling the infrastructure up and down, you can also do that at the workload level. For instance, you could have a deployment and say: hey, I want 20 copies of my Redis application, or whatever you want, and have it be highly available; you could have a bunch of nodes and it'll distribute those replicas across the different nodes. You can do things like: I have a new version of whatever I'm running, do some sort of graceful migration. There's also fault tolerance: assuming you're not using just a plain pod, which would simply die and leave you to start a new one yourself, there will be automatic restarts. As was mentioned, there's load balancing between the different replicas of whatever workload you're running, there's the ability to do scaling, and there's the ability to do updates without actually having downtime. So there's just a lot you get there. As another summary of all of that: the focus is on high availability and resilience of your workloads, fault tolerance, elasticity of both the infrastructure and the workloads, as well as extensibility. All right, thanks.
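Those workload-level knobs can be sketched in a minimal Deployment manifest. This is a hypothetical example, not from the talk's demo; the names and image are made up. The `replicas` field gives you the copies, and the rolling-update strategy gives you the updates without downtime:

```yaml
# Hypothetical sketch; app name and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
spec:
  replicas: 20            # "I want 20 copies of my application"
  selector:
    matchLabels:
      app: hello-app
  strategy:
    type: RollingUpdate   # replace pods gradually: updates without downtime
    rollingUpdate:
      maxUnavailable: 1   # keep at most one replica down during a rollout
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      containers:
      - name: hello
        image: registry.example.com/hello-app:1.0   # placeholder image
```

Scaling back down would then be a matter of `kubectl scale deployment hello-app --replicas=5`, and the default logging mentioned above is reachable via `kubectl logs deployment/hello-app`.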
So, this was the trying-to-make-sure-we're-on-the-same-page part: a common awareness and understanding of what Kubernetes will do for you. Looking back at the topic of production readiness, you can clearly see that the core of Kubernetes, the internal part which is available all the time, has a strong focus on the reliability and scalability side. That brings us to the next part: okay, if I have my application running in a container and I can deploy it to minikube or kind or whatever, how far am I still away from getting this into a productive landscape? You will see there are actually quite a few things missing, and it was also hard for us to separate them out, so we tried to categorize them a little, even though you will later see that a lot of them overlap and interact with each other. The five main categories where we say there are shortcomings, things you certainly have to look into and work on if you want to harden it for production, are: security and observability on the sides, network in the middle, infrastructure on the bottom, and the workloads on top. I'm going to tell you what I mean by each of those. If you run Kubernetes, you are of course responsible for providing the underlying hardware it requires. Tiffany said it's as easy as possible to scale up and down; that's fine, but you still need to make sure that underlying hardware is available. These can be VMs.
This can still be bare metal, this can be Raspberry Pis, whatever — probably not for productive workloads, but technically it can. You need to provide the storage and the storage classes for the persistent volumes. This is actually a good example of how Kubernetes works in many areas: there are API objects for storage, for example persistent volume claims and persistent volumes, and they work pretty much the same in every cluster. But if you want external storage attached, that is something you need to provide and make available, so that the Kubernetes API objects can actually consume it. In addition, to be on the safe side, we of course recommend doing backups, both of the cluster infrastructure itself and of the workloads. Also, if you want to connect to things outside of your Kubernetes world, say managed services like databases, messaging services, or legacy systems, it's up to you to make those available and accessible from the workloads inside. From infrastructure we go on to network. Networking is also something which probably every Kubernetes user will have to deal with. First of all, of course, you have to get the inbound traffic to the application, to make your applications accessible to external users. There are things like Ingress, which is again a Kubernetes API object that is mostly the same everywhere, but the ingress controller that implements it is something you need to choose, evaluate, and make sure is the right one for your purposes. Now there is the new Gateway API coming up, or you can add some kind of custom implementation, exposing load-balancer IPs and mapping DNS externally. The possibilities are there; the point I'm trying to make is that there is no single golden path, and it's not going to be there out of the box for you.
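As a small illustration of the Ingress object just mentioned — this is a hypothetical sketch, and it only does anything once an ingress controller (for example ingress-nginx) is installed in the cluster; the host and service names are invented:

```yaml
# Hypothetical sketch: requires an ingress controller to be installed;
# host and backend service are made-up names.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-app
spec:
  ingressClassName: nginx        # which controller should handle this object
  rules:
  - host: hello.example.com      # external DNS name you map yourself
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-app      # ClusterIP service in front of the pods
            port:
              number: 80
```

The manifest is portable, but the controller behind `ingressClassName`, and the external DNS and load-balancer setup, are exactly the parts you have to provide yourself.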
So you basically have to do that yourself. Going from external traffic to internal traffic: it may or may not be a requirement for you to encrypt or otherwise secure connections between workloads in the cluster, or, in a more advanced way, to do weighted routing or other advanced network features. Not everyone has these requirements, but many certainly do, and Kubernetes is not providing these things out of the box. You can put service meshes or something alike on top to basically get there. In the end you want traffic control: defining network policies, encryption, or, as I said, advanced routing. From network, the next part is observability. This was also one of the hot topics at this KubeCon, I think. As Tiffany said before, Kubernetes provides basic logging, and if you have a metrics server you can also get basic metrics from your nodes and from your pods. But that alone is probably not going to make you happy. What Kubernetes is missing here is a tool stack where you actually persist the information from your monitoring, something to visualize it, and something to evaluate it. Kubernetes does not go beyond logs; everything that concerns metrics and traces you will have to instrument yourself, make your workloads emit it, and make sure it's routed somewhere you can actually look into it. All right. Of course, we can only scratch the surface here; I'm pretty sure there are many more things, but we tried to summarize, as we only have half an hour of time. Security is next, and this could certainly fill not just one but many talks on its own. A couple of things you need to make sure you handle: first, of course, provide the right people the right access to the various resources.
For example, in most of our production scenarios, pretty much all of them, we don't give production access at the kubectl API level to any users. We sometimes do that on a time-based or ticket-based basis, so we can make sure somebody who really needs to go in, say to get a log for a certain time period, can do so; but in general we just cut that off. So: the right RBAC policies, service accounts, API access limitation, namespace isolation (depending on how you structure your namespaces, or whether you use multiple clusters), pod security — these are all very important things, and there's probably even more to it, but we'll leave it at that. Or — should I add something? — There's also stuff like dealing with certificates and OIDC tokens, and there are things like secrets managers, so that you don't accidentally do something like committing your secrets to GitHub, because it happens, and I may have posted secrets at some point. — Yeah, please don't do that, but I guess you knew that already. Now, something which is arguably a part of Kubernetes or not: the workloads. In the end, the consuming users of your application will probably not care where the workloads run, which programming language they are written in, or how they are containerized. But the production readiness of your overall system is of course the combination of the runtime and the workloads on top. That means if you harden your cluster in the best possible way and still fail on the workload side, that's probably not going to be a great thing. The question then is: what can you do on the workload side to go in the same direction and make the workloads stable, secure, observable, and so on? I hope this is visible. This is basically more a software engineering best practice than something specific to Kubernetes, but we always recommend automating all the steps in a repeatable, continuous way, so that you have a chain from source code up to your productive runtime with basically no manual steps involved: compilation, testing, packaging, containerization, and then pushing out to the cluster. And of course you should make all of this secure as well: encrypt your images, make sure you have safe base images, or use technologies like buildpacks that take those things off your hands. We can kind of call this a GitSecOps workflow, where all the things come from Git. Now, this slide is the summary of all those shortcomings, and yes, it is intended to look chaotic, because these are all the things, and potentially more, that you have to worry about in order to get your cluster to a state where you can say: okay, I feel safe deploying things to the world, and I'm not worried about things going wrong. So, what can you do then? How can you approach it? Again, there is no golden path, but just being aware of the things you actually have to do lets you start approaching that problem scenario. A checklist might help. This checklist makes no claim of completeness; every checklist will be different, because you will have different workloads, different client scenarios, and so on. But just going over it and asking: have I covered all those things? Are they working? Have I tested them thoroughly? — that's definitely a way to approach it. So while this was all completely conceptual, we'll now look a bit more into how you can approach it with technology from the CNCF world. — Okay, so who here has been in the booth area during this conference? That is way less than I expected.
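Before moving on to the tool landscape: the read-only, ticket-based access pattern described in the security section could be sketched roughly like this. Namespace and group name are invented for the example, and the time-boxing would live in whatever ticket process creates and later deletes the binding:

```yaml
# Hypothetical sketch: read-only access to pods and their logs in one
# namespace. A ticket-based process would create this RoleBinding for a
# window and remove it afterwards.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: log-reader
  namespace: my-app
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: log-reader-oncall
  namespace: my-app
subjects:
- kind: Group
  name: oncall-team            # made-up group from your identity provider
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: log-reader
  apiGroup: rbac.authorization.k8s.io
```

The point is that nobody gets a standing cluster-wide kubectl account; the narrowest Role that solves the ticket is granted, and only for as long as needed.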
Okay, sure. As you can probably see, there are just so many booths; the entire floor is tons of different companies, tons of different project booths, etc. That gives you a little bit of an idea of how vast the cloud-native slash Kubernetes landscape basically is. This lovely image: if you were able to read any one thing on here, can I get your vision? Because I can't. There's just so much stuff. Someone posted a little bit earlier that in 2017 there were only, I think, seven projects that were part of the CNCF, and this is where we're at now. It's a lot. So obviously I don't know about every single one of these things; I don't think that's possible, at least not for me. Maybe someone here knows about all of them, and wow, you rock. What we're going to do next is, for all of those categories we just went through, list a few open-source projects that are either CNCF graduated, incubating, or sandbox, or tools that we have specifically had some interaction with. It doesn't mean that anything else is any less valid; they totally are, but I can't talk about the ones I don't know yet. All right, so first, infrastructure. There is Cluster API, which has this cute little turtles-all-the-way-down thing going on, and there's also Velero; these two on the left are specifically focused on Kubernetes. With Cluster API, you have an existing Kubernetes cluster and you use CRDs to spin up new clusters and deal with managing them. And then there's Velero, which backs up the workloads that you have; it also backs up your persistent volumes. Then, on the other side of things, there are Terraform, Crossplane, and Pulumi.
They're big infrastructure-as-code tools, and you can use them to create a bunch of different resources, which can be Kubernetes infrastructure as well. All right, networking. We have Istio and Linkerd, which are specifically on the service-mesh side. They're more than just networking; as we were saying, a bunch of these things overlap into different categories, so they also fall under security and observability as well. There is also Cilium, which is based on eBPF. Basically, instead of Istio and Linkerd, where you end up having a second container running in every single one of your pods, Cilium interacts at the node level, so it can do things like service mesh, and also observability, etc. There's also Antrea, which implements the CNI, the Container Network Interface, as well as the Kubernetes network policy API, so it helps with network connectivity and security for your pod workloads. For observability, I feel like a lot of people have probably seen Prometheus and Grafana as two things together: you have Prometheus, where you have a bunch of metrics, and then Grafana is there for you to visualize all of that. It can be kind of chaotic for some people, such as myself, to look at just a ton of numbers inside Prometheus; I want to actually see what it looks like, how it has changed, and be able to do things with it. Then there's Kiali, which is for observability on service meshes, which is kind of relevant since we just mentioned service meshes. There's OpenTelemetry, which is a standard for integration layers for logs, metrics, and traces. There's also Jaeger, which is for distributed tracing, and there's Fluentd, which is for logging. It's quite a bit.
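As a tiny illustration of the Prometheus side: a widely used (but unofficial, and not built into Prometheus itself) convention is annotation-based scraping, where the cluster's Prometheus scrape configuration is set up to discover pods carrying annotations like these. The app name, port, and image are invented for the sketch:

```yaml
# Hypothetical sketch: these annotations do nothing by themselves; they
# only take effect if the Prometheus server in the cluster is configured
# to discover pods via this (common but unofficial) annotation convention.
apiVersion: v1
kind: Pod
metadata:
  name: hello-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"     # where the app exposes metrics
    prometheus.io/path: "/metrics" # the metrics endpoint
spec:
  containers:
  - name: hello
    image: registry.example.com/hello-app:1.0  # placeholder image
```

This is exactly the kind of glue the talk is describing: the workload has to be instrumented to expose `/metrics`, and something outside vanilla Kubernetes has to be configured to collect, persist, and visualize them.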
Yeah, it just keeps going. So, security. There's Open Policy Agent, or OPA — as you can tell, there are a lot of acronyms and a lot of long words here — so OPA and Kyverno are for policy-based control. There's cert-manager, which is very self-explanatory: it deals with managing certificates. We have Keycloak for identity and access management, or IAM. There's SPIFFE, and there's also SPIRE — I kind of ran out of room on the slide — but SPIFFE and SPIRE go hand in hand, and they're there for an identity control plane across infrastructure. There's also in-toto, which is for supply chain integrity. So if you think back to that GitSecOps slide, where I had all those little locks at the bottom of each step along the path, starting from your code through building: all of those are things you need to worry about, hardening your security all the way along the chain, and that includes making sure you have tools that can help you do that. Additionally, there are the workloads, because yes, you can have a production-ready Kubernetes cluster, but if you're not running any workloads on it, it really doesn't matter; no one's using your things. So, who here has dealt with YAML in Kubernetes? All right, cool, a lot of you. Okay, there's a lot of YAML. Sometimes you can get by with doing some kubectl things, but overall there's going to be a lot of YAML, especially once you're at the point of dealing with CI/CD, and there are tools that can help you with that. There's Helm.
There's a tool in the Carvel tool suite called ytt: instead of copy-pasting a bunch of YAML and changing a few things, you can have a few consistent templates, pull in variables from other files, and do a bunch of different things with that. Carvel also has a bunch of other tools for different ways of running your applications, things like kapp-controller, which can deal with, say, stuff you're pushing to GitHub and update things based on that; and there's a bunch more in there. There's Harbor — and if you don't have a container image stored somewhere, you don't have an application, so basically you need a registry like Harbor for that. Harbor also falls a bit into the security side, since it has image scanning. Then, on the CD or continuous deployment side, the GitOps stuff, we have Flux and Argo. And then there are also buildpacks: perhaps you're very familiar with using Dockerfiles to say, hey, I want to do all these things to create my container image, whereas with buildpacks it's: here's my Java code, here's my Go code, here's my whichever supported code, make me a container image; I don't want to write a Dockerfile. So I guess the question is: all right, now what do we do? We have all of this stuff, but we need to figure out where we are running it and how we are doing that.
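Going back to ytt for a second, here's a rough sketch of the idea (file names and values are invented; see the Carvel documentation for the real workflow). In ytt, templating lives in `#@` annotations inside otherwise ordinary YAML, while `#!` marks plain comments:

```yaml
#! template.yaml -- hypothetical sketch
#@ load("@ytt:data", "data")
apiVersion: apps/v1
kind: Deployment
metadata:
  name: #@ data.values.name
spec:
  replicas: #@ data.values.replicas
```

```yaml
#! values.yaml -- the data-values file you vary per environment
#@data/values
---
name: hello-app
replicas: 3
```

You would render it with something like `ytt -f template.yaml -f values.yaml`, which prints the resolved YAML — one consistent template instead of many copy-pasted variants.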
In 2019, which honestly feels like quite a long time ago now, Kelsey Hightower was basically saying that it is your responsibility to purchase, staff, patch, scale, and upgrade. That is a lot of stuff you have to do if you want to do all of that yourself; you need to make sure all your versions and everything are in sync, because if they don't work together, everything is just going to crash and burn. So — and this isn't the one ultimate way of separating these things out, it's our way of separating them, with little dots in between because there are obviously paths in between. On the far left side here (I don't even know if you can see my laser, but there is one), that whole section is basically: you are managing all of these things, everything from the infrastructure and the hardware — like, you have your own data center and you're dealing with all of that — you're managing Kubernetes, you're managing the security, the network, the observability, and then on top of that also your apps and your workloads. Then there is the concept of managed: you still have to deal with the apps and the workloads, and you still have to do the security, networking, and observability yourself, but the infrastructure, the hardware, and Kubernetes are managed by some sort of provider — for instance GKE or EKS or AKS, or a bunch of other offerings where they handle those things for you. Then there's what we just termed fully managed — there's a little star there because, again, it's not an official name — which on top of that adds the layer for security, networking, and observability, basically all those things we've been talking about this entire talk. That can be covered by products: this is just an example from Tanzu Kubernetes Grid, basically showing a bunch of the open-source projects they specifically picked to fill those voids in what is needed to be production-ready with Kubernetes. You probably recognize almost all of them — I think only Calico wasn't on another slide — so it's a bunch of projects working together to come to some sort of end product, and that can run either on cloud or on-prem. — All right, I'm taking over here real quick, just for this one slide, where you can see another stack of similar components grouped together in a cluster. At Novatec, we also provide training on Kubernetes for our clients, for which we have our own implementation. We're not building a product; we basically built the cluster ourselves — not exactly from scratch, we used a managed cluster deployed via Terraform on Azure — and assembled all those projects inside to make them part of the training. And yes, I know it doesn't make sense to run Istio and Linkerd in the same cluster; that is certainly for training purposes only. The point I'm trying to make is that this definitely gives us quite a hard time keeping everything in sync. Sometimes the cluster needs to get updated by the provider, sometimes the projects get updated, and we always need to validate whether the entire matrix of projects is still working together well. This is exactly what a managed product is basically trying to solve for you. So I'm handing back. — Okay, so there are a couple of choices you've got to make: do I want to be on the cloud, or do I want to be on-prem?
So, for instance, if you have a ton of bare metal sitting around, you just have all these servers, then you'll probably end up going on-prem, unless you just have buckets of money or something. If you have some sort of sensitive apps, say military, intelligence, or medical, then the question is: is there a cloud in my country that will allow me to actually run these types of workloads on it? If no, you're having to do it on-prem; if yes, go to the cloud. And in pretty much every other scenario, we're suggesting that you run it on the cloud. Then there's the question of managed versus self-hosted. For the most part, if you're on-prem, you'll probably end up having to go down the self-hosted route. If you have a super huge cluster and there's not enough support from any of the providers: same direction. If you need the newest Kubernetes features, or you want to use the latest Kubernetes version (since most providers are a tiny bit behind), or if you want to enable some beta or alpha features: same thing. And if, for some reason, you tried really, really hard, tried all of them, and nothing seems to work, you fall into that last bucket. But otherwise: try managed. To quote Jérôme Petazzoni: the fastest, easiest, and cheapest way to run your Kubernetes cluster is to get someone else to do it for you. Just a suggestion there. — All right, so we should probably highlight Jérôme a little bit more, because he also gave the name to our talk; he was actually the creator of that title.
So thanks again for your help, not only with the title but with all the video feedback. I'm trying to bring this to an end with the second bulleted-list slide that we have. In the end, why we're doing this is basically to show you that it will take quite a bit more than just putting your application in a container and writing some YAML files to deploy it. There are various options in the way you approach it, and I'll stay diplomatic here and say there is no right or wrong with any of these approaches. It's more about your decision where you want to invest the time and money, and there can be multiple drivers for either side of that decision. Sometimes it can definitely be desired to maintain the stack yourself; that certainly means you need to invest in the skills of the people doing it, but in turn you have that skill in-house, which can of course be a valuable thing. In other cases you might say: I don't need all that, I just want to focus on deploying my applications fast, and I want all this management done by somebody else. Then you go for the managed solution, and in that case you pay the price to the service provider. This is something everybody has to decide for themselves, given the restrictions and guidelines that Tiffany just described. If we can speak of a general rule that we would always suggest from our experience, it is: go with the highest abstraction possible. In the end, don't try to solve problems again that somebody else has already solved. Maybe somebody has already figured out which combinations of security, observability, and networking tools actually work together, so you don't have to redo that over and over; that's what they offer as a service. And the more you build on a battle-proven, well-tested platform, the more it enables you to spend the money and the time you have on your actual core value — which, and I might be biased here because I come from a software engineering background, is the application logic of your applications, not the configuration of your infrastructure. So in the end: if there is a managed solution that suits your needs, I would recommend going for it. If you want to set it up yourself, also try to reach out to people who can help you with that, because this landscape is changing and the technology is constantly refreshing. There are people out there who provide training, enablement, and insights that can certainly help you save some time. And I think we are at the end. If you want to, you can certainly give us feedback on this presentation. And big apologies if we missed one or the other technology that you would have liked to see on one of those slides; there is no guarantee of completeness whatsoever. It's mostly things that we have already worked with in the past, or products that we know, but of course there might be many out there that do these things in the same or better ways. If something is missing here, it certainly doesn't mean we don't think it's great; I just wanted to make sure of that. If you have any further questions, feel free to connect with us; you will find us on LinkedIn under our names, or these are our Twitter handles. And with that, I think we would like to say: thanks for listening.