Hey everyone, and welcome to my talk on Kubernetes deployment models at the edge. My name is Charlotte and I'm an engineering manager at Container Solutions. We're a cloud native consulting company, and that's also where I've picked up most of the content I'm presenting today, working on client projects. Not all of them were at the edge, but you pick up different things about Kubernetes along the way. Why am I doing this talk? There's often a big gap between "oh, we have this application and we can run it in a container, we've had it properly containerized" and "so now how do we run it in production?" This is where Kubernetes often comes in handy, but during my last project I realized this is even more true for edge computing. The gap from container, to Kubernetes, to Kubernetes at the edge gets increasingly larger. At the edge you have challenges that cloud computing might not have: limited hardware, stricter latency requirements (so the workloads will change), and different characteristics of the communication between the edge and the cloud. So this gap just gets larger and larger, and the instructions on how to move from step one to the final version tend to drift further apart. It's getting harder. This talk will hopefully help bridge that gap a little, by giving an idea of what the different options are for Kubernetes and how to go from "this is how I have an application" to "okay, this is how I can run my Kubernetes platform at the edge". Starting off with edge computing: why would we even need edge computing? Why don't we just use the cloud as it is? Since this is a slightly telco-focused conference, I'm guessing you already know this, but the edge enables lower-latency workloads to run faster and better, and you get processing closer to the end user and the device.
So you are enabling telco and other industries to provide better bandwidth and better mobile connectivity, and you're helping advance use cases such as AR, VR, gaming, IoT, faster streaming, and many others with the edge. And when I say edge computing, there are different definitions out there, so I'll run you through mine, just so that we're on the same page when I'm using terms that might not be understood the same way by everybody. If you're looking into edge computing, you'll run into mixed versions of fog computing and cloud computing and mist computing and cloudlets, and all of this can get a little confusing, because people don't always have the same understanding. I personally find that the definition separating the different versions by which network they belong to and what kind of workloads they run makes the most sense. So basically you have edge computing on the left here with the actual edge devices, which can be user equipment, so phones or laptops or cars, plus other sensors or IoT devices. From those you go to the access network, so base stations and gateways. Then there's a small overlap with the metro network, where you have the private data centers and the central offices, and this is also where fog computing already takes place. Fog computing is bridging the gap between the edge and the more traditional clouds seen on the right, the hyperscalers like AWS or GCP. You might not need fog computing in this context; it could work with just edge and cloud, where the cloud starts where the edge stops. But I like the softer boundary, and fog computing also makes you think about how you would even manage workloads that need to transition from the edge to the cloud. And that's it for these definitions. Now let's look at what an edge platform can look like.
We get this nice example from the ETSI standards group of what the expectations for an edge platform can be: the Multi-access Edge Computing, or MEC, reference architecture. This is only one version and one example of what you might want from an edge platform; it's not the definitive thing for everything, just a specific use case. But you can see you have two host machines, the one in the middle and the one on the left, and various management and orchestration pieces around them taking care of the platform itself and the workloads. This is just one of many use cases, but it's the one showcasing the complexities very nicely, so what things you might need to consider with a platform like this. It's a little too complex for this talk, so we're going to simplify it a bit. We separate the different layers: first, the underlying infrastructure, so the actual hardware, servers or VMs, including the OS. Some might be running on bare metal and some on VMs; it doesn't matter all that much for this talk, but some of the Kubernetes versions that we look at do need specific underlying infrastructure, like VMware or OpenStack, to actually run their systems, which is not something you can say for vanilla Kubernetes. Above that you have the Kubernetes layer, and above that the workloads running on Kubernetes, ranging from edge workloads like the ones mentioned before, mobile gaming, AR/VR processing and others, to general applications that are less edge-specific and could just as well run in the cloud. And on the left we have the orchestration and management of each of these three layers. We will focus on the Kubernetes layer and on the orchestration and management of the Kubernetes layer, but also touch on the orchestration and management of the gray areas, because they are often intertwined.
With OpenShift, for example, the infrastructure and the orchestration of the platform come in the same package. Not every platform is like that, but since some touch on it, this is why it's included. For more information on more complex examples and other reference architectures, you can check out the Akraino wiki, which is collecting all kinds of different blueprints and reference architectures. These are helpful for understanding the ecosystem a little and what it actually looks like at the edge. Considering a large part of this talk is about Kubernetes, I'll work through the essentials first before getting too deep into the differences. For anybody who knows all this by heart: don't worry, it's not going to be a super long part. Kubernetes is an open source container orchestration tool, which is basically a fancy way of saying Kubernetes manages your workloads, your containers, for you and makes sure they communicate properly. It was initially started in 2014, based on a Google project, and has been maintained by the CNCF since its creation in 2015. Its ideal use case is applications in a microservices architecture, and it provides various deployment, maintenance and scalability features while also providing an API that, and this is important, is extendable (not expendable) via custom resources, to allow users to add to the core functionality. This matters because it's one of the reasons Kubernetes is liked so much: you can add your own Kubernetes-native things on top while leaving the core functionality intact, so you're not breaking the actual Kubernetes by adding your own things. A lot of the distros and Kubernetes flavors that we're looking at today actually do this; they use custom resources to add their own spin to it. The architecture of Kubernetes itself works like this: you have a control plane with different pieces that are responsible for workload and cluster management and orchestration.
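As a quick illustration of the custom-resource mechanism just mentioned: you register your own API type with the cluster via a CustomResourceDefinition, and Kubernetes then serves it like any built-in resource. A minimal sketch, where the `EdgeSite` type, its group, and its fields are made up for illustration:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: edgesites.example.com
spec:
  group: example.com          # illustrative API group
  scope: Namespaced
  names:
    kind: EdgeSite
    plural: edgesites
    singular: edgesite
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                location:
                  type: string
                maxNodes:
                  type: integer
```

Once applied, `kubectl get edgesites` works just like for built-in resources, and a custom controller can watch these objects, which is exactly how many of the distros in this talk add their own behavior.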
You have the API server, basically the brain of the operation; the etcd storage behind the API server; the scheduler, to determine where workloads will get scheduled; and the controller manager, to implement one of Kubernetes' central patterns, the control loop, which ensures that the current state matches the declared state, so if something is not as it's supposed to be, it changes it. Aside from the control plane components, we have the worker node components at the bottom, consisting of the kubelet and the kube-proxy, to facilitate the actual workloads in containers on the worker nodes. In this example, the number of nodes, and the control plane being just a single node, is not an actual representation of a cluster; it's just for reference. There are various different flavors and distributions out there, so this is a very brief and highly opinionated intro to the ones we'll be touching on today. If you check out the CNCF webpage, there is a list of dozens of certified Kubernetes-conformant distros and flavors out there. Vanilla Kubernetes we already talked about: this is the upstream original version, which is where the reference architecture came from.
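The control loop the controller manager implements can be sketched in a few lines of Python. This is a toy illustration of the reconciliation idea, not the actual Kubernetes implementation; the function and state shapes are invented for the example:

```python
# Toy reconciliation loop in the spirit of the Kubernetes controller
# manager: compare declared state with current state and emit the
# actions needed to close the gap. Here state is just a mapping of
# deployment name -> replica count.

def reconcile(declared: dict, current: dict) -> dict:
    """One pass of the control loop: return the actions that would
    move `current` toward `declared`."""
    actions = {}
    for name, want in declared.items():
        have = current.get(name, 0)
        if have < want:
            actions[name] = f"scale up by {want - have}"
        elif have > want:
            actions[name] = f"scale down by {have - want}"
    # Anything running that is no longer declared gets removed.
    for name in current:
        if name not in declared:
            actions[name] = "delete"
    return actions

declared = {"web": 3, "worker": 2}
current = {"web": 1, "worker": 2, "old-job": 1}
print(reconcile(declared, current))
# {'web': 'scale up by 2', 'old-job': 'delete'}
```

Real controllers run passes like this continuously against the API server, which is why Kubernetes self-heals: a crashed pod is just another gap between declared and current state.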
Next are the minimal versions, very lightweight distros like k0s and k3s. When I say lightweight, I mean they're built as very small binaries, install quicker, use less memory, often run as a single-node cluster, and have simplified some of Kubernetes' defaults, like not using etcd by default (while still letting you use etcd) so that the default footprint is smaller. The enhanced versions are kind of Kubernetes plus, so built-in differences and/or add-ons. The Open Network Edge Services Software, or OpenNESS for short, is edge- and networking-optimized Kubernetes with a central control plane and edge nodes, so already a distributed version of it. Red Hat OpenShift is actually a suite of products, but we'll focus mostly on the OpenShift Container Platform; it's a commercial version of Kubernetes with a focus on enterprise-friendly default settings and add-ons, especially security-focused ones. The Robin.io cloud native platform is similar to OpenNESS in being built on Kubernetes, but it's also commercial, and different networking optimizations and tools are added to handle multi-cluster management. Then there are the cloud distros: Amazon's Elastic Kubernetes Service (EKS) and Google Kubernetes Engine (GKE) are just two examples, though probably the most well-known ones, from the cloud platforms out there. These are also based on Kubernetes, but with less control over the actual control plane, because that is managed by the cloud provider. You can still run your Kubernetes workloads on them as usual, but you might not be able to change everything you want on the control plane, because you don't have access to the underlying system running the control plane.
On the specific edge focus: Google Anthos is actually a suite of different cluster options; in itself it is not that edge-focused, but in the version that we're looking at it is, so we'll be focusing mostly on the GKE and bare metal clusters, because those are the ones that I've worked with. MobiledgeX is not really a Kubernetes distro, it's more of a management system, but it takes care of the cluster creation and workloads for you, so it's included to showcase some of the other aspects that matter when choosing the deployment. KubeEdge is a different take on Kubernetes, with a centralized control plane and the option to have cloud and remote edge nodes all connect to that same control plane. In a way it's similar to OpenNESS, but it is intended to be a more lightweight version as well, to minimize the resource usage at the edge. OpenYurt is a not very mature but promising alternative to KubeEdge, and it is installable on an existing cluster, with a central control plane and edge nodes, so basically turning your current cluster into an edge cluster. Now that we've seen the basic architecture of a cluster and that there is a range of different Kubernetes options out there, how do we actually figure out which is the right one for a specific use case? There are different things to consider. First and foremost, it's about the workloads or applications that are meant to run on the edge: what kind of workloads you're running and planning for, and what requirements come with those. That can range from security aspects for different tenants to requirements for specific hardware acceleration, same for sensor equipment or potential IoT use cases that might not be able to, or even need to, run full clusters. If you only need to gather sensor data and forward it instead of processing it, you might not need much processing power, so not much space, so it might not be a cluster at all. The hardware that you have available or can purchase for your edge or data center sites might be limited in
terms of performance or number, so that can potentially be a good argument for using VMs for cluster nodes instead of bare metal clusters: if you still want to separate tenants, for example, and you only have a limited number of bare metal nodes, VMs might be a good option. Connectivity and networking are another big aspect: do you have specific requirements or limitations that you need to take into account, from VPNs to specific latency, specific connections that you need to have, or specific firewalls? The existing infrastructure and ecosystem play a big role as well: if you already have licenses, existing clusters and infrastructure, they are definitely worth taking into account; very often companies don't want to switch completely if they have already built something up in one of those areas. If your team already has experience with Kubernetes, that should factor into your choices as well, or at least the potential training and support should go into the planning phases of creating a platform like this. It will also factor into the cost, same as running your own machines versus getting a pre-boxed version or using edge locations from telco operators, so basically using a hosted version instead of running it yourself. Those are considerations that a company needs to make by itself; some companies prefer to use managed services, while others tend to go for more control and flexibility, but also more predictable cost. So let's start with the first range of options: how many clusters do you need, and how do you control them? There are three different options on this slide to show the extremes; in reality you're more likely to have a mix of them instead of just a single one. Reasons to have multiple standalone clusters can be different Kubernetes versions (you might need different versions for workloads), different cluster-wide settings, or separate clusters for security requirements for tenants, especially
things that can't be namespaced or isolated with RBAC and other security facilities. On the other hand, if you have fewer clusters, you have less overhead at the edge, but also less to manage. Then again, if your central control plane goes down, that's it for you. Well, it does depend on the version, but we'll get into that. The first option on the left is having a central or admin cluster, centrally or in the cloud, which manages different clusters at the edge locations. Those can be different kinds, but usually if you have a central cluster that is made to manage clusters at the edge, it's the same version throughout, so pure Kubernetes, pure OpenShift, or Anthos; it's usually not one cluster managing different kinds of clusters. The second option is having a central cluster acting as a control plane for edge nodes, instead of having control planes at the edge as well: basically one large cluster with a control plane in a central location and the nodes either in the cloud or at the edge. The third option on the right is skipping the central cluster entirely and just using another form of cluster management, like Rancher's Fleet or proprietary management tools. This is not necessarily that different from the left one, because even for your central cluster you will still need some sort of cluster management. But if we look at them specifically: the first concept, a central or admin cluster plus edge clusters, can be seen with Google Anthos, where you have an admin cluster and user clusters. OpenShift lets you use the Hive operator to run OpenShift clusters from a single OpenShift cluster, and Robin.io also lets you do that, with its own system to create clusters from one cluster. A DIY method would be running Kubernetes with Cluster API and Metal3 to provision bare metal hosts and then adding clusters on top of that for the edge.
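The Cluster API approach just mentioned treats clusters themselves as declarative resources. A trimmed sketch of what such a cluster definition can look like when paired with the Metal3 infrastructure provider; names are illustrative and API versions may differ between releases:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: edge-site-1            # illustrative cluster name
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: edge-site-1-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3Cluster        # Metal3 provides the bare metal hosts
    name: edge-site-1
```

The management cluster's controllers then provision the referenced control plane and bare metal infrastructure, which is the same declarative control-loop idea, just applied to whole clusters instead of pods.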
For the edge extension we have options like OpenNESS, KubeEdge, and OpenYurt. The latter two are trying to have a smaller footprint for the edge nodes, while OpenNESS mostly adds functionality on top of vanilla Kubernetes, so additional resources are needed. The way the extensions work is usually custom resources extending the API and adding their own functionality, and then facilitating the potentially slower communication between the control plane and the nodes. So regarding what I said earlier about losing the control plane meaning your nodes don't work anymore: in this case there is actually a built-in system to make sure this doesn't happen, with device twins and the ability to keep the nodes operational even when the connection is lost, though it does depend on which version you're looking at; not all of them are quite as advanced as the others. The last option, with the minimal clusters like k0s and k3s, or just standalone clusters in general (it doesn't have to be the minimal ones, plain Kubernetes is on this list as well), is there because Kubernetes doesn't come with an inbuilt admin cluster system. It's the same for the cloud-based models: you're usually not managing your GKE clusters from another cluster. MobiledgeX provides a cloud-based management system and takes care of the cluster provisioning for you, but you still have to provide the underlying virtualization system so that the clusters get created inside that virtualized environment. So that's it for these considerations. If we look at the infrastructure and the cost: the left-to-right arrows don't necessarily mean good or bad, or more or less; the left end is proprietary, the right end open source, and the middle ones don't mean half-half, they mean pieces are open source or they're based on open source systems. So you can see all the open source ones on the right, and something like OpenShift, for example, has so many operators and so much tooling that is open source, same for Anthos,
that you can consider it a strange hybrid of having to pay license fees but still having a lot of open source in there. So these are things to consider with the cost: you don't only have the infrastructure and hosting cost, but also the cluster operation, and then potential licensing and support costs. While you do get additional costs, you also get additional support with the proprietary versions, so this is more of a trade-off than a good or bad thing. Another aspect of the cost is how much you actually need in terms of resources. This depends on your workloads, of course: if you have gaming, for example, you will likely need larger nodes, GPUs and/or other enhancements for those nodes to make sure that the workloads are actually being processed properly, so the smaller-footprint versions probably won't be for you. But this is also something to take into account if you don't have many resources at the edge: you might not be able to run a full Kubernetes cluster next to a sensor, but k3s would be able to run as a single-node deployment. OpenShift, for example, is not the best choice for a small footprint, because it comes with so many things added to your Kubernetes cluster that you might not want to use it for something small, unlike k3s, which can also run on a Raspberry Pi. KubeEdge and OpenYurt are also focused on providing a smaller footprint, especially on the edge nodes; they don't change much about your control plane or central cluster in the cloud, but they do change a little on the edge node, by replacing the kubelet with a smaller version, for example (I think that one is just KubeEdge). For manageability: the more Kubernetes-API-conformant one of these options is, the easier it will be for you to switch from one platform to another. If you're running something on pure Kubernetes, you should be able to switch to all of the available options here, but it might not always work the other way around. So for example,
with OpenShift-specific YAML or resources, you might not be able to just go left to right back to Kubernetes. Robin.io bundles workloads in a different way than traditional deployments on Kubernetes, so that might also not be the easiest one to switch away from or switch to. MobiledgeX is fairly Kubernetes-API-conformant but also adds some workload connectivity requirements for their SDK. GKE and EKS are, from a workload side, very conformant, but you won't be able to do a lot of things with the control plane that you might be able to change with the completely open source versions on the right. Then managed versus self-hosted: this also ties into the cost, but you basically have the completely managed versions of GKE or EKS in the cloud. MobiledgeX has a control plane reachable by you but not hosted by you, so that one is taking care of your infrastructure. OpenShift is kind of a hybrid, because you have a managed version but you can also self-host, so this is kind of interchangeable; same for Anthos and Robin.io, which connect back to the vendor but are still self-hosted by you. And then you have all the completely open source ones, which are generally self-hosted, even though there are many providers out there that basically do Kubernetes hosting for you as well. For flexibility and integration, you have a choice between things that are a bit more standalone or fit into a larger ecosystem. OpenShift, for example, comes with much tooling for install, management, add-ons and integration, but at the same time it is fairly limited to OpenShift, so you won't be able to create a normal Kubernetes cluster with OpenShift Hive. For everything in the middle, there is the open source component, so you have the community around it, people contributing, and all the tooling that comes with it, which often works with the other ones as well. This is a nice piece: the community tools generally tend to work
as long as the Kubernetes API is the same. For GKE and Anthos, the larger ecosystem in this case just means there is already pre-built integration with the cloud, so you don't have to worry about that piece, but you don't get much more than that in terms of management. On-prem versus hybrid versus cloud is not that different from managed versus self-hosted, but it is a little more about how you actually run a cluster across those lines, so basically the fog version of it. KubeEdge and OpenYurt are more open to having nodes inside the same cluster but across the lines, so nodes on-premises and nodes in the cloud; GKE is completely cloud-based, Anthos is, same as OpenShift, the hybrid version, and everything on the left is pretty much completely on-prem. While you can of course integrate workload communication with the cloud, it's not as pre-integrated as some of the other ones. Additional considerations: a lot of this is often managed with GitOps. You've probably heard the buzzword and it's coming out of your ears by now, but the idea is having a declarative, auditable setup and facilitating easier rollbacks by keeping the current state of your environment in Git, with some sort of control loop that makes sure that what you have in Git is the version that you have in production and/or development and/or test, whatever your environment is. It makes it a lot easier to ensure that you're actually managing these things consistently and correctly. Keep that in mind especially with edge nodes: you don't want some imperative approach where you have to apply an Ansible script, go there and change something, and then maybe something changes on the way but you never find out, because you never log on to that edge node. On to the "upskilling people is better than vendor lock-in" part. This is something that I'm a firm believer in, so while it might be
easier now to just go for the supported version, generally it is better to teach people how to manage your systems, how to go on from there, and how to deal with it yourself, because while support will always be helpful, it might not always be helpful in time. But then again, I guess this is a question of how much money you can put into that. The maturity of tooling is another important aspect: most of the options we look at today are not that old, from one to four years I would say most of them go, so the maturity, especially of the open source ones, can be not quite where you would expect it for an enterprise environment. And the last one: workloads, workloads, workloads. I think this comes with everything we've seen: there are various different options out there, none of them inherently better than the others. It depends on what you are actually running, what kind of workloads you need, and what your platform needs to provide for that. So, running Kubernetes in production: "it will be easy", they said. I hope this helps a little to clear up the different aspects of what to consider when you are running your own Kubernetes platform at the edge. If it doesn't, thanks for listening anyway, and ping me at tinydata42 on Twitter for comments, questions, complaints, concerns, whatever you can think of. Thanks everyone.