Hey, and welcome to the session on kOps: recent advancements and the journey toward an open source software Kubernetes distribution. My name is Ole Markus, I'm a head architect with Sportradar, and I will kick things off with an introduction to kOps.

So kOps is what we call a cloud-aware Kubernetes installer. And what we mean by that is that kOps will first take the cloud vendor's API and provision things like instances and networking. And as the instances are built, kOps will start installing a container runtime and the kubelet, and then it will install the main components like etcd and the Kubernetes core components, which make up what we call the control plane. And then finally, it will install other vital components such as container networking and DNS. Until recently, that's been the main concern of kOps.

But what we see happening now is that more and more functionality is being split out of this larger control plane and into separate standalone components. These new components can be things like CSI drivers or image credential providers, or even the part of Kubernetes that interacts with the cloud vendor, such as the code that takes a Service object of type LoadBalancer and converts it into a cloud load balancer. The typical install instructions for these split components are "apply this large amount of YAML" or "use this Helm chart". These split components have their own life cycles, and as an operator you have to track the compatibility between the control plane and these components yourself. And that's a pretty big additional burden on cluster operators.

So what kOps aims to do is to take on these components and manage them transparently, just as if they were a part of the old control plane. And that's what we mean by becoming an open source software distribution: you have the kernel, the control plane, and in addition you have all of these smaller components around it. And of course, for those who have an existing production cluster, we want to make the transition from the larger control plane to these split components seamless. So for example, when we added support for the EBS CSI driver, the migration was completely transparent and the user didn't have to do anything. It just happens in the background, and we ensure that the right version of the driver is used based on which Kubernetes version you're running.

In addition to these very essential components, we also started supporting things that aren't essential to the cluster, but that are typically installed on every cluster: things like cluster autoscaler, the device drivers if you're running GPU workloads, the node termination handler that you typically want on AWS, and so on. Many of these addons are not that easy to install either. They might, for example, require additional permissions to the cloud vendor, and kOps will just handle all that for you. Our goal here is not to support an immensely large number of addons, but rather a few really essential ones. At the same time, we also don't want to support every feature of these components, but the ones that most people typically need, just to reduce the burden on operators.

One of the things worth mentioning here is that we have now started to support external permissions for these addons. So instead of the addons inheriting the instance's privileges to the cloud, they get their own roles and can authenticate and interact directly with the vendor. This is what is known as IAM Roles for Service Accounts (IRSA) on AWS, or workload identity on Google Cloud. So that means that in order to support, for example, cluster autoscaler, we don't have to add permissions to the instances themselves to be able to set the desired capacity of ASGs. The permissions are assigned to the addon's service account instead. And this ensures, for example, that other pods running on the same instance cannot use those privileges, and that's a pretty significant increase in security. In fact, if you are using the external Cloud Controller Manager, then the control plane instances don't really need any privileges beyond the small set that kOps itself requires in order to bootstrap the cluster.

In addition, kOps can also provision IAM roles for your own workloads. In the cluster configuration you can specify what kind of permissions a given service account needs, and kOps will provision the role and set up the trust relationships and everything.
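To give a rough idea, here is a minimal sketch of what that can look like in the cluster spec. The bucket name, account ID, and policy ARN are placeholders, and the exact field names may vary between kOps versions, so check the IAM documentation for your release:

```yaml
# Cluster spec excerpt (edited via "kops edit cluster"). Sketch only;
# the bucket, account ID, and policy ARN below are placeholders.
spec:
  # Publish OIDC issuer discovery documents so the cloud vendor can
  # validate service account tokens.
  serviceAccountIssuerDiscovery:
    discoveryStore: s3://publicly-readable-bucket
    enableAWSOIDCProvider: true
  iam:
    # Give kOps-managed addons dedicated roles instead of letting them
    # inherit the instance profile.
    useServiceAccountExternalPermissions: true
    # Dedicated permissions for your own workloads, bound to a
    # service account.
    serviceAccountExternalPermissions:
      - name: my-app
        namespace: my-namespace
        aws:
          policyARNs:
            - arn:aws:iam::123456789012:policy/my-app-policy
```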
So that's everything from me. And now I will hand over to John, who will talk about security improvements.

I'm John Gardiner Myers, a principal engineer at Proofpoint, and I'm going to be talking about some of the security improvements that have been made in kOps.

One big behavioral change in kOps 1.19 is that kops update cluster and kops export kubeconfig no longer export an admin credential by default. There is a new --admin flag one must use to do that. The create cluster command still exports a credential by default. Additionally, the lifetime of an exported credential is now 18 hours; one can supply the --admin flag with a different lifetime duration. If you authenticate to the API server with something like OIDC, you can instead use the --user flag, specifying the name of an existing kubeconfig user to use with the cluster's context. The combination of these changes should reduce the probability and impact of these admin credentials inadvertently leaking.
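In practice, that looks something like this (a sketch of the flags described above; see kops export kubeconfig --help on your version for the exact syntax):

```sh
# Explicitly request an admin credential (18-hour lifetime by default).
kops export kubeconfig --admin

# Request an admin credential with a custom lifetime.
kops export kubeconfig --admin=48h

# Reference an existing kubeconfig user (e.g. one configured for OIDC)
# instead of exporting an admin credential.
kops export kubeconfig --user=my-oidc-user
```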
We have also improved the way we provision several certificates. Previously, all certificates and private keys for a cluster were created and placed in the secret store by the kOps command line when the cluster was first created. All certificates were given a ten-year lifetime. Nodes would use their instance role or other ambient credential to read the certificates and keys from the secret store. That mechanism is still used for certificate authorities and the service account signing key, but we have made changes for client and server certificates.

Control plane nodes now use their access to the CA private keys to issue their own client and server certs upon node bootstrap. The lifetimes of these certificates have been reduced from ten years to somewhere between 455 and 485 days. That 30-day skew in lifetimes is so that nodes created around the same time are likely to have their certs expire on different days. This increases the chance that an expiration will be noticed and fixed before it happens on multiple nodes and thus becomes a service outage. The skew in lifetime is derived from the node's interface addresses, so that all the certs on a given node will expire around the same time.

kOps also has a new mechanism for provisioning certs on worker nodes. Nodeup, which runs on the worker node while it's bootstrapping, generates the private keys for the certificates that the node is going to need, depending on the configuration. It sends the corresponding public keys to kops-controller, which runs on the control plane and has access to the CA private keys. It authenticates its request using cloud-provider-specific instance credentials. kops-controller, after verifying the credentials and performing an authorization check, then issues the certs and returns them to nodeup. These certs have lifetimes of between 455 and 485 days, using the same skewing algorithm as the certs provisioned for the control plane. This mechanism is currently only implemented for AWS; on other cloud providers, worker nodes still retrieve shared ten-year certs and the corresponding keys from the secret store. Implementing the cloud-provider-specific authentication for other cloud providers is an area where we would appreciate contributions.

As of version 1.22, kOps supports graceful rotation of the CA and service account signing keys. All of the necessary components have been extended to support trusting multiple CAs. There are new commands for performing the steps of rotation: creating and staging trust in new keys, promoting the new keys into active use, and removing trust in the old keys. By default, all of the CAs and the service account signing key are rotated at the same time. There is a documented procedure for performing a rotation, complete with rollback instructions for each step.

kOps has also improved its support for using local asset repositories. Local asset repositories are for when downloading files and images directly from the internet is undesirable. The ability to use local asset repositories has been a longstanding, but obscure and poorly documented, feature. To use it, one provides in the cluster spec the locations of an image repository and/or a file repository. The problem then becomes how to get the assets into the local repositories. The previous mechanism was particularly obscure and hard to use. kOps 1.22 adds a much simpler get assets command, for either copying assets to the local repositories or getting the list of assets one needs to feed into an external process.
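Roughly, the workflow looks like this. This is a sketch: the repository locations are placeholders, and the spec field names shown here follow the documented assets configuration, which may differ slightly between versions:

```sh
# In the cluster spec ("kops edit cluster"), point kOps at the local
# repositories, for example:
#
#   spec:
#     assets:
#       containerRegistry: registry.example.com/kube
#       fileRepository: https://files.example.com/kube
#
# Then copy the required assets into those repositories directly...
kops get assets --copy

# ...or just list them, to feed into an external mirroring process.
kops get assets
```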
With these and other changes, kOps continues to improve the security of its default configuration and the ease of securing it further.

Next up, we have Peter Rifel, who will be talking about other recent advancements.

Hi, my name is Peter Rifel and I'm a software engineer at Datadog, working on our internal Kubernetes platform. I'm also a kOps maintainer, and I'm going to be talking about other recent advancements we've made in kOps.

First is our Azure support. As of kOps 1.20, Azure clusters are now supported in alpha. kOps can create virtual machine scale sets in availability zones for each kOps instance group. It supports using either public or internal load balancers for the Kubernetes API. You can specify custom VM images to use in instance groups, and clusters can run on shared networking defined outside of kOps, for example shared virtual networks, subnets, and route tables.

And this is what our Azure roadmap looks like. Azure DNS will allow the Kubernetes API to be accessed via a domain name. Blob ACLs will allow finer-grained permissions for objects in the state store. As John discussed earlier, we want Azure nodes to join clusters using kops-controller, similar to AWS. And we plan to allow Azure resources to be defined in Terraform, similar to our existing support for AWS and GCP. If you're interested in Azure support, please give it a try and subscribe to the GitHub issue for updates.

Another area of focus has been our end-to-end testing. We've recently begun testing more addons, running their own test suites to confirm functionality. We also have tests for IPv6 on AWS as we work to stabilize those efforts, which Ciprian will discuss later. Upgrades of both Kubernetes and kOps are now being tested, in addition to ARM64 and CA rotation. We recently added testing on DigitalOcean, which helped with its kOps support being promoted to beta, and we plan to do the same with Azure soon. All of this testing aims to catch bugs earlier in our kOps release process so that we can offer a more stable experience to users.

Lastly, I'd like to discuss a new feature in kOps 1.22 for Terraform users. kOps will now define many of the state store objects as Terraform resources. This includes addon manifests, static pod manifests, and nodeup config. This is primarily to resolve a race condition during upgrades, where kops update cluster --yes was updating objects too soon. They will now be updated during a terraform apply, which is closer to when the launch templates that reference these objects are updated. A nice side effect of this change is that it improves the discoverability of changes to these resources: you can preview the content changes with a terraform plan, which was previously not possible. At the moment this is only supported for S3, but we plan to expand support to GCS and other cloud providers as they adopt Terraform support.
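For reference, the Terraform workflow with kOps looks roughly like this; the cluster name and output directory are placeholders:

```sh
# Generate Terraform configuration instead of applying changes directly.
kops update cluster --name my-cluster.example.com --target=terraform --out=.

# Preview pending changes, now including the state store objects
# (addon manifests, static pod manifests, nodeup config) that kOps 1.22
# models as Terraform resources.
terraform plan

# Apply, updating those objects together with the launch templates that
# reference them.
terraform apply
```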
And next is Ciprian, discussing the future of kOps.

Hello everyone, my name is Ciprian Hacman. I'm a senior DevOps engineer at IO, and today I will be talking about the future of kOps, and IPv6 support in particular. So let's get started.

Twenty years ago, everyone thought that the end of IPv4 was near. They were wrong, and it feels like a lifetime ago, but things are finally starting to happen. IPv4 addresses are harder and harder to obtain each year, and much more expensive. For IoT, telecoms, and large enterprises, using IPv6 is no longer a preference but a hard requirement. Is Kubernetes ready for this? Is kOps? I think the answer is yes.

These days, most companies use a cloud provider for their infrastructure. Since AWS is the biggest cloud provider, the history of its IPv6 support should give a good idea of where we are today. Until 2016, IPv6 support in AWS was limited to various service endpoints. It was useful for exposing services, but not yet appropriate for large-scale setups. Five years ago, AWS added IPv6 support for EC2 instances in VPCs. That was still not ideal, as it meant working with individual IPv6 addresses, but it made it possible to experiment. Microservices pushed things forward, and this year seems to be the year of IPv6 in AWS: assigning prefixes to EC2 instances, and enabling IPv6 for services like instance metadata, time sync, and DNS, adds most of the missing pieces.

Kubernetes has also unofficially supported IPv6 since its very early days, but official support was added only four years ago, in version 1.9. And since then it had been quiet, until version 1.16, when IPv6 graduated to beta and support for dual stack was added. That was a major milestone: finally, operators could use IPv6 but also connect to IPv4 services, which allowed for a more gradual migration of the infrastructure. It took a while to iron out most of the remaining issues, and this year dual-stack support graduated to beta and was enabled by default in Kubernetes.

But what about kOps support for IPv6? Discussions about IPv6 support were started early last year by Antonio Ojea, one of the initial developers of the Kubernetes feature. The ecosystem was not yet ready for IPv6; many things were still available only via IPv4. We decided to wait a bit longer, and with the release of Kubernetes 1.21 and its dual-stack support, we decided to give it another try. A few weeks later, the cloud provider part was ready and we moved on to the networking setup. We used both Calico and Cilium in parallel, with unique local addresses, which means NAT. Quickly it became obvious that something was missing: the AWS Cloud Controller Manager was not aware of IPv6 addresses, which was a "wow" moment for us. We created a patch, submitted it upstream, and with a custom build based on it we moved forward to testing, and it was successful. Finally, we could say that kOps can work with IPv6: conformance tests were passing and everything looked great. So, coming with kOps 1.22, IPv6 will be available as an option.

What's next? Global addresses using prefixes, instead of unique local addresses, is a much better solution; this would allow clusters to communicate with the outside world without any need for NAT. A simpler setup will also be possible once AWS allows assigning prefixes via launch templates. A NAT64 appliance will also be a much-needed addition.

What are the challenges? Cloud provider support is still being developed, and we would like to see better support for this from other cloud providers, and a better API from AWS. IPv6 prefixes are still not that easy to configure at the OS level, and each OS handles them differently. CNI support for IPv6 is missing some features that are available only for IPv4, but considering the recent advancements in cloud provider support, I think this is not very far off.

As with any open source project, kOps is always looking for new contributors; this is what drives it forward. We would like to invite anyone interested in contributing to join our weekly calls and see if you like it. Thanks everyone.