All right, everyone, let's get started. So thank you so much for attending our talk on custom resource definitions, also known as CRDs. It's great to be here in Shanghai. Let's start off with some introductions. So my name is Sam Gunnaratna. I'm a software engineer working for Pivotal from London. My name's Ed King. I work alongside Sam, also from Pivotal and also from London. Great. So there are a lot of talks about CRDs at this conference. It's sort of a hot topic in the community at the moment. So let's just set the scene about what we'll be talking about in this talk specifically. So Ed will start us off just talking about our motivations for giving this talk. Then, since this is a beginner's talk, we'll cover some basics about the Kubernetes API, custom resource definitions, and operators. Then we'll move on to talking about what it means to build on top of the Kubernetes API for things that are not Kubernetes workloads. And we'll talk about the pros, cons, and some of the considerations for using the Kubernetes API for some things that are not so obvious. All right. So I'd like to just take a few moments to run through some of the motivations that we had for flying all the way out here to China to come and talk to you today. And I hope that this will help you to understand exactly where we're coming from and also to better frame the rest of this talk. So this all started about six months ago, when Sam and I were assigned to a new team. And we were tasked with building a brand new system component that would eventually integrate into part of a bigger product that we were going to be shipping. And so we went off, we did some research, we spoke to some users, and we eventually came up with an idea for what we might like to build for our alpha. And the question then became, OK, so what technology should we use to build this alpha?
Now, traditionally, we probably would have just reached for a classic web server, probably written in Golang with a backing relational database. And that would have been great. However, at the time, we were very aware of a lot of noise coming out of the Kubernetes community around the Kubernetes API. And specifically with regards to extending the API by use of these things called CRDs. And so our little ears pricked up, and we thought, OK, what might it look like for us to build this system component on top of the Kubernetes API as opposed to a more traditional web server with a backing relational database? And so we went off again. We did a lot of research. We spoke to some of our colleagues. We took a look at what was going on in the community. And we learned quite a lot of stuff. And so really, the idea for this talk is we just would like to go through some of our learnings and perhaps some of the pros, cons, and considerations for what it might look like to build something on top of the Kubernetes API. So that if you are yourselves asking the same question, like should I be doing this, hopefully by the end of this talk, you will have a better idea as to whether or not that's something that you might want to look at further. OK, so let's take a quick step back and talk about the Kubernetes API from a high level. So Kubernetes is API-centric. What do I really mean by this? When I first came to Kubernetes, I came from thinking about wanting to deploy workloads, wanting to use Kubernetes as an orchestrator and a scheduler for my containers. But the more I came to understand about Kubernetes, the more I realized that one of the reasons that it is such a good scheduler is because it has this API with a set of rich features that allows the Kubernetes ecosystem to evolve, and allows features and APIs to be added at an extreme rate by companies all across the world. And it's really the Kubernetes API that facilitates this.
So let's have a little look at some of these high-level properties now. The first is that the Kubernetes API is declarative. So this is the what versus how approach. In an imperative, procedural way, you might say, I would like to go to node A and start one pod, and then I'd like to go to node B and start another pod. You're telling the system how to achieve something. With a declarative system, you say what you want to happen. So you say, I want two pods. And by the way, don't have those pods run on the same node. And there's been an evolution over time of how you treat infrastructure, moving from procedural to declarative. So you might have written bash scripts 10 years ago to manage your infrastructure. And then you might have chosen something like Chef or Ansible to manage your infrastructure. And then things like Terraform came along, which is a purely declarative way of defining infrastructure. And Kubernetes takes that to the next level, being declarative but also, as you'll see, constantly reconciling that state of the world. So next is that the Kubernetes API has this clear separation of state. There are two pieces of state. One is the desired state that is specified by a user or another system. The other is the observed state of the system. So you have this clear separation that's uniform across the Kubernetes API. And that allows for ease of reasoning about the system. Next up is that the Kubernetes API is level-based rather than edge-based. So what do I mean by that? This is terminology that comes from systems programming and electrical engineering. And it's about how you measure the changes on a wire over time. In an edge-based system, you notice the individual changes, and you keep a track record of those changes. In a level-based system, however, you just periodically poll and check the value. And so level-based systems are more forgiving for distributed systems.
For example, if your component crashes in an edge-based system, when it restores, it has to understand the history of all the changes that have occurred over time. Whereas a level-based system just reads the current value and makes it so. So really, it allows for less error-prone components to be written. And as I said, it's more forgiving. So next up is that the Kubernetes API is transparent by default. By writing resources to the API, you have a single control plane for everything. And, permissions allowing, all of your components can access all of the resources from other components. So this allows for composability between teams. Controllers can watch other teams' resources and then extend the functionality. All right, so those are some of the high-level features that make the Kubernetes API really, really interesting. What I'd like to do now is just dive down a little deeper and walk through a typical example in order to take a look at some of the core components we're going to need to know about in order to understand the rest of the talk. And we're going to go with the classic example of creating a pod. And I'm sure that some of you are probably very familiar with this, so I'll try to keep it quick. I just want to make sure that we're all on the same page so that we have the terminology to understand the rest. So at its core, the Kubernetes API is a RESTful API backed by an etcd data store. And there are a bunch of resource objects that are available on that API. So these are things like your pod, node, deployment. And these are the objects that users of the API are able to interact with and to create, update, and delete. And as a user of the API, it's up to me to declare my desired state of the world. As Sam said, the Kubernetes API is entirely declarative. And so for this example, I might say something like, I want one pod to be running. And the way that you do this is by posting YAML into the API. And typically, this is done with a `kubectl apply -f`.
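A pod manifest of the sort being posted here might look something like this (the name and image are just illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
```

Saved as, say, pod.yaml, you would post it with `kubectl apply -f pod.yaml`.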
Let's take a quick look at that YAML. We tried really hard not to get any YAML into this presentation, but that is a really hard thing to do when you're talking about the Kubernetes API. So we'll try to keep it brief. The important points to note here are the kind, in bold. So here, we're specifying the kind of resource object that we're interested in. And then also the spec, where we are declaring our desired configuration for that object. And so we take this YAML, and we post it into the API. And it goes and gets stored in the etcd data store. And so at this point, I need to introduce the next really important component in the Kubernetes system. And that is the controllers. So the Kubernetes API is accompanied by a number of controllers. And each controller is watching for any changes to objects of a certain type. So for example, this controller in the top left may be looking for any changes to any object of type pod. And when it notices a change, in this case, that a new pod has been created, it gets notified about it. And it then goes through a process that's known as reconciliation. So essentially what happens here is the controller will go and grab the desired state of the world. So in this case, the new desired state is that we want one pod to be running. And it will then go and look at the current state of the world. In this case, zero pods are running. And it will then take whatever actions are necessary in order to converge those two states such that the desired state is equal to the current observed state, i.e. that one pod is running. And then once this has been done, it will typically talk back to the Kubernetes API to update that original object's status. So it might say something like status: running true, or whatever it might be. And this is really important because this is how other components that are interested in this object are able to find out its current state. So we've got our resources.
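The reconciliation loop just described can be sketched in Go. This is a toy model only: there's no client-go, no informers, no work queues, and the type and function names are made up for illustration. It just shows the level-based idea of reading desired state, reading observed state, acting to converge them, and reporting status:

```go
package main

import "fmt"

// PodSpec is a stand-in for the desired state a user posts to the API.
type PodSpec struct {
	DesiredReplicas int
}

// Cluster is a stand-in for the observed state of the world.
type Cluster struct {
	RunningPods int
}

// reconcile compares desired and observed state, takes whatever
// actions are needed to converge them, and returns a status string
// of the kind a controller would write back to the API.
func reconcile(spec PodSpec, cluster *Cluster) string {
	for cluster.RunningPods < spec.DesiredReplicas {
		cluster.RunningPods++ // "start a pod"
	}
	for cluster.RunningPods > spec.DesiredReplicas {
		cluster.RunningPods-- // "stop a pod"
	}
	return fmt.Sprintf("running=%d", cluster.RunningPods)
}

func main() {
	// Desired: one pod. Observed: zero pods. Reconcile converges them.
	cluster := &Cluster{RunningPods: 0}
	status := reconcile(PodSpec{DesiredReplicas: 1}, cluster)
	fmt.Println(status) // prints "running=1"
}
```

Note that reconcile only ever looks at the current levels of the two states; it never needs a history of events, which is exactly why a crashed controller can pick up where it left off.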
We've got our controllers. How do custom resource definitions fit into all of this? Well, essentially, custom resource definitions, or CRDs for short, allow users of the API to define their own custom resource types. So perhaps I don't want to create a pod or a deployment. Perhaps I want to create something more exciting. Let's take the example of a lovely piece of cake. Everybody loves cake. What if I wanted to create, update, and delete cakes on my Kubernetes API? This is what CRDs allow us to do. And so in this example, we might say, OK, actually, I want a kind of cake. And in the specification, I'm saying that I want a Victoria sponge type of cake. And the nice thing here is that I can take this declaration, this YAML, and post it into the Kubernetes API in exactly the same way and using the same tooling as for any other resource that would exist on that API. And so I could post that off. It would get stored in the etcd database. But there's now a problem, because by default, none of the default Kubernetes controllers know anything about my cake types. And so at this point, nothing's going to happen. And what this means is that, as the author of the CRD, I also need to bring along my own custom controller and plug this all in. And this new custom controller will be watching for changes to objects of type cake. It will know how to reconcile the world upon any changes to those objects. And this is really at the core of the extensibility of the Kubernetes API using CRDs and your custom controllers. And so you could imagine that you could keep going, building up your own custom types and custom controllers, such that it looks a little something like this. OK, so let's have a brief look at the timeline of extensibility in the Kubernetes API. So around, I think, Kubernetes 1.2, third-party resources were introduced. And this is the first time that you had a way to add new things to the Kubernetes API. So shortly after, CoreOS introduced this thing called the operator framework.
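To make the cake example above concrete, the CRD and a corresponding custom resource might look something like this (the group name `bakery.example.com` and the field names are invented for illustration; the `apiextensions.k8s.io/v1beta1` API was current at the time of this talk):

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: cakes.bakery.example.com
spec:
  group: bakery.example.com
  versions:
  - name: v1alpha1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: cakes
    singular: cake
    kind: Cake
---
apiVersion: bakery.example.com/v1alpha1
kind: Cake
metadata:
  name: my-cake
spec:
  type: victoria-sponge
```

Once the CRD is installed, the Cake resource can be applied, listed, and deleted with the same kubectl tooling as any built-in type.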
And so really what they realized is that Kubernetes is great for running workloads, especially stateless workloads. When things have state and are stateful, you need to concern yourself with the day-two operations, like taking backups, keeping the thing running, making sure things have quorum, that sort of stuff. And so CoreOS developed this framework that allowed you to extend the Kubernetes API to run stateful workloads. A great example is an operator for MySQL. That operator's job is to replace a human operator who might have traditionally looked after that MySQL cluster. This operator will ensure the cluster is alive and healthy and take backups and that sort of stuff. And this has proved an immensely popular pattern in the Kubernetes community. So the Operator Framework is one framework for actually doing this, but now there are more, such as Kubebuilder. So in 2017, CRDs were introduced as a replacement for third-party resources, to have even closer alignment to the Kubernetes native resources and to be more powerful. And then after that, we started to see bigger systems being built on top of CRDs. So a good example of this is Knative, which is an entire new platform and set of abstractions on top of Kubernetes. Another great example is Istio, which allows you to get an entire service mesh on top of Kubernetes. And I think Istio has something around 23 or so custom resource definitions that it's made up of. So next, this is something that was very interesting. It was that AWS released an operator. And this was special because it didn't act on resources within your cluster. Instead, it acted on resources in AWS's public cloud. So as a developer, I created a custom resource asking for infrastructure, and the operator would go to Amazon's public cloud to provision that. And this is an interesting example of the operator pattern being used not to manage workloads in your cluster but to manage something completely different.
And so this really got us thinking, what is the next step of this transition for CRDs? So just in the near term, CRDs are hoping to go GA in the next couple of releases. There's also a movement towards using CRDs to replace some of the native resources in core Kubernetes. So yeah, we wanted to spend the rest of this talk thinking about where this might lead. Cool. So let me paint you a picture. For the next part of this talk, we're going to be running through an example service and taking a look at what it might look like to build something on top of the Kubernetes API. And the example that we're going to go with is that of a home automation service. So we're going to imagine that Sam and I have a hugely successful home automation company. And we've got a really great home automation service that's used and loved by thousands of people all around the world. And before taking a look at what it might look like to build that on top of Kubernetes, let's just imagine what it may look like today. And perhaps it looks something like this. Perhaps we've gone with a classic microservices architecture, whereby some user of the system posts some JSON into some front end, some gateway. And the microservices go off and do whatever they need to do. So let's take an example here: perhaps this is what it might look like to switch on some lights in our kitchen. So we take this JSON, we post it in, and the microservices go off and do their thing. And because we're a hugely successful company, we have hundreds and hundreds of developers co-located all across the globe. And so it's quite likely that each service is going to have a corresponding team who are going to be responsible for developing that service. And really, each team's main focus is that they want to be delivering the absolute highest-quality lighting service that they possibly can. And so let's draw some team boundaries around each of these so that it looks a little bit like this.
And then we can also imagine that each service is going to need some data. And so each service is probably going to need a backing database. And so it's going to look something like this. And then we're going to imagine that we really care about high availability. And the reliability of our service is really, really important. If this goes down, users are going to be locked out of their homes. And it's going to be a disaster. So let's add in a whole bunch of high availability to each of these services. But we know that nothing is ever perfect and things do go wrong. And so when things do go wrong, we're going to need to make sure that each of the services has a whole bunch of retry logic baked in. And then finally, we're probably also going to need some sort of service discovery mechanism so that each of these services can find the others in order to do their job. And so actually, I look at that, and I think that there's a lot going on there. And importantly, there's a lot of stuff going on that actually is distracting each of the teams away from delivering the features on each of the services. And that's really the thing that they want to be doing. So there's a lot of distractions. Each team has to think about their storage, the backing databases, and all of the operational overheads that come with that, the high availability, the reliability, the API contracts between all of the services. All of this is just taking time and effort away from the main focus of each team, which is developing the best service that they possibly can. And so maybe there's a better way. OK. So we very well could have written this service with these microservices. And then Kubernetes would have been a great tool to deploy these services into. We could have created deployments and pods and run this service in Kubernetes. However, given that we now know that the Kubernetes API has all these great features, could we actually build our service on top of the Kubernetes API?
So not using Kubernetes as a container scheduler, but just using its API to solve this sort of business problem that we have here. So let's go through that example now. First of all, each of the teams wouldn't be responsible for creating a RESTful microservice. They'd be responsible for creating a controller instead. And we wouldn't have a gateway. Instead, we'd have our controllers watching a single Kubernetes API. And where before our teams had to manage their own databases, now they can rely on the Kubernetes API to store their data via CRDs. So in this world, the user might express intent to the system declaratively, saying, I want the kitchen lights to be on. And so they could post this into the Kubernetes API, and they might have a resource that represents a room, for example. And I have the specification for what I want the desired state of the room to be, in terms of the lights, here. So you can see I want two lights to be on, with different brightnesses. So that could be posted in, and we would have a controller, owned by the team that's responsible for that, watching the API. It would be notified. It would start to reconcile that request. It would look at the system, and what would it do? It would create some light resources. And those light resources would represent the individual actions for each light bulb in our room, wherever it is. So the light resource just has a single spec requesting the brightness. And then this would be picked up by our lights controller. And this is the boundary between our two teams in this example. The room team just creates this light resource, and then it's another team's responsibility, through its controller, to satisfy that request. So the lights controller then would go and look at the actual state of the world, go and talk to the LED bulbs wherever they are, notice that they are not on, turn them on, and then update the status in the Kubernetes API.
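The room and light resources being described might look something like this (the API group `home.example.com` and all the field names are invented for illustration):

```yaml
apiVersion: home.example.com/v1alpha1
kind: Room
metadata:
  name: kitchen
spec:
  lights:
  - name: counter
    brightness: 80
  - name: ceiling
    brightness: 100
---
apiVersion: home.example.com/v1alpha1
kind: Light
metadata:
  name: kitchen-counter
spec:
  brightness: 80
status:
  brightness: 80
```

Here the Room carries the user's intent, each Light is created by the room controller, and the Light's status is filled in by the lights controller once it has actually talked to the bulb.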
So you can see here at the bottom, it's set the status to reflect the spec, because it has successfully reconciled the desired state, and now the observed state matches. So then our room team's controller would be notified of this change, because it's a parent of these light resources. It created them. It has an interest in what's going on with them. And then it would update the room resource with the status of the lights. And this is important because the user, or whatever system component created the room, maybe doesn't care so much about the internal resources used to accomplish the request. It cares about this abstraction. And so it can just look at the room resource to see the status of the request. And obviously in this world, if one of our controllers goes down, for example, then it's decoupled from the API. You can still post room resources, post light resources. And then when the controller comes back up again, it can just reconcile. And because of the API being level-based, that will just work. All right, so that's perhaps a fairly contrived example. I'm not suggesting that you should immediately go out and rewrite a home automation service on top of the Kubernetes API. But maybe I am. Let's see. If you are thinking about doing this sort of thing, there are going to be some pros, cons, and considerations. And so for the rest of this talk, we're just going to be walking through some of the main things that popped up during our research for this. So I'm going to talk through a few of the more purely technical considerations. And then Sam's going to talk through some of the wider considerations around this as well. We're going to kick things off by talking about storage. And really here, I'm referring to this idea that if you are developing on top of the Kubernetes API, you essentially get the backing etcd data store for free. It's just there. It's part of that API control plane.
And it means that it's probably owned and operated by whoever is responsible for keeping the Kubernetes API up and running. And so as a developer who's developing on top of that API, this is hugely enticing to me, because it means that I no longer need to worry about any of the operational overheads that are associated with managing your own database. But of course, there's no such thing as a free lunch. And there are some other considerations around here. It's not quite that simple. One of the main things that popped up while we were looking at this is that etcd is not a relational database, which means that some of the relational features we've come to know and love aren't available. And the example that we tend to give is, what if you needed to do a big join of your data? It's just not really possible. And so I guess the key thing to note here is just make sure that your data model is suitable for running in this new model. Loosely related to the storage is this idea of high availability. So when deployed in a multi-master configuration, the Kubernetes API is highly available. And again, that's highly available just for free. And this is, again, really, really enticing to me, because it means I don't have to worry about the availability of my API. It's just highly available for free. And I then have more time to focus on writing my services and shipping the features that I care about. There is another aspect to this idea of high availability. I think Sam touched on it earlier when we were talking about the high availability of the controllers themselves. And so there's maybe a question of, hey, should I be running my controllers highly available?
And I guess the thing to note there again is that in this level-based system, perhaps it's actually OK if your controller goes down for some period of time, because all that's going to happen when it comes back online is it's going to read the desired state of the world, trigger its reconcile function, and away you go. And the last thing I want to talk about is performance. So again, the performance of your API is going to be largely dependent on the underlying etcd data store. And there have been some concerns around the performance of etcd in the past. I guess the key worry here is that if you've got a particularly noisy service, that could potentially have a very negative impact on the performance of not only your system, but also any other system that is running on that Kubernetes API. And so we've got to be a little bit careful. I guess the general advice is, if you're developing something that's going to be slamming the database quite a lot, maybe a metrics thing, or if you're doing something like dumping your logs in the database, probably not a good idea. But there is a big conversation going on in the community at the moment as to whether or not you should just be running one huge Kubernetes cluster, or if perhaps it's better to run lots of little ones. And if you're of the mind that lots of little ones is the way to go, then perhaps these sorts of considerations are actually not so bad. It's even possible that you may just have a separate dedicated cluster purely for these extensions. You may not even enable the scheduler on that, for example. So there's lots of things to think about. The final thing I want to say about performance is that the API machinery folks are thinking about scaling targets for CRDs themselves. And there's a survey out there. If you're interested, please go and fill that out.
So if we want to take advantage of all of the benefits of Kubernetes being declarative and eventually consistent for the needs of our own service, then we need to put in some work to actually achieve that. One of those things is that we can't just take our microservice and turn it into a controller very easily. It's a completely different paradigm of writing software. Your service is probably a RESTful service, and it's request and response, talking to a database, maybe talking out to some other services through RPC or more REST calls. However, writing a controller is writing a reconcile function that is watching an API and converging state. It's a very different way of thinking and writing software. And as Ed was saying, not everything fits. The Kubernetes API is opinionated, and for good reasons. But if what you're trying to model doesn't fit, so you do need that request-response synchronicity, or you can't map your domain as resources, or you have a very complex domain model, then it's going to be an uphill battle, and you're going to be fighting the API to model it in Kubernetes. Next up, and this is particularly interesting for us: what we really want to do is collaborate the way the Kubernetes community collaborates. We want CRDs to be the standard interface between our teams. In the microservices example we saw earlier, each team had to define their own API, and maybe you have company standards for that API. We also had to have a service discovery system so that we could discover all the other APIs available. In Kubernetes, the CRD is the interface between our teams, and what CRDs are available in the cluster is the discovery mechanism. And so this leads to interesting collaboration patterns, where we can have different teams responsible for different CRDs, and maybe independently another team can monitor the status of one team's resources and extend the functionality through this mechanism.
So it allows for real loose coupling of interactions between teams. And finally, we don't really have time to go through too many of these, but there are a whole bunch of other API features that you probably want to consider and research if you're thinking about doing something like this. I'll just talk through a couple. So binary data, for example: you should not be using etcd to store binary data. If you do need this, this might not be the right approach for you. Alternatively, you could upload your data to something like S3 and then just refer to it in the specification of your custom resource. Next, Kubernetes supports defining relationships between resources, and there are two main mechanisms for doing this. The first is an owner reference, which allows you to set up parent-child relationships between your resources. And this gives you properties such as: if you delete the parent, the child resources will be cleaned up. The other common one is adding labels to your resources. However, it should be noted that managing the integrity of those labels is up to you as the controller writer. So you have to ensure that those labels make sense in a changing system. Next, versioning is interesting. So custom resources allow you to have multiple versions installed at the same time. So you can imagine that teams are writing new versions of their custom resource, but other teams are relying on older versions. So there are some backwards compatibility features there. And now in Kubernetes 1.15, there's a feature that allows you to migrate from one custom resource version to another. And finally, since we have this one control plane, the Kubernetes API, and everything is represented as CRDs, we can take advantage of any tooling that exists for this control plane. So kubectl will work, any UIs that are developed will work, and the way you list everything in your system will be ubiquitous. So all tooling will be the same.
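As a sketch of the two relationship mechanisms just mentioned, a child light resource from our hypothetical home automation example might carry metadata like this (the group, names, and UID are all invented; in practice the UID must be the parent object's actual UID as stored by the API):

```yaml
apiVersion: home.example.com/v1alpha1
kind: Light
metadata:
  name: kitchen-counter
  labels:
    room: kitchen        # label-based relationship; keeping it accurate is up to your controller
  ownerReferences:       # parent-child relationship; enables cascading cleanup
  - apiVersion: home.example.com/v1alpha1
    kind: Room
    name: kitchen
    uid: d9607e19-f88f-11e6-a518-42010a800195
    controller: true
```

With the owner reference in place, deleting the kitchen Room causes the garbage collector to clean up its Lights; the label, by contrast, is purely informational.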
So I guess the question sort of remains: is it a good idea to use CRDs outside of the scheduler? To CRD or not to CRD? So if you are wanting to run complex workloads that maybe are stateful on Kubernetes, the operator pattern seems to be the emerging technology and way to do this. And as CRDs get more features, they're going to become even more applicable to other problems, such as the home automation system we showed here. And I think we'll see maybe an explosion of experiments in this way in the next 12 months or so, to see what things we can build with the Kubernetes API itself. But just be careful of jumping on the bandwagon whenever any new technology comes out. There's a whole exploration of what is possible. So I hope this talk has helped with some of these considerations and things you should be thinking about. Yeah, and so I think one of the main things that we learned in looking at all of this over the past six months or so is that people have very strong opinions about this, right? Some people think this is a great idea and you should go all in and it's going to be fantastic. Others are more concerned. They just stay away. And so whenever in doubt, we like to look to Twitter to see what people are thinking. There was actually a good tweet the other day summing this up, which was this one here. And I guess the takeaway is, yeah, this is a new technology. I think people are getting excited about it. And I think quite rightly so. I think there is a lot of potential in this new way of working. And I'm really, really excited to see where it's going to go. But at the end of the day, we need to remember not everything has to be a CRD. And with that, I think that that's all we've got time for. So thank you very much. I don't know if there's time for questions. Perhaps if anyone has any. Or perhaps just come speak to us afterwards. Oh, yes. It's there. That was a very interesting talk. Thank you.
So I have a question. So if I'm making a cluster-wide CRD, and I want to have only one instance of that CRD on my cluster, is there a pattern to do that? Like a singleton CRD. The answer is, I'm not sure. So there are quotas, which have been introduced in the newer versions of Kubernetes. But I'm not sure there are cluster-wide quotas. I don't know if anybody knows. I'm seeing shakes of heads. So I don't think so. You can set quotas within a namespace, but not cluster-wide. But not today. Any other questions? Do those CRD controllers only work when deployed on the master node? Or, my question is, how do those controllers scale? Yeah. Yeah, so actually, your controllers don't even need to be running in Kubernetes. So for example, when we're developing these controllers, one pattern we see is that you run your controller on your local developer machine. And you just have to point it at the cluster and make sure that you have access to the master API. So you can actually run these controllers from anywhere. I think the typical pattern is that you would run them inside the cluster. But there's nothing that says that you have to do that. Yes, maybe? So I have a question about multiple versions of your CRD. So, not sure in your case, how will you support multiple versions of your CRD? For example, in your case, team A wants the lighting service at, say, version one. But the other team sees version two. How do you manage the relationship between the multiple versions? The other question is, how do you validate your CRD schema? How do you make sure all the CRD objects are correct? Answering your second question first. So you now have structural schemas for your custom resources. So you can ensure that the custom resource has the correct structure. But then I guess it's up to your controller to do validation.
And you can do this via an admission controller, so that you can reject requests before storage if they break any of your business logic, that sort of stuff. Question again? Ah, OK. So I guess you're always going to have to have communication between teams. So the question was, how do you manage having multiple custom resource definition versions and multiple teams that rely on different versions? So I think CRDs allow you to have multiple versions installed. But you still are going to have the problem of needing to know, to some degree, which teams are consuming which version. Because you don't want to keep your old CRDs around forever as a team. You want to know when you can deprecate them. And I don't think there's any sort of technology solution for that. It's sort of a team collaboration problem. So you mean, for example, I have two versions of the light service. So I have no way to force team A to only use version one, other than by collaboration between teams. Is this the case? Using the newer versions of Kubernetes, you can migrate from one version to the other. But the other team has to be aware of that, so that you don't break them. Or you have additive CRDs, so that they can still rely on the older versions. So I want to know if there is any way to force, for example, team A to rely only on version one, and team B on version two. Is there any way to force this? Not that I know of, in terms of forcing different teams to use different CRD versions. Not that I know of. OK, thank you. All right, well, thank you very much, everybody.