Okay, so it looks like we're at a good point to start. All right, good afternoon, everybody. My name is Illya Chekrygin. I'm a founding engineer at Upbound, and I'm also a lead maintainer on the Crossplane project. It's an open source project; we started it at Upbound about eight months ago, and we open-sourced it right before KubeCon in Seattle last December. Today I want to talk to you about extending Kubernetes to orchestrate and support resources managed by public cloud providers. So, over the last decade, we have witnessed the emergence of cloud computing as the predominant IT paradigm. Cloud computing enables organizations to focus on their core business competencies and quickly respond to changing demands without expending significant resources on infrastructure and maintenance. Organizations can instantly take advantage of world-class infrastructure services, and they can do so at the scale of their global businesses with an efficient pay-per-use model. Yet despite its predominance, cloud computing remains completely under the control of a small set of cloud providers. Amazon, Microsoft, and Google are at the front of the race and compete aggressively for market share and talent. And while these providers are themselves heavy adopters of open source technologies, cloud computing remains predominantly proprietary and closed source. Each cloud provider offers a walled garden of proprietary services designed to lock customers in and maximize utilization of that specific provider's infrastructure. And while your own services, the ones you write, typically run in VMs, containers, or serverless functions and are otherwise portable, they frequently depend on platform resources like databases, message queues, big data, AI, and machine learning: the very resources that cloud providers offer as managed services.
And what makes managed services so appealing is that you don't have to worry about tasks like provisioning, deploying, scaling, disaster recovery, and backups; in other words, all the tasks that are required to run production-grade services. The cloud providers take responsibility for all those tasks, and they even give you an SLA to guarantee the uptime and delivery of your services. In return, you typically pay above and beyond what normal hosting charges are. Effectively, you could run the same services yourself on the same cloud provider and save some money, but that additional markup is the price specifically for managing those services. Typically, when cloud providers adopt open source technologies, they will offer them as the same services. For example, MySQL is offered by virtually every cloud provider in some shape or form. However, the service name and wire protocol could be the only common things: the way you provision, scale, and maintain that service can be specific to a given cloud provider. And when you look at the overlapping set of services provided by different cloud providers, you can understand why there is an ever-growing demand for being multi-cloud. What is missing is a control plane that spans multiple cloud vendors; what is needed is a way to manage workloads and resources in a uniform and consistent way across multiple clouds. So, we are at KubeCon, so I'm still going to do a quick intro to Kubernetes in case not everybody is very familiar with it. Just a brief couple of slides; probably not necessary, but anyway. Kubernetes is an open source container orchestration system for automating application deployment, scaling, and management. It was originally started at Google in June 2014, and actually, this month we celebrate the fifth anniversary of Kubernetes.
Quickly, within a month, other big companies like Microsoft, Red Hat, IBM, and Docker joined the Kubernetes community, and the community grew at an exponential rate. The first major GA release came in July 2015, together with the CNCF partnership announcement. In just five years, Kubernetes became the de facto platform to run containerized workloads and services. It also achieved an amazing adoption rate among cloud providers and became virtually ubiquitous: today, every major cloud provider offers Kubernetes as a service. The latest adopter was Amazon Web Services, which offered EKS in 2018. So clusters themselves became just yet another resource, like a MySQL database, offered by all major cloud providers, and again, it's a similar paradigm: provisioning them is easy, but the provisioning steps can be very specific to a given cloud provider. The way you provision GKE could be slightly different, or a lot different, from the way you provision EKS or AKS. Needless to say, Kubernetes is a pretty incredible product. It revolutionized the orchestration of containerized workloads and services. When I started using Kubernetes in 2015, I was sold on the idea. At that time, all I really knew and cared about was running containers and services inside containers. It took me some time to realize that perhaps the best feature of Kubernetes is the Kubernetes API. It has a declarative style, where users express what they want, in other words the desired state of the system, versus an imperative-style API where you specify the steps to take to achieve that state. It is level-based, which enables robust behavior even if you miss some intermediate state changes. It is complete and authoritative, and most importantly, it's extensible. That last part is very important, because this is in fact what I've been doing for the last couple of years: working on the extensibility of Kubernetes and writing Kubernetes-native applications.
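As a concrete illustration of that declarative style, a plain Deployment manifest states only the desired end state; the controller works out the imperative steps, and because the API is level-based it converges even if some intermediate changes were missed:

```yaml
# Desired state only: "three replicas of this nginx pod should exist".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3          # what we want, not the steps to get there
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.17
```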
So how do you extend Kubernetes? Any program that reads and writes from the Kubernetes API can provide useful automation, and there is a specific pattern for writing those programs, called the controller pattern. Controllers typically read objects from the Kubernetes API, perform some operations, and then save either the state of those operations or the state of the object back to the Kubernetes API. Custom resources are extensions of the Kubernetes API, and a Kubernetes application is basically a custom resource definition, or CRD, combined with a controller. The operator term was, if not invented, then at least introduced by the company CoreOS as a method of packaging, deploying, and managing Kubernetes applications. The operator format gives the software developer a template which tells Kubernetes how to deploy and manage the application. There are several frameworks to choose from to author your operator. The most popular of them, Operator SDK and Kubebuilder, are based on controller-runtime. There are quite a few others, some of them very similar, some of them different, like Metacontroller and, I believe, a Bash-based controller. And of course, you can use client-go yourself, a very low-level library, to author your own operator, so you're not always writing from scratch. If you look at the basic sketch of the controller pattern, it's a continuous loop which retrieves the object from the Kubernetes API and compares the spec of the object to the actual system state. If they're the same, maybe nothing needs to be done, and it continues. Otherwise, if there are any differences, it performs some steps to bring the actual state to the desired state and saves the results into the object's status. So when you write an operator, typically you extend Kubernetes types.
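The loop just described can be sketched in a few lines of Go. This is a deliberately simplified, self-contained illustration of the idea, not the controller-runtime API; the `Resource` type and the string-valued "state" are stand-ins for real objects and real system state:

```go
package main

import "fmt"

// Resource models the two halves of a Kubernetes object that a
// controller cares about: Spec (desired state) and Status (observed
// result). Illustrative only, not a real Kubernetes type.
type Resource struct {
	Spec   string
	Status string
}

// reconcile is one pass of the loop: compare the object's spec to the
// actual system state, converge if they differ, and record the
// outcome in the object's status.
func reconcile(obj *Resource, actual *string) {
	if *actual == obj.Spec {
		// Nothing to do; the system already matches the desired state.
		obj.Status = "Ready"
		return
	}
	// Diverged: perform the steps that bring the actual state in
	// line with the desired state (here, a simple assignment).
	*actual = obj.Spec
	obj.Status = "Synced"
}

func main() {
	obj := &Resource{Spec: "replicas=3"}
	actual := "replicas=2" // e.g. one pod died

	reconcile(obj, &actual)
	fmt.Println(actual, obj.Status) // replicas=3 Synced

	// A second pass finds nothing to change, which is what makes the
	// level-based loop safe to run continuously.
	reconcile(obj, &actual)
	fmt.Println(actual, obj.Status) // replicas=3 Ready
}
```

Because each pass only compares states and converges, missing an intermediate change is harmless: the next pass sees the current divergence and fixes it.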
So for example, the most common pattern would be: my operator creates additional Deployments or StatefulSets and has custom logic to do something extra which you normally could not do with a Helm installation; in other words, it does some active lifecycle management. The operator reads from the API and creates those Kubernetes types, typically in the same cluster, so it's all deployed in one cluster. Where that pattern became very useful is in stateful applications, specifically those applications which require some kind of data management. It started, I believe, with CoreOS's etcd operator, then the Prometheus operator, and then the Rook operator. All of them deal with additional data state which would normally require user knowledge and manual steps; they basically encoded those steps in the operator logic. That pattern started growing pretty rapidly. More and more independent vendors now package their applications in this operator format. Moreover, there can be multiple vendors or multiple projects dedicated to the same operator: today, if you look for a MySQL operator, you will see up to five, probably even more right now, different ways people have authored a MySQL operator. Nowadays, it's almost easier to name the open source stateful applications for Kubernetes that don't have an operator. If you go to a resource like OperatorHub.io, it's a pretty good site where all the operators are listed, and you can filter them by category and see them all. It's not the only place; I also recommend checking out the Awesome Operators list. At the end of the talk, I have a slide with reference links, and the slides are also available on the KubeCon schedule site.
So check out that last slide for all the reference links. Now, if you can provision your stateful services and just run your MySQL database as an operator in Kubernetes, it kind of begs the question: do I really need to use managed services from cloud providers? Do I really need to use RDS from AWS, or Cloud SQL? This notion has a very appealing aspect. When you run your data and database services in the same Kubernetes cluster, you ultimately have portability. When you run a stateless service, you can run it anywhere: on a laptop, in any Kubernetes cluster, in Minikube, on GKE, on-prem. And it's easy because it's stateless. Once state gets involved, you have to decide where to run it. But if you put your stateful applications on Kubernetes as well, you have that ultimate portability and can run them anywhere too. However, while ultimate portability is the goal, this approach has some problems as well, some of them temporary, some of them maybe long-term. For example, not all operator applications are very mature, in the sense that even the frameworks themselves, like controller-runtime, may not be exactly mature today; they're still evolving. Another thing is that when you deploy your stateful applications this way, you still don't have the unified console that managed cloud providers offer, where you can see all your stateful services at scale in one glance. You can develop your own dashboards for individual services, but there is no one unified dashboard or console. You're also still on the hook for providing all the support and the SLA for that operator. And that is an important part, because just because it's easy to deploy a stateful application as an operator, it does not absolve you from knowing what that application is doing. You still need domain knowledge of the given service.
And sometimes that's a little bit more work than if you run it with, let's say, a cloud provider, because the cloud providers take on some of the responsibility for all those tasks I mentioned earlier. Excuse me. So what we can do is take the same operator pattern and extend it, not only to Kubernetes types, but to types outside of Kubernetes. For example, just like I can use the operator pattern to create a StatefulSet for MySQL using the Kubernetes API, I can take the same concept and use a cloud provider's API and SDK libraries to provision services on the cloud provider. So instead of running MySQL in-cluster, I can run MySQL in AWS, using the AWS provider API library to schedule and start up RDS instances, or Cloud SQL, or Azure MySQL. And with exactly those goals in mind, we started and introduced the Crossplane project. As I mentioned earlier, we open-sourced it right before KubeCon in Seattle last December. Crossplane is a multi-cloud control plane: a single Crossplane enables provisioning and full lifecycle management of services and infrastructure across a wide range of providers and regions. Crossplane presents a declarative management-style API that covers a wide range of portable abstractions, including databases, clusters, and buckets; that's what we initially started with, and since then we have been adding more and more resources. Crossplane is based on the declarative resource model of Kubernetes, and it applies many concepts from container orchestration to multi-cloud workload and resource orchestration. Crossplane runs atop Kubernetes and leverages cloud provider infrastructure. Crossplane extends the Kubernetes API via custom resource definitions, and Crossplane itself is an operator. Each managed resource is represented by a dedicated CRD.
So today we support three cloud providers: AWS, Azure, and Google. And we support, as I mentioned already, resources like relational databases, Redis in-memory caches, clusters themselves, and buckets. In addition to managed resources and providers, Crossplane has the notion of resource classes and resource claims. First, let's take a look at the cloud provider as a resource. Here's an example of defining a cloud provider in YAML for Crossplane. You can see it has two parts. The first part is a secret which contains the cloud provider credentials. The second part is the provider, which has a reference to that secret and may have additional metadata; in this case, the AWS provider has region information. Each cloud provider has a CRD and may also have a dedicated controller which performs additional provider validation. Similarly to providers, Crossplane represents every single managed resource with a dedicated CRD. Each CRD is strongly typed using the cloud provider's API for that resource, and we try to model it after the cloud provider API. And each type has a dedicated controller which is responsible for provisioning and validating the resource and, once the resource comes up and becomes available, for generating connection secrets so it can be consumed by applications. It's also responsible for tracking the status, and if the resource diverges from the declared state, it will attempt to bring the actual state back to the desired state, just like in Kubernetes when you create a deployment: you tell the deployment to run with three pods, and if one pod dies, Kubernetes will automatically schedule a new pod. Similar to that, Crossplane will attempt the same active reconciliation against resources: if something changed in the resource, the next reconcile loop will try to verify and make sure it matches the declared state. And this is an example of a MySQL server in the Azure provider.
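The two-part provider definition and the Azure MySQL resource described above would look roughly like the following sketch. The API groups and field names here are illustrative approximations of the early Crossplane CRDs, written from memory; check the project repository for the exact schema:

```yaml
# Part 1: a secret holding the cloud provider credentials.
apiVersion: v1
kind: Secret
metadata:
  name: azure-creds
  namespace: crossplane-system
type: Opaque
data:
  credentials: <base64-encoded service principal JSON>
---
# Part 2: the provider object referencing that secret.
apiVersion: azure.crossplane.io/v1alpha1
kind: Provider
metadata:
  name: azure-provider
  namespace: crossplane-system
spec:
  credentialsSecretRef:
    name: azure-creds
    key: credentials
---
# A concrete managed resource, strongly typed after the Azure API.
apiVersion: database.azure.crossplane.io/v1alpha1
kind: MysqlServer
metadata:
  name: demo-mysql
  namespace: crossplane-system
spec:
  providerRef:
    name: azure-provider
  location: West US
  version: "5.7"
  pricingTier:
    tier: Basic
    vcores: 1
```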
Notice that it has pretty much the full property set; it's not the complete property set, but it effectively matches the Azure API. In addition to cloud providers and managed resources, Crossplane presents a clean separation of concerns by introducing what I mentioned earlier: resource classes and resource claims. As an application developer, I typically don't really care which MySQL instance I'm using; well, sometimes I do care, but anyway, if I'm writing my application, let's say WordPress, and it needs to consume MySQL, I really don't need to know whether it comes from AWS RDS, from Cloud SQL, or from somewhere else. What I do care about is the connection string and credentials, so I can connect to it and my application can perform its business logic. Now you, as a cloud or cluster administrator, can use classes, which are modeled very closely on, or inspired by, Kubernetes storage classes and storage claims. You can model a resource class providing all the metadata needed to provision the resource, and the claim can then be slimmed down to very limited information. So for example, here's the class for a standard Azure MySQL. If you notice, it has very similar properties to the raw resource we looked at a couple of slides back. Here is the actual concrete MySQL server resource, and this is the class for it; they're very similar in terms of the completeness of the properties. Now if we look at the claim for this resource, it has very limited information. In fact, it only has a class reference and, potentially, the engine version of MySQL that I want to use. This is a very powerful concept. Now I, as an application developer, can use this claim and say: hey, give me MySQL in this cloud provider with engine version 5.7. You, as the cloud administrator, can define what it means to provision MySQL in AWS, for example, or in Azure, with all these properties. So it creates two interesting notions. First of all, there is separation of concerns.
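Side by side, the administrator's class and the developer's slimmed-down claim might look like this. Again a sketch: the API groups and field names approximate the early Crossplane API rather than quote it exactly:

```yaml
# Resource class: the administrator captures all the provider-specific
# provisioning detail once, in a protected namespace.
apiVersion: core.crossplane.io/v1alpha1
kind: ResourceClass
metadata:
  name: standard-mysql
  namespace: crossplane-system
parameters:
  location: West US
  tier: Basic
  vcores: "1"
provisioner: mysqlserver.database.azure.crossplane.io
providerRef:
  name: azure-provider
---
# Resource claim: the developer only names a class and an engine version.
apiVersion: database.crossplane.io/v1alpha1
kind: MySQLInstance
metadata:
  name: wordpress-mysql
spec:
  classRef:
    name: standard-mysql
    namespace: crossplane-system
  engineVersion: "5.7"
```

The claim carries no credentials and no provider detail, which is exactly what makes it portable across clouds.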
As a developer, I don't know all the details of the infrastructure. Moreover, I don't know the credentials needed to provision those resources. Those can be stored away in a protected namespace, let's say the crossplane-system namespace, to which I, as an application developer, don't have access. So I cannot simply go ahead and provision arbitrary resources. However, Crossplane will understand the claim, match it with a resource class, provision and fulfill the request, and return the connection secret. Let's look at the next slide. So now I can model my application in a similar fashion. This is just a mockup example, otherwise it would not fit on the screen. On top I have my MySQL claim, saying: give me MySQL with engine version 5.7. And this is the WordPress deployment which will consume that claim. In this case, the MySQL instance will generate a secret with the same name, which I can now mount in my deployment and consume the credentials from, username, password, and connection endpoint, using them as properties, for example in the environment variables somewhere here. So now if you look at this YAML manifest for the application: if I need to deploy it against, let's say, a different cloud provider, let's pretend it was defined against Azure and I decided to deploy it against AWS, very little needs to change in this YAML file. I, as the application developer, may in fact even use the same class name, if the classes were defined with abstract names. You, as the cloud administrator, need to furnish the resource classes for the cloud provider. So you can define classes for as many cloud providers as we support, and I can apply exactly the same manifest file and provision my service without any changes. That creates a very powerful concept of portability for my application: now I can deploy and run it in any cloud provider without changes to the manifest itself. And if that were not enough, there's more.
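The deployment side of that mockup can be sketched as an ordinary Kubernetes Deployment consuming the claim's generated connection secret by name. The claim name `wordpress-mysql` and the secret key names here are hypothetical placeholders, not guaranteed to match Crossplane's actual output:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: wordpress:5.2
          env:
            # The claim controller writes a connection secret named
            # after the claim; the app reads endpoint and credentials
            # from it without knowing which cloud fulfilled the claim.
            - name: WORDPRESS_DB_HOST
              valueFrom:
                secretKeyRef:
                  name: wordpress-mysql
                  key: endpoint
            - name: WORDPRESS_DB_USER
              valueFrom:
                secretKeyRef:
                  name: wordpress-mysql
                  key: username
            - name: WORDPRESS_DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: wordpress-mysql
                  key: password
```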
In addition to cloud providers, resources, resource classes, and claims, Crossplane provides a definition of and support for workloads. What a workload does is basically bring this all together: in one concise document I can define my payload, in terms of deployments and services, together with my resources, and string it all together. And since, as I mentioned earlier, Crossplane is capable of provisioning Kubernetes clusters, I can now run Crossplane in a thin Kubernetes cluster, let's say on my laptop in Minikube. I define the workload for the WordPress application, I provide my Crossplane with credentials for AWS, Google, and Azure, and I say: go ahead and deploy it. What happens is that Crossplane will say: I need to find a cluster to place this workload on, and if I don't have any, go ahead and schedule one. Today, Crossplane supports direct scheduling through a cluster selector: it will not pick a cluster for you; you have to provide the cluster selector. In the future, we could have a cost-based scheduler. In other words, we could ask: where is it cheaper to run my WordPress today? Which cloud provider gives me a better price for a managed database? Deploy it there. Moreover, if you run it in conjunction with other applications, you could use affinity and anti-affinity and ask: what is the best proximity or latency for deploying my applications? And Crossplane would fulfill that. And if there is no matching cluster, it can go ahead and provision one for you. Last December in Seattle, I did a demo, which you can still find on YouTube, where I used exactly this concept: using the same YAML files, I provisioned across three cloud providers and ran the WordPress deployment on each.
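A workload document tying a payload to a target cluster via a label selector could be sketched like this. The field names are illustrative, modeled loosely on the early Crossplane workload API rather than quoted from it:

```yaml
apiVersion: compute.crossplane.io/v1alpha1
kind: Workload
metadata:
  name: demo-wordpress
spec:
  # Direct scheduling: match a previously claimed cluster by label,
  # e.g. any cluster labeled provider=aws.
  clusterSelector:
    provider: aws
  targetNamespace: wordpress
  # The payload: ordinary Kubernetes manifests to apply on the
  # selected cluster (bodies trimmed here for brevity).
  targetDeployment:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: wordpress
  targetService:
    apiVersion: v1
    kind: Service
    metadata:
      name: wordpress
```

Changing `clusterSelector` (or the classes behind the claims) is all it takes to retarget the same payload at a different cloud.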
So I started with only Crossplane on my laptop, and I ended up with GKE, EKS, and AKS, three managed databases (RDS, Cloud SQL, and Azure MySQL), and the WordPress application deployed to the three target clusters, all from scratch, providing only the credentials for the cloud providers. Upbound's mission is to create a more open cloud computing platform. At the heart of every cloud is a control plane, and we hope that Crossplane can become the control plane for the open cloud. With an open control plane, anyone can add new APIs and extend it to manage any open source or even commercial resources. We are at the very beginning of the journey. Upbound is a very young company; we started around January 2018, so Crossplane is a very young project as well. I encourage every one of you to go on GitHub, check it out, and go through the examples. We have a lot of worked-through examples, from WordPress to specific services, and even a GitLab example where we run a complete, enterprise-grade deployment of GitLab in the cloud provider of your choice. In fact, the GitLab partnership was very important for us; we have been working closely with them on a real-world application example to vet and prove our design concepts and constructs in Crossplane. With that, again: check it out, give us your feedback, submit PRs, open issues. That's pretty much my talk for today. Thank you very much for your attention. Okay, open floor for questions; I guess we have time for Q&A. If anyone has questions, go ahead. Thank you for your presentation. I can see you can create deployments to a destination cluster. How can you schedule your deployment among these clusters, such as two clusters? Right, so today Crossplane supports scheduling through a cluster selector. In your workload, you say: select a cluster based on these criteria. Just like with a common label selector, you can say: select where provider equals AWS, for example.
So today you actually have to specify that; that's the way it works. Or, if you name your clusters, treating them not exactly as cattle, you can even say: deploy to that green cluster. In the future, you could say: find the cluster that is cheaper to run in. We're working on creating a scheduler which can be cost-optimized: find the cheapest cluster to provision this workload on. So if I'm using a Redis database: which cloud provider gives me a better rate, the lowest price, to run Redis? Then you can place your workload with that cloud provider. Okay, so for now we can't schedule one deployment across multiple clusters? No, not today, exactly. I think what you're asking may be closer to a federation type of question. Yes, yes, okay. While Crossplane workload scheduling is similar to federation in the sense of propagation, and we share a lot of concepts and have learned lessons from federation about how to propagate resources, the model is slightly different. In federation, you have a control plane where you deploy your resource, and it is propagated based on your labels, annotations, or templatization. Crossplane has a slightly different objective: we're not going to do propagation to multiple clusters; we do deployment to a specific cluster only. Because, again, right now we're not yet pursuing the model where one application instance is distributed across multiple clusters, since that also involves solving the networking layer underneath. In other words, when I deploy my WordPress in Google Cloud and in AWS, I want to make sure they can use the same database, something like that. That's a little bit more challenging problem to solve. We're working on that, and probably in the future we can have something analogous. Okay, yeah, let's see. Thank you very much. Any other questions? Oh, yeah.
Okay, first of all, thank you for introducing us to this awesome project. I think you had already answered some of my question just now, but I want to get more information about how you manage and sync data between the clusters. Right, so exactly: this is very similar, maybe not from the perspective of propagation, but when we schedule databases in different clusters, the data right now is not going to be synced. If you want that, you probably need to find a technology provider which gives you global multi-cluster data syncing, because, again, it involves solving a networking problem. In the future, potentially, if we have more solutions for how to, let's say, punch a VPN hole between two VPCs, we could do something like that, but we don't have it today. So I think we will enhance that in the future. And another question: do you have plans to define PVs and PVCs and use them across clusters? Oh, that's a good question. Well, we're not really in the domain of solving storage technology. However, Upbound's founders are the same people who started the Rook project, which deals very closely with actual storage providers for Kubernetes clusters. So in the future, we could provision Rook-enabled clusters which have different, cloud-enabled storage backends. Again, that is in the works, so we don't have anything right now. Thank you. Any more questions? Okay. Hi, I have a question. When you install Crossplane in the first place, you install it in a Kubernetes cluster. That means that you've got one Kubernetes cluster that is not part of your Crossplane. That's correct. Okay. Yeah, you can potentially reuse the same cluster to deploy your workloads as well, similar to federation, where you share the same cluster as your control plane cluster. But ultimately, you probably want a control plane cluster which has very limited permissions.
Because typically, when you deploy your application, it implies some level of elevated credentials or permissions in that cluster. So what we typically recommend right now is to keep that a very limited cluster. We have a clean separation of concerns: which namespaces can be used by application developers, which namespaces can be used by administrators, and how propagation happens into the other clusters, where you can have a totally different RBAC setup, maybe not giving anybody access at all, or giving different levels of access. Thank you. All right. Well, thank you for your questions, guys.