Welcome back to another OpenShift Commons briefing. Today, Krish is back with us. You've seen him before, when we talked about Crossplane a few months back. Crossplane has applied for graduation from the CNCF, so that's exciting news as well. Krish also just had a blog post go out, which I will send to the chat for those of you watching on all the OpenShift TV channels, so you can open up his blog post as well. I will let Krish go ahead and introduce himself and talk about, well, I will just let you go, Krish. Thank you.

Thank you, Karina. So yeah, hi everyone. My name is Krish. Today we're going to be talking a little bit about Crossplane providers, as well as the recent OLM repackaging effort from the Office of the CTO.

So who am I? First things first. I'm a software engineer in the Red Hat Office of the CTO, more specifically in the emerging technologies team. I'm also a maintainer on the Crossplane sub-projects provider-aws and provider-in-cluster. I'm also a SIG Storage member and a maintainer on some of its sub-projects. My Twitter is krishchow_ if you want to follow me there. My GitHub is just krishchow.

So our agenda, at a high level: we're basically going to go over the what, where, when, why, and who for Crossplane providers, and then the how for OLM repackaging. In between, we're going to go over the main components to provide some more context for folks who aren't familiar with Crossplane.

So let's start with the what. What is Crossplane? Let's quickly go over the project at a high level. The first main pillar for Crossplane is really provisioning. Crossplane allows you to provision cloud resources from within your Kubernetes cluster. It also allows you to manage the entire lifecycle. So not only creating cloud resources, but also updating them and eventually deleting them if you need to.
And lastly, the specific providers expose information on these external resources, right? So you can not only create these resources but also observe them and their state over time. If something happens to your RDS Postgres instance, you will be notified on the custom resource through events.

From the Crossplane docs, we have that providers are packages that enable Crossplane to provision infrastructure on an external service. So again, the main goal of a provider is really to provision things outside of your Kubernetes cluster or your OpenShift cluster. These providers bring CRDs, or managed resources, that map one to one with external infrastructure resources, as well as controllers to manage the lifecycle of these resources. Something key to note in that sentence is this notion of mapping one to one, of providers really being focused on fidelity for these external resources. This is going to be a common theme.

But what are providers really, right? Definitions are all well and good, but what does it mean? Providers are similar to operators. The main difference is that for operators, we use OLM, whereas for providers, we use the Crossplane operator to handle installation and all other management. So updating and removing providers is all handled by the Crossplane operator. Providers use a lot of the same tooling as operators, such as Kubebuilder, controller-runtime, and controller-tools. For this reason, a lot of folks in the community consider providers to be an opinionated form of an operator. But again, the main difference is that providers are designed to reference some external resource. So for a developer, the really nice thing is you don't have to worry about RBAC, deployment, and all the bootstrapping, since there already exists a lot of tooling for new developers and folks that want to create new providers.

So the where. Where are providers located, right?
Crossplane providers are open source, and most are available on GitHub under the crossplane or crossplane-contrib organizations. Some examples: we already talked about provider-aws and provider-in-cluster. There's also provider-sql, which allows you to orchestrate SQL servers by creating users, grants, roles, et cetera, all your favorite SQL resources. There's provider-helm, which allows you to manage and deploy Helm charts using a custom resource. And lastly, the provider-aws that we talked about. But there are also many others that are not listed here. My personal favorite is the provider pizza, which allows you to order pizza from Domino's using custom resources. They're all available under the Crossplane orgs.

So the when. When does it make sense to create a provider, and when should you use some other solution? It really makes sense to create a provider if you're consuming external resources. We've mentioned this before, but if you have some external resource and you want to manage the CRUD lifecycle for it, a provider is a really good fit. Similarly, there's a really strong emphasis on high fidelity. So you create a provider if the resources that you want to use are well defined and granular. It doesn't really make sense to create a provider for managing abstractions, since that's not the goal. And lastly, a special feature in Crossplane is this notion of a composition engine. The best, with an asterisk, way to utilize the composition engine is to create a provider that manages your resources. The reason there's an asterisk here is that it's technically possible to use any resource within a composition, but it's really recommended that you only use resources that are exposed by a provider.

So the why, right?
Why should you create or contribute to a provider? Well, Crossplane is a CNCF sandbox project. As Karina mentioned, they've applied for graduation, so hopefully they'll be at the next step in the CNCF soon. And all the providers are open source, so anybody can contribute. This comes with quite a few benefits. First, there's shared development and maintenance of common resources. If there are 10 companies and all of them want the ability to provision resources on Azure or IBM Cloud, they can all share a common code base to achieve that instead of creating their own bespoke solutions. Second, cloud vendors can choose to expose their API in Kubernetes through a common interface. This is not only good for the cloud vendors, but also good for users. I would prefer a standard interface, or a common interface at least, for creating resources on Azure, GCP, and AWS, and Crossplane helps to make that possible. And lastly, there's a very streamlined development and consumption process. Crossplane handles a lot of the messy parts, so cluster administrators and developers can get started on the work they need to do.

So why might you want to repackage a provider? We've talked a lot about the Crossplane design and structure, but why might we want to take a provider and structure it as a standalone operator? One of the first issues has to do with proxies. We'll go into more detail about this shortly, but something that's really integral to the Crossplane operator is pulling an OCI image. The operator itself pulls an image, and this can cause quite a few issues when a cluster is running behind a proxy. This is not an unfixable issue, but it's definitely one more thing for administrators to note, and it's not immediately clear that this is how Crossplane is designed. A similar problem has to do with credentials.
If you're using a private container registry, you'll need to supply credentials separately to Crossplane and to the kubelet. As you can imagine, this can cause issues with credentials going stale or diverging between Crossplane and the kubelet. And lastly, and I think the most compelling reason, has to do with resource limitations. Essentially, you might not always want to install the Crossplane operator and allow Crossplane to manage the entire lifecycle of a provider. With the OLM repackaging effort, you can choose to install only the providers you need and directly provision resources from those providers. So for example, I might just want to use provider-aws to create S3 buckets. I can do that with the OLM repackaging, and I don't have to worry about all the other setup for Crossplane or any other issues.

So who is actually creating and maintaining these providers? Well, there are three main groups from my perspective. There are folks from the open source community. As I mentioned, almost all of the providers that I know of are open source, and there are many contributors and active members from the community. There are also contributors from organizations like Upbound, Red Hat, Squiz and Accenture. And lastly, there are engineers from organizations like Alibaba, IBM, AWS and Equinix that contribute to the development of their respective providers.

So let's get into the components of a provider and an OLM operator. Let's do a quick primer again on provider-aws. This is one of the most popular Crossplane providers, with support for dozens of AWS resources. For our purposes, it's also a really great example of a provider to dissect. From the OLM side, we're going to be looking at the memcached operator, which is a common operator that's used in guides and examples like this one. It's also a great example of the standard structure of an OLM operator. So what are the core components?
Right, so let's take a look at the memcached operator. We can see here, first we have our api folder. This is where we define all of the API resources that this operator exposes. Then we have the controllers directory, which contains all of our controllers and reconciler logic for the aforementioned API resources. And lastly, we have the config and bundle directories. Together, these allow us to describe all the metadata and deployment information for our operator.

As for the provider, we'll see the structure is pretty similar. We have the apis directory here, which, just like the api directory, contains all of the types that we want to expose. Then we have our pkg directory, which contains all of our controllers and reconcilers. So as you can see, the difference so far is pretty minimal. And lastly, or I guess most importantly, we have a combination of folders here: the package folder, as well as our cluster and build folders. Together, these three directories handle everything around deployment and packaging, which we'll get into in a second. At this point, I think most people have noticed that the difference here is mainly semantic. But the last point that we talked about, around deployment for providers, is really key, and we're going to home in on that a little bit more here.

So how do we actually deploy a provider? There's no Deployment, no ClusterRole, no ClusterRoleBinding, no ServiceAccount, nothing obvious here that I can deploy with. So this is something that's missing. Let's take a quick aside on deployment. As we mentioned before, within the Crossplane model, you have to have the Crossplane operator installed already. The Crossplane operator exposes a few different API groups. Namely, the pkg.crossplane.io group contains a Provider custom resource.
And this is one of the most popular ways to install a Crossplane provider. All you have to do is define the image and tag that you want to provision. For this example, we're going to use the provider-aws image with the alpha tag. This is straight from the Crossplane docs.

One of the other ways to install a provider is through the use of a Configuration. This is installed the same way as a provider, but Configurations are designed to expose XRDs, or composite resource definitions, which we won't get into in this talk since that's a whole other topic in itself. Users can install providers by declaring them as dependencies of this Configuration. We can see that within the spec there's a dependsOn array where we can define the provider as well as the version, that is, the minimum version that we want to install.

So what is the delta then? Essentially, this boils down to: where are the missing components related to deploying the provider? My favorite way to figure out something like this is just to start hacking away and break things down. So let's actually take a look at what our package contains. With a tool like undocker, you can browse through the contents of an image without digging through the entire build process. I personally would rather look at the end result than go through every step of the build process and dig through a Makefile. So let's pull our image with the 0.18.1 tag. We'll run undocker, which allows us to unpack the image layer by layer. And then we can see that the only file is a package.yaml in the root of the image. This is interesting, right? There's no controller. There's nothing here that stands out. It's just a YAML file. If we take a look within this YAML file, it starts off with a long list of the CRDs that this provider exposes.
And then all the way at the bottom, in the last 143 lines, we'll see there's a meta.pkg.crossplane.io resource. At the bottom of that, within our spec, we have an image referenced, which is the provider-aws-controller. So this is the actual controller that we'll be using. Everything else inside of the package.yaml is basically just metadata and CRDs that we have to create.

So what's the rest of the puzzle then? Based on this, we can deduce that there are two separate pieces here. There's our metadata image, which is provider-aws. And then there's the actual controller, which is provider-aws-controller. It turns out, if we dig through the Crossplane operator code, that the operator parses the package.yaml, installs all the CRDs, and configures the RBAC for the provider at runtime. So it parses all the CRDs and creates our ServiceAccount, our ClusterRole, and our ClusterRoleBinding. All of that happens at runtime when we try to install the provider. For brevity, we're not going to go over the Crossplane operator code base in detail, but there is a link in these slides if anyone is interested in digging around there. At the end, the Crossplane operator creates our Deployment using the image referenced in spec.controller.image within our package.yaml document.

This is where that issue we talked about a little while ago around proxies comes in. If the main Crossplane operator is pulling an image, we have to make sure that it not only has the appropriate credentials, but that it can also reach the container registry we're using. That is the crux of that issue. And this is notable, because OLM opts to define all of our RBAC and deployment at build time, whereas Crossplane opts to do it at runtime. So let's get into the actual process of OLM repackaging then.
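The runtime flow described above (read package.yaml, collect the CRDs, then find the controller image to deploy) can be sketched in a few lines of Python. This is only an illustration, not the Crossplane operator's implementation: the embedded YAML is a shortened, assumed stand-in for a real package.yaml, the apiVersion values and resource names are assumptions, and the helper names (split_documents, top_level_field, inspect_package) are hypothetical. It uses simple line matching instead of a YAML parser to stay within the standard library.

```python
import re

# Illustrative stand-in for a provider's package.yaml: a multi-document YAML
# stream of CRDs followed by a package metadata document. All names and
# versions below are assumptions made for this example.
PACKAGE_YAML = """\
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: buckets.s3.aws.crossplane.io
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: rdsinstances.database.aws.crossplane.io
---
apiVersion: meta.pkg.crossplane.io/v1alpha1
kind: Provider
metadata:
  name: provider-aws
spec:
  controller:
    image: crossplane/provider-aws-controller:v0.18.1
"""


def split_documents(stream):
    """Split a YAML stream on '---' separator lines, dropping empty documents."""
    docs = re.split(r"^---\s*$", stream, flags=re.MULTILINE)
    return [d.strip() for d in docs if d.strip()]


def top_level_field(doc, key):
    """Return the scalar value of an unindented 'key: value' line, or None."""
    match = re.search(rf"^{re.escape(key)}:\s*(\S+)\s*$", doc, flags=re.MULTILINE)
    return match.group(1) if match else None


def inspect_package(stream):
    """Return (CRD names, controller image) found in a package stream."""
    crd_names, controller_image = [], None
    for doc in split_documents(stream):
        kind = top_level_field(doc, "kind")
        if kind == "CustomResourceDefinition":
            # metadata.name sits exactly two spaces deep in this sketch.
            name = re.search(r"^  name:\s*(\S+)\s*$", doc, flags=re.MULTILINE)
            if name:
                crd_names.append(name.group(1))
        elif kind == "Provider":
            image = re.search(r"^\s+image:\s*(\S+)\s*$", doc, flags=re.MULTILINE)
            controller_image = image.group(1) if image else None
    return crd_names, controller_image


crds, image = inspect_package(PACKAGE_YAML)
print(crds)
print(image)
```

The point of the sketch is the split: the package image only carries CRDs and metadata, while the image that actually runs is a second reference found inside that metadata. A build-time approach such as OLM would resolve this once and bake the result into the bundle manifests.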
So there is a project repository that we can use to help out with this process. Namely, there's the OLM repackage repository within the Red Hat ET GitHub org. If we clone the OLM repackage repository and examine the contents, we'll see quite a few files that we're going to be using. From the top, we have the Dockerfile and Makefile. The Dockerfile contains everything required for building the controller itself. The Makefile has all the targets we need to build our operator and the bundle that we'll see in a little bit. We also have our project.boilerplate.txt file. This contains some boilerplate for our project file, and it is used by the gen project script here at the bottom. The project file itself contains a lot of metadata about our operator.

Lastly, we have the config directory. As you can see, this has quite a few subfolders, the contents of which have been omitted. The crd directory contains all of our generated CRDs. The manager directory contains our Deployment. The manifests directory defines what our CSV, our ClusterServiceVersion, looks like. And within the rbac directory, we have all of our generated RBAC manifests, which is related to the gen RBAC script. That script creates an rbac.go file, which contains all of the Kubebuilder annotations that are used to generate the RBAC during the build process.

We use quite a few different tools during the repackaging process as well. Namely, we use yq, which is a tool for querying fields within a YAML document; the Operator SDK CLI, which handles generation and validation of deployment manifests; and lastly, Kustomize and controller-gen. The first two, yq and the Operator SDK, must be installed manually prior to repackaging, and the latter two, Kustomize and controller-gen, are both installed automatically by targets in our Makefile.
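To make the rbac.go idea above concrete, here is a small Python sketch that renders Kubebuilder RBAC marker comments into the text of a Go file. The marker syntax (`// +kubebuilder:rbac:groups=...,resources=...,verbs=...`) is the standard Kubebuilder form, but everything else is an assumption for illustration: the function names, the default verb list, and the example API group are hypothetical and are not taken from the repackaging repository's actual script.

```python
# Default verbs are an assumption for this sketch; a real script may choose
# a narrower or wider set per resource.
DEFAULT_VERBS = ("get", "list", "watch", "create", "update", "patch", "delete")


def rbac_markers(resources, verbs=DEFAULT_VERBS):
    """Render one Kubebuilder RBAC marker per API group.

    `resources` maps an API group to the resource names in that group.
    Groups and resource names are sorted so the output is deterministic.
    """
    lines = []
    for group, names in sorted(resources.items()):
        lines.append(
            "// +kubebuilder:rbac:groups={},resources={},verbs={}".format(
                group, ";".join(sorted(names)), ";".join(verbs)
            )
        )
    return lines


def render_rbac_go(package, resources):
    """Build the text of a Go source file that carries only RBAC markers."""
    body = "\n".join(rbac_markers(resources))
    return f"package {package}\n\n{body}\n"


example = render_rbac_go(
    "main",
    {"s3.aws.crossplane.io": ["buckets/status", "buckets"]},
)
print(example)
```

Because the markers are plain comments, the generated file adds no behavior to the controller; it only gives the manifest generator something to read when it produces the role manifests at build time.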
Not pictured here are Golang and Docker or Podman. Golang, of course, you need for actually compiling code and running the different generators, and Docker or Podman is needed to build and push the image.

So what are the steps of repackaging? Part one is all the setup. You have to clone your target repository. In this example, we're going to be using provider-aws. Then we're going to set up all of our dependencies, which is everything that we saw on the previous slide. We also have to make sure that we have credentials, and access with those credentials, set up for a container registry like quay.io, which we're going to use here. And lastly, you have to clone the OLM repackage repository and then copy the contents of the OLM repackage directory into the root of the provider. If that doesn't make sense, we're going to go over an example now.

In this example, we're going to clone provider-aws, as mentioned before. We're just going to be using the master branch so we can get all the most recent changes. Then we're going to cd into our provider-aws directory and clone our OLM repackage repository. My preferred way to do this is to create a hidden folder. I like to use .work. Once that's done, we can just recursively copy the contents. This will overwrite our provider Makefile, as well as bring in the config directory and our Dockerfile.

Now we just have to set some environment variables. More specifically, our container registry user or organization, as well as the operator image that we're using. Under our org, we're going to use the provider-aws image and the master tag. Now we can just run docker build and docker push with our new image. Within our Makefile now, we're going to pull controller-tools and controller-gen and codegen all of the files that we need. And in one second, we're going to generate all of our CRDs.
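The recursive copy in the setup steps above is the part that most often surprises people, because it deliberately replaces the provider's own Makefile. The Python sketch below reproduces that overlay behavior in a throwaway temporary directory. The directory layout, file contents, and the choice to skip the overlay's .git folder are assumptions made for the example; only the standard library is used.

```python
import shutil
import tempfile
from pathlib import Path


def overlay(src, dst):
    """Recursively copy src onto dst, overwriting files that already exist.

    The overlay's .git directory is skipped so the target repository's own
    history is left alone (an assumption of this sketch).
    """
    shutil.copytree(
        src, dst, dirs_exist_ok=True, ignore=shutil.ignore_patterns(".git")
    )


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a cloned provider repository.
    provider = Path(tmp) / "provider-aws"
    (provider / "apis").mkdir(parents=True)
    (provider / "Makefile").write_text("# provider Makefile\n")

    # Stand-in for the repackaging repository, cloned into a hidden folder.
    repackage = provider / ".work" / "olm-repackage"
    (repackage / "config" / "manager").mkdir(parents=True)
    (repackage / "Makefile").write_text("# repackage Makefile\n")
    (repackage / "Dockerfile").write_text("# controller build\n")
    (repackage / "config" / "manager" / "manager.yaml").write_text("kind: Deployment\n")

    overlay(repackage, provider)

    makefile_after = (provider / "Makefile").read_text()
    copied = sorted(p.name for p in provider.iterdir())

print(makefile_after)
print(copied)
```

After the overlay, the provider root contains the repackaging Makefile, Dockerfile, and config directory next to the provider's original source directories, which is the state the later build and bundle steps expect.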
The next step is running the gen RBAC script, which generates our rbac.go. The actual contents of this Go file are basically a bunch of annotations: within this file, we have comments which describe what the RBAC needs to look like. And then we'll just quickly build the image. For me, that was all cached, but if you're building this yourself, it might take a second. So we've gone through the build process for the operator. We've added all of our environment variables, run the codegen, and pushed the image up to our container registry.

The last step of the build process is actually creating our bundle. Although we've created our operator and that's been pushed up, we still have to create the bundle. This bundle contains all of our manifests and metadata, put into a single package that's installed by the Operator Lifecycle Manager, or OLM. So now we can start the process of building our bundle. We're just going to have to set all of our environment variables again: the image, the user, the repository and the tag that we're using. We're going to set the operator image again as well.

Now we can run the gen project script. This defines, as I said before, our project file, which contains a bunch of metadata about our operator. Although this isn't strictly required, it helps to provide more detailed information for folks that are trying to install our operator. So this will finish in just a second. Good chance for me to grab some water. Yeah, so there we go. That's done. Now we're just going to run a few commands to template some values into our config directory. Of course, we could go through this manually, but it's easier to just run a command. And now we're going to rename our ClusterServiceVersion YAML file to include the name of this operator. We chose to name our operator provider-aws, so we're just going to make that change. And now we can actually create our bundle.
So this again runs all of the code generation for our CRDs, as well as generating our RBAC again. This will take another second. Just to talk about something we did behind the scenes: when we templated all these values into our config directory, that just updated a bunch of references and metadata. So our bundle has been created now. We get some warnings here, but no issues, luckily.

Now we can set our bundle image, which is the image reference for what our bundle is called. I like to follow the convention of image-bundle, basically. So this will effectively be called provider-aws-bundle, and it will have the same tag as provider-aws. Now we can just build our bundle, which is just a Dockerfile. This should happen pretty quickly since we're just copying a bunch of files around. And now we can push that up. Okay, perfect.

So now let's actually run the thing. Let's run the provider-aws operator. Again, we're going to set our bundle image. And now we can use the Operator SDK to run the bundle. This is not the way to run the bundle in production, but for our purposes here, it should work. The Operator SDK will start the process of creating our catalog source, our operator group, and our subscription. These are the main resources required to create a new operator within OpenShift: the operator group defines the namespaces that we can target, and the subscription actually requests the operator to be created. Now we're just waiting on the ClusterServiceVersion to be created. We'll see that it's pending, now it's installing, and it succeeded. And just like that, we now have our operator running. We can check to see that all of our CRDs are defined here with all of our AWS resources. So now we can get started with creating an S3 bucket or an RDS database or anything we want.

So that's all for this presentation. Thank you for your time.
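The image-bundle naming convention mentioned above (same repository and tag, with "-bundle" appended to the image name) is easy to get subtly wrong when a registry host includes a port. A minimal Python sketch of the rule follows; the function name and the example registry paths are hypothetical, and digest references are intentionally rejected rather than handled.

```python
def bundle_image(operator_image):
    """Derive '<image>-bundle:<tag>' from '<image>:<tag>'.

    Only the last path component is searched for the tag separator, so a
    registry port such as 'localhost:5000/...' is not mistaken for a tag.
    Digest references ('name@sha256:...') are out of scope for this sketch.
    """
    prefix, sep, last = operator_image.rpartition("/")
    if "@" in last:
        raise ValueError(f"digest references are not supported: {operator_image!r}")
    name, colon, tag = last.partition(":")
    if not name or not colon or not tag:
        raise ValueError(f"expected an explicit tag in {operator_image!r}")
    return f"{prefix}{sep}{name}-bundle:{tag}"


print(bundle_image("quay.io/example-org/provider-aws:master"))
print(bundle_image("localhost:5000/example-org/provider-aws:v0.18.1"))
```

The resulting reference is what would be built, pushed, and then handed to the Operator SDK's run bundle command in the demo flow.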
My LinkedIn and Twitter are here if you want to connect or chat. Or feel free to ping me on Slack. I'm on the Kubernetes Slack as Krish, and on the Crossplane Slack as Krish as well, I believe. Are there any questions?

Thanks, Krish. Hey, Krish, do you see any questions in Twitch? All right, we do have a question. Why is the repackaging necessary?

Yeah, so let's go back a few slides here. There we go. So there are a few reasons why you might want to repackage a provider. If you're looking to get off the ground as quickly as possible, it might not be the right fit. From my perspective, one of the main benefits of repackaging a provider is that you can choose to install just the providers you need and directly provision resources from those providers while building your own abstractions on top of them. Right, you might not necessarily want to use the Crossplane operator or composition engine. You could achieve something similar with, let's say, Helm, for example. So that's one reason. Aside from that, for the first two issues here, you can either work around them, or, I believe, there are also fixes in active development to handle them.

As an example, one use case that I can talk about myself comes from another project that I contribute to, the Container Object Storage Interface (COSI) project under SIG Storage. For that project, we are essentially trying to create drivers, or provisioners, for object storage vendors. So we could use the provider-aws operator as a driver for the COSI project. That makes things easier for folks that want to get started, since they just need to install the provider-aws operator instead of having to install the Crossplane operator and handle everything through that operator.
Yeah, and I guess, I mean, if you're strictly an OLM type of shop, just running operators, you may not want to have another layer of abstraction from Crossplane, when you can just repackage it and plug it in to work with all of the existing OLM workflows that you've already established.

So I do have another question, Krish. I don't know if it was clear to everyone, but will this repackage repo work with any of the Crossplane providers?

Yeah, so not only will the OLM repackage repository work with any of the Crossplane providers out there, there's also an automated version of this process. What I showed today was the manual way to go about it, but under the Red Hat ET org, there is a GitHub Action that can be used to automate every step of this process, repackage a provider for you, and push it to a target registry. That GitHub Action basically goes through the exact same workflow, except that it can be automated and applied to any provider. So let's say you always want to use the latest version of provider-gcp. You could set up that action to run on a schedule, let's say once a day at 12am. It can repackage the most recent version of provider-gcp and push it up to your favorite container registry.

So Krish, I have one more question for you, and it's a leading one. This Crossplane operator, have you tested it with OKD yet?

Yes. So we've primarily been using OpenShift, and we haven't directly tried it with upstream OKD yet. But if you wanted to use OKD directly, or vanilla Kubernetes, all you need to do is install OLM and then you can get started, so you're able to achieve that pretty quickly.

So it's not limited just to OpenShift OCP.

Yes.

Perfect. And Krish, you've been working mostly with the AWS provisioner, right?
And you're the main maintainer for that provisioner. Is that correct?

I would say I'm one of the maintainers. I wouldn't say I'm the main maintainer. Yeah, but I've been contributing to the upstream provider-aws for quite a while now.

I was mentioning that because I was wondering if you had any tips for anybody else who is looking to help write providers. What are some things that you've run into, some gotchas, some tips that you'd like to share?

Yeah, so from my experience, I think the biggest tip I could give someone who's looking to get started with contributing to either the Crossplane operator itself or any of the providers is just to reach out and talk to people. The Crossplane community is, I think, one of the best open source communities out there. Folks are always willing to help out and answer questions. And there are community meetings that run, I believe, every Thursday now. So that's always a really great place to start, just to talk to people and get a sense of what there is to work on. More generally, if you're looking to contribute to an open source project, I think Crossplane providers are a great place to start. Not only is it a good chance to learn the fundamentals of contributing to an open source project, and everything about Kubernetes and Go, it's also a good chance to learn about a cloud provider. You might want to learn about GCP or AWS or Azure, and you can use your experience from writing resources in the GCP provider to learn more in-depth information about Google Cloud itself.

In terms of gotchas, I think the biggest thing is just staying up to date. The Crossplane project is more stable now, but when I started contributing, the project was definitely still in the earlier stages. There would be breaking changes once in a while, and it was important to stay up to date.
Now that the project has matured quite a bit, I think the API is pretty stable these days.

Thank you.

Hopefully that answers your question.

Yes. I'd love to see different repackaging scenarios, and using OKD as a playground for that. I think I'll take that to the OKD working group that's kicking off in about 30 minutes. So, yeah, hopefully we can get some testing done from the OKD working group on this and give you some feedback there as well. And you should be able to find a blog post, if we get that done, on OKD.io sometime in the not too distant future. So, yeah, again, really great work. It's wonderful to see the use of OLM and operators to repackage this. I'm looking forward to getting some feedback on this and watching the journey that Crossplane takes in the CNCF too in the not too distant future.

My only critique, Krish, would be that we should have used provider pizza, because it's lunchtime.

Yeah, you can get everybody a slice of pizza.

Yeah, it's lunchtime somewhere here. We're still drinking coffee on the West Coast.

That's good. So, yeah, Scott and Krish, really, this is great work that you guys have been doing, and I'm looking forward to seeing a few more providers out there. So, we'll get those tested and give you feedback from the OKD and other arenas. And Karina, as always, thank you for organizing this and making this happen.

Thanks again, Krish. And Krish, if there aren't any other questions, can you please see us out? Thanks again, Krish. This is awesome. Always love having you on.

Thank you, Karina. Thank you. Thank you, Diane. Take care. Thank you for having me.