Hi, I'm Marques Johansson. I'm going to talk to you about using Crossplane to provision cloud-init resources, which are effectively text, which we're then going to use to provision bare-metal resources or VM instances that take a user data parameter. The user data parameter is a blob of string text, but cloud-init defines special formats for user data that allow us to get more text in there by compressing it and MIME-encoding it.

Let's take a look. What do we know about cloud-init? Well, it's a service that runs inside the operating system that knows how to determine what cloud it's running in, if it's running in a cloud, or whether it's using some local configuration, say, pulling cloud-init data from a floppy disk, from a CD-ROM, or from the metadata service of your cloud provider. And what is inside the data that the cloud-init service is processing, the data it fetches from that metadata service or from that floppy or CD-ROM drive? It can be a number of different formats. It could be multipart MIME-encoded text. It could be gzipped multipart MIME-encoded text. It could be a shell script. It could be another format specific to cloud-init called cloud-config, which is very similar to what you might expect from Salt, Puppet, or Chef syntax, where you can do things like declare what packages you want installed, define what SSH keys and users should be preconfigured on the system, and enable a host of other features.

So what does it look like to use cloud-init? Well, there's not an API out there at cloudinit.com that you do anything with. I'm sure that domain exists, but that's not what we're talking about. The way you typically interact with cloud-init is through your cloud provider. If you're spinning up a VM instance or a bare metal instance, you need to provide some way to provision that resource.
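A cloud-config document is plain YAML behind a `#cloud-config` header. As a purely illustrative sketch of the declarations just mentioned (the package, user, and key names here are made up, not from the talk):

```yaml
#cloud-config
# Illustrative only: package names, user names, and GitHub handles are made up.
packages:
  - nginx
users:
  - name: demo
    groups: sudo
    shell: /bin/bash
    ssh_import_id:
      - gh:example-user   # pull this user's public SSH keys from GitHub
```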
And usually, they provide a field called user data, and they make that field available through their metadata service. When cloud-init starts up, one of the first things it does is detect what cloud it's running in, and based on that, it knows exactly where to find the metadata service. From that metadata service, it fetches the user data. When I say metadata service: EC2 instances, for example, have this well-known address, 169.254.169.254, and then there's a common URL path to get to the user data. I think in this case it's just /user-data, or it might be /meta-data/user-data. In any case, when the system boots up, it is essentially fetching that web address, and what it finds there is some blob of text.

The blobs of text that cloud-init recognizes and knows how to handle typically start with a shebang, and if they don't start with a shebang, it also knows how to handle the multipart MIME format. So when you're provisioning the device, you have some field where you specify the user data that you want to boot with, and that's it. Like I said, the user data becomes available within the metadata service. The API you're creating the device or instance through may let you modify that user data later, or look at it later. Why might that be a useful thing to do? Well, user data is typically used at startup, but with other formats of user data, say the Ignition format or the Kickstart format, on some systems this data can be used on subsequent boots. Or, if the API allows you to always fetch that user data, then you could use it, as a way to be bad, to send messages from your cloud provider's API to your metadata service that your instance can then pick up. Usually these metadata services are isolated so that that instance, and only that instance, can access them.
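The dispatch cloud-init does on that blob, shebang versus MIME versus the other formats, can be sketched by prefix inspection. This is a simplified, hypothetical approximation; cloud-init's real handler list is longer and supports nesting:

```go
package main

import (
	"bytes"
	"fmt"
)

// classifyUserData guesses the user-data format roughly the way cloud-init's
// dispatch does: by inspecting the first bytes of the blob. This is a
// simplified sketch, not cloud-init's actual implementation.
func classifyUserData(data []byte) string {
	head := data
	if len(head) > 256 {
		head = head[:256]
	}
	switch {
	case bytes.HasPrefix(head, []byte{0x1f, 0x8b}):
		return "gzip" // gzip magic bytes; decompress, then classify again
	case bytes.HasPrefix(head, []byte("#!")):
		return "shell-script"
	case bytes.HasPrefix(head, []byte("#cloud-config")):
		return "cloud-config"
	case bytes.Contains(head, []byte("Content-Type: multipart/")):
		return "multipart-mime"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classifyUserData([]byte("#!/bin/sh\necho hi")))
	fmt.Println(classifyUserData([]byte("#cloud-config\npackages: [curl]")))
}
```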
Sometimes even without a token, which makes it a very convenient way to interact with the machine.

So, this is Crossplane Community Day. Why am I sitting here talking about cloud-init and provisioning devices? Well, Crossplane is known for managing services on various clouds. It's known for managing storage in some cases, through the Rook provider and through block storage services on various clouds, and all kinds of SQL databases and managed Kubernetes services, any kind of managed service you can think of. But wait, there's more. Crossplane also knows how to provision VMs and bare metal instances. Now, with the providers that are out there today, I don't know how strong that statement is. I know that the Equinix Metal provider does allow you to provision bare metal devices. I think the DigitalOcean provider lets you provision droplets. I previously worked on a Linode provider that knew how to provision instances. So there are at least a handful, half a handful maybe. I think EC2 instances are probably supported in the AWS provider, since that's generated and there are some 60-plus resources supported. Point being that with Crossplane there are a number of providers that allow you to provision resources that would benefit from user data.

You might be saying to yourself: hold on there, I don't need user data. I've got a host of other ways of provisioning my instance. I can build custom images. I can use CloudFormation, or whatever my cloud's version of that is. I can use SSH to just hop into the box and configure whatever I want. Okay, yeah, I can use user data, and maybe I can use cloud-init with it. There's also iPXE: some systems allow you to skip the prebuilt images entirely, provide an iPXE configuration, and take the wheel from there. So why user data, and why do we need more than just a text field to deal with it?
Well, one of the things about the user data fields in the various cloud provider APIs is that there are limitations on them. You might have a 65,536-character limit on your user data, or whatever the size is, there's some size restriction out there. Or you might want to specify multiple files. And if you were to supply this in your Crossplane-provisioned EC2 instance or Equinix Metal Device, you're going to have this user data field with a long blob of text that you're going to handcraft, because you can, because it's YAML, because it's easy to get to. And then maybe you're going to MIME-encode it by hand to add multiple files, and maybe it got too long, so you're going to gzip it by hand and then Base64-encode that. No, right? We're going to stop somewhere between here and there. And we're going to benefit from there being a Crossplane provider to do this work for us.

Wait a second, a Crossplane provider? There's no provider here. You said there's no cloudinit.com, I heard you. And there's no remote API, right? Crossplane providers have this concept of credentials and provider configuration, and usually there's some sort of status when the resource is done. You don't have any of those things with cloud-init, right? So why are we going to make a provider at all? Why not? This doesn't have to be a Crossplane provider, though. I hear you saying this, and you're right. You could just make a custom controller that, say, writes the user data that you want to the Crossplane resource, or maybe it does something similar to what we're going to do with this provider: maybe it writes out a config map that we can then read in. So why bother using Crossplane to create this custom controller? My answer is that Crossplane has an installer for installing various packages, and when it installs those packages it takes care of their dependencies.
We could go in reverse here and say that if you're going to install the Equinix Metal provider, then I expect you to have the cloud-init provider installed too, maybe because it's useful for compositions or for my examples, or just as a recommended package. One of the other things that installer does is set up some roles that you can reuse and assign, to give your application deployment teams and operations teams the correct roles they'll need to interact with your services. So if you're interacting with Equinix Metal devices, you're probably going to need access to these config map resources, or maybe some select ones. Maybe we're going to take these Crossplane resources and tie OPA policy checks to the fields, making sure you're only writing to config maps that you're allowed to write to. So there's a nice story there with Crossplane and policy.

Also, why use a Crossplane provider? Because crossplane-runtime just makes it convenient to do so. We're already authoring a provider for Equinix Metal, so using the provider SDK and crossplane-runtime makes it a convenient way to write this cloud-init provider. Another reason is that Crossplane providers give you composability. Now you could say, well, anything's composable, but at least in the past, and I think it's still the case today, there are limitations on what you can use in a composition: the things you use in a composition have to be Crossplane-package-managed resources. So if we want to compose a combination of a device with a cloud-init resource, they're both going to have to be Crossplane managed resources. Maybe in the near future that won't be the case anymore; I'll take corrections in the comments.
In the future, we might also want to make these different providers more tied together, and Crossplane has a thing called references, which allows you to grab one field from one resource and bring it into another resource. So perhaps in the future we might be able to use that as a form of templating. Why do we need a Crossplane provider at all? We have XRDs, which allow you to make compositions so that you don't need to write a controller or a Crossplane provider for every possible thing you could imagine; you just create a composition for it. Well, I would really like to see that be the option here, the way out, but I couldn't figure out how to make that work today. I was referred to an open issue that deals with making more functions available for compositions to take advantage of. The kinds of functions we would need to handle cloud-init would be, first off, reading config maps (multiple config maps, or secrets), then concatenating them all as the parts of a multipart MIME document, then gzip-compressing that and base64-encoding the result. So there's not a lot of complexity there. These aren't complex functions, but we don't have a way to run these functions right now, so we can't do it.

So what does the syntax for this look like? Well, we've got our typical preamble of apiVersion and kind. We've got our metadata, which I've omitted here, with a name and no namespace, because this is a cluster-scoped resource, as all provider resources are. That means that when we're writing our spec, we're going to have to name a namespace for both the cloud-init config map that we're writing out and any config maps that we're going to read in as the source values for our multipart config map. Here I only show the example of plain text content, but there's also a field to specify config maps and secrets baked into this provider, and we'll take a look at that in a moment.
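Pulling together the pieces the talk names, a manifest for this resource might look roughly like the following. The group, kind, and exact field names are my reconstruction for illustration, not the provider's verbatim schema:

```yaml
apiVersion: cloudinit.crossplane.io/v1alpha1   # hypothetical group/version
kind: Config
metadata:
  name: example              # cluster-scoped: no namespace here
spec:
  forProvider:
    boundary: MIMEBOUNDARY   # only change this if your text contains it
    writeTo:                 # the output config map, by name and namespace
      name: cloud-init
      namespace: default
    parts:
      - content: |           # inline plain-text part
          #!/bin/sh
          echo hello world
      - configMapKeyRef:     # part sourced from an existing config map
          name: foo
          namespace: default
          key: foo
          optional: true     # a missing config map won't block syncing
```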
The moment has come. This is my demo, and we'll be running a kind cluster with Crossplane already pre-installed. We'll install the provider. I'm going to cheat a little bit and use make run, which is available from the Upbound build git submodule, and that just allows for a more rapid development cycle. In the case of this demo, it's going to allow me to take a much rougher version of this code than I would like to deploy on the world and demonstrate with that.

All right, let's take a look. This is the project, and in here we see an examples directory, where I have some YAML representing the resources we'll need to test this out, including a simple config map that will be pulled into the cloud-init resource. The cloud-init resource has a few more lines that we can cat. So there is the definition of what config map we will write out to, and I just call that cloud-init. We have to give it a namespace, again because this is not a namespaced resource, being a provider resource, a managed provider resource; they are cluster-scoped today. Our forProvider field, again, is a field that doesn't really make sense in this provider, so maybe this level will go away in the future. The arguments available here are a boundary: we're just going to go with the default MIME boundary, and the only reason you would change it is if your text contained that boundary string somewhere, in which case you would have to set a unique boundary value. Then there are multiple parts (this field is actually a typo in my example; it should read parts). There's a configMapKeyRef: this is the foo config map that we're going to be reading in from the default namespace. There's a key in there called foo, and it's optional, meaning that if that config map does not exist, that's not going to stop this resource from syncing.
And we're also going to put in a simple hello world script, and then a cloud-config configuration that says that the users on the provisioned machine should include me, and that I should bring along all of my SSH keys that are public on GitHub. Okay, and then the other file in here is a provider config. The interesting thing about the provider config is that it won't actually load right now. There's no spec, because there's no spec to provide, and we don't actually need the provider config, because this cloud-init config resource just doesn't reference a provider config. It doesn't have a provider reference; it doesn't need one.

Okay, earlier I showed this config map rendering incorrectly; there are supposed to be some spaces here, and I've added them. Now I can show that I do not have a config resource yet, or a config map resource. Again, config is the cloud-init config custom resource that we're provisioning. So now we're going to make run, and in this examples directory we are going to first insert this foo resource, which is just the one-line shell script, and then we're going to apply the cloud-init resource, which combines the foo resource with two other inline string cloud-init parts. And now that's reconciled. Let's show what has happened here. Let's do another get, and this time we can see that we have the cloud-init resource we created, which created a config map called cloud-init, and we have our foo config map and our cloud-init config map. So let's take a look at that cloud-init config map. Here we see that it has all of our multipart MIME contents: it has the empty shell script here, it has the hello world shell script here, and it has the cloud-config, which is all of this business here that gets my SSH keys in place.
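For a sense of what lands in that config map's data, here is a trimmed, illustrative multipart document. The boundary and headers follow common cloud-init conventions; the user name and GitHub handle are placeholders, and this is not the provider's exact output:

```
Content-Type: multipart/mixed; boundary="MIMEBOUNDARY"
MIME-Version: 1.0

--MIMEBOUNDARY
Content-Type: text/x-shellscript

#!/bin/sh
echo hello world
--MIMEBOUNDARY
Content-Type: text/cloud-config

#cloud-config
users:
  - name: example-user
    ssh_import_id: [gh:example-user]
--MIMEBOUNDARY--
```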
All right, so now we have a config map with a cloudinit key, ready to be used by any Crossplane managed resource that has, say, a user data field that could benefit from it.

Now that we've seen the demo, I'm sure there are some alternatives you're considering, like: why did you bother with that? I'm sure there are better ways to do it, and yeah, there probably are. So what could we have done? We could have gone the other way and taught our managed resource, the Equinix Metal device, how to read multiple config maps as user data. We could have taught it that if it's going to do that, then it should also MIME-encode and base64-encode them, and that's probably fair. But user data is something that's reusable, and other cloud providers are going to be able to benefit from it, so that's one reason for making this an independent provider. Another reason: do we really want to bake all of these things into each provider? No. Maybe we could bake them, the appending, the base64-encoding, the gzipping, into crossplane-runtime as a new field type that's understood, the way it knows how to take secrets as parameters, or some type definitions for that. So maybe it knows how to take a list of config maps and do this thing that a handful of providers could benefit from. Maybe, maybe not. Another way this could be implemented is with just a selector field. We'd say configMapSelector and provide some labels to look for when searching the available config maps. We'd take all the config maps that match that selector, and we'd do the base64-encoding, the MIME multiparting, the gzipping, all that. So there are other ways we could do this. And then there's the future, which is another reason why I think this is better to have as an independent provider.
In the future, I think there's a world where we can bake templating into this, where we take fields from the various resources and we somehow get this cloud-init provider to take those variables and apply them to the config maps that it's going to concatenate and ultimately render. This is an experience similar to what you may be familiar with if you've used Terraform's providers for cloud-init or for templates. They have an example right on one of those pages where they show you taking a template file, applying some variables to it, and combining it all into one cloud-init blob that the Terraform provider knows how to render. It knows how to take multiple parts. It knows how to take options for base64-encoding and MIME-encoding, even taking a MIME boundary field in case you need a custom one for some reason. So Terraform has all of those things already, and that was one of the examples we used in defining the concept for this provider.

Well, Crossplane knows how to ingest Terraform providers. So why didn't we just make a Crossplane provider that ingests this Terraform provider and get all this for free? The reason we haven't done that is, well, I think it might be a little complicated. It might be a bit bulkier to run Terraform in a container, where now we're just running a little bit of string processing in a container. But the other thing is that the Terraform provider support that exists now, I don't believe it handles data sources, and the cloud-init provider that's out there works as a data source.

One more way we could have approached this is to use a composition to construct our user data, and that's an area that might be possible now, or might be possible soon. But effectively, we would need a composition that knows how to read in a config map that may or may not exist yet, take the data from that config map, and drop it into the user data field of some other managed resource.
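The Terraform shape being referenced looks roughly like this, using the cloudinit_config data source; the file names and template variables here are illustrative:

```hcl
data "cloudinit_config" "user_data" {
  gzip          = true
  base64_encode = true
  boundary      = "MIMEBOUNDARY" # optional custom MIME boundary

  part {
    content_type = "text/x-shellscript"
    content      = templatefile("${path.module}/init.sh.tftpl", { greeting = "hello" })
  }

  part {
    content_type = "text/cloud-config"
    content      = file("${path.module}/cloud-config.yaml")
  }
}

# Consumed by an instance's user data field, e.g.:
# user_data = data.cloudinit_config.user_data.rendered
```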
We would bundle those two things up as a single composition, and then we might have other compositions that take fields from that composition and put them in the config map that we're going to use to provision our device. So yeah, there are alternatives. I wasn't ready to hop into any of those, and I don't know how far I would have gotten with them. Writing this provider, other than getting stuck on a few simple things, perhaps in part because I tried to copy an existing provider and modify it to suit my purposes instead of starting from scratch, was in the end pretty quick to turn around.

I've mentioned that the Terraform providers were influences going into this. It's interesting to me that Terraform's cloud-init support has been deprecated in favor of the cloudinit_config data source, and the template_file approach is seemingly deprecated in favor of the templatefile function, which is an HCL-level function. So it would be interesting to see Crossplane develop a similar function to serve a similar purpose. Since we don't have filesystem paths to load from, well, I suppose we do have paths: you have paths that are volume mounts on pods, but we're not dealing with that in Crossplane. We're dealing with various CRD and CR instances; there's no pod here for a path to be relevant. So the kind of pathing available to us is config maps and secrets, and they live in different namespaces, and they have different names and different keys. So perhaps there's some way that Crossplane's function library or the composition library could give you access to managing, mangling, and reading config maps and secrets, taking those variables, applying additional functions to them, and then putting them back somewhere.
So: read them from somewhere, hang on to all those variables, use those variables to modify other variables with the functions available, and ultimately deposit that into a resource like a config map or a secret.

One of the other inspirations for this was the Helm provider for Crossplane. There are some similarities, in that both of them work within the scope of a cluster. There might be some blurriness here; I think the Helm provider has the capability of connecting to a remote cluster. But a lot of the functions it had available really suited what I was trying to do. There were functions in there to read in config maps and secrets, and that's something I needed to do here, so I took a lot of inspiration from that provider. One of the big differences, though, is that, again, this cloud-init provider has no API; there are no API calls. I'm considering making some changes to this that are pretty drastic compared to a normal provider: ripping out the provider config, because we don't need to configure a provider, and ripping out the atProvider and forProvider components of the spec and status, because, again, we don't need them.

And so it raises the question: why is this a provider at all? What are providers really built for? Earlier in the life cycle of Crossplane, there was some thought about having providers that didn't interact with remote APIs. They wouldn't have credentials; they wouldn't have these atProvider and forProvider fields. And you can tell that that was part of the thinking because of the way the managed resource machinery in crossplane-runtime is named: it's ExternalClient, external-this, external-that. There's no external client needed here. So have we lost sight of what it looks like to make a provider that doesn't interact with an API, and how can we improve that experience? Do we want to improve that experience?
Taking Terraform as an example, Terraform does have a handful of these providers that just operate without an API, and perhaps we need more of those in Crossplane. Then again, we have compositions that are allowing us to skirt some of those responsibilities. As long as we can take advantage of functions in compositions, we might be able to avoid the need for API-less providers in Crossplane.

I understand how janky this must appear to many of you, and so I'm willing to accept questions and negative feedback on the GitHub provider repo, displague/provider-cloudinit at the time of recording; in the future, who knows where it might live. Maybe it'll live in crossplane-contrib. If you want to get hold of me outside of now, Twitter is one good way, and you can find my work on GitHub at displague. Thanks for listening to my rambling, and I look forward to your comments. All my slides are available in the GitHub project. Take a look there: issues, PRs. Let's turn this into something useful, and that might actually mean closing this project down in order to make something useful. Thanks, bye.