Hello all, welcome to the next session. We'll be having "Auto-provisioning IdM clients in OpenStack". A bit about the speaker: Rob Crittenden is a principal software engineer at Red Hat, working on security in OpenStack. In this session he will give an overview of certificate management issues in OpenStack instances. So please, over to you.

All right, so as he said, my name is Rob Crittenden. I speak very quickly. I did a speed run of this talk in about three minutes last night, so if I start going too fast, let me know. I'm also loose with language and I use a lot of acronyms, so let me go over some terms so you can follow along. When I say instance, I mean an instance in Nova. This could be Ironic or a VM or whatever. I try not to use the word server because that gets too confusing. Cert and certificate I use interchangeably, and they're X.509v3 certs. And IdM, IPA and FreeIPA are all the same thing.

So, what is auto-provisioning? What I mean here is that I want to be able to launch an OpenStack instance and enroll it as an IPA client. The reason I want to do that is TripleO. TripleO is OpenStack on OpenStack, and the idea behind it is that you deploy one cloud, and from that cloud you deploy your production cloud. This gives you a lot of benefits and I encourage you to look into it, but I don't want to give a TripleO talk. When I deploy that production cloud I want TLS-enabled endpoints on it, for TLS everywhere, and the question is where do we get the certs from? I can push them in using Ansible or Heat or Puppet or any number of other things, but the problem is that the private key then exists both outside the OpenStack deployment and inside it, and you normally don't want private keys floating around.
Another alternative is to pull them in somehow, and for that I'm going to use certmonger. Certmonger is a service for requesting certs and managing them, but the thing about certmonger is that it requires a credential. You're already inside this booted instance, you want to get something from the outside, and you have to prove who you are, that you're not just some random schmuck. So these are the two questions I had to answer. First, how do we automatically issue the certs? We need a CA, obviously. Second, instances in OpenStack typically don't last very long, right? When they go away we want the certs revoked, not just deleted, because we don't know what happened to them in the meantime. Once the instance is gone, assuming you're using OCSP or CRLs, they shouldn't be usable anymore.

So these are the tools I had. I decided on the Nova metadata service, cloud-init, FreeIPA, and certmonger, so let me go into each of those.

The metadata service was typically static up to about Newton. It had things like the instance name, the UUID, the image it's booting from, and a bunch of other information. It also has user properties that you can set when you launch an instance, and they can be whatever you want. It's freeform, just name-value pairs. They're available at this magic URI from within your OpenStack instance, and networking ensures that only that instance can get that instance's data. It's referenced by date, the date typically being that of an OpenStack release, or the word "latest" if you always want the newest stuff. That's the way they handle API versioning and backwards compatibility. There are two things you can get: the user data and the metadata, meta_data.json. I'm going to focus on the metadata.

There's also a new thing with Newton called dynamic metadata. We had an implementation of what I'm going to talk about in a minute fully working in OpenStack using a thing called Nova hooks.
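The versioned metadata paths described above can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the link-local address and the `meta` key are how the Nova metadata service conventionally exposes user properties, while the function names and the sample document (including the `ipa_enroll` property value and the instance name) are made up for the example.

```python
import json

# Link-local address where the Nova metadata service is conventionally reachable
# from inside an instance.
METADATA_HOST = "http://169.254.169.254"


def metadata_url(version="latest", document="meta_data.json"):
    """Build a metadata URL; `version` is a release date string or "latest"."""
    return f"{METADATA_HOST}/openstack/{version}/{document}"


def user_properties(meta_data_text):
    """Return the freeform name-value pairs set at launch (the "meta" key)."""
    doc = json.loads(meta_data_text)
    return doc.get("meta", {})


# Hypothetical sample of what meta_data.json might contain; values are invented.
SAMPLE = json.dumps({
    "name": "web01",
    "uuid": "00000000-0000-0000-0000-000000000000",
    "meta": {"ipa_enroll": "True"},
})

if __name__ == "__main__":
    print(metadata_url())
    print(user_properties(SAMPLE))
```

Inside a real instance the text passed to `user_properties` would come from an HTTP GET of `metadata_url()`; the sample string stands in for that response so the sketch runs anywhere.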
With Nova hooks, there were six places in Nova where you could write a plug-in to access internal data and really get total control over the Nova internals. We filed a bug against one of those hooks that wasn't working. They were taken aback that this thing existed and immediately deprecated it. So we had to come up with an alternative, and they wrote dynamic metadata. Rather than static metadata, which is all fixed and generated by Nova, they wrote a REST API. Their client calls a REST server with a POST carrying a certain set of data, and you can send data back. They just pass the data along. They don't interpret it at all, it's just JSON. This is how we're going to get information from IPA into the OpenStack instance.

Cloud-init is like firstboot for cloud images. It can do all kinds of cool stuff: it can install packages, write arbitrary files, update your DNS settings, set your host names, and a lot more. We're going to use this to do our enrollment.

Then FreeIPA, centralized identity. It comes with a CA. That's optional in general, but in our case it's mandatory, to manage SSL certificates. It has host-based access control, which I can never say. That's going to become interesting later. Centralized sudo is also going to be interesting later. It has Kerberos, which we don't need right now, but you never know. Clients can enroll two ways: using a username and password, or using a one-time password. And if IPA is acting as a DNS server, then it can handle the floating IP assignments within Neutron.

And finally, certmonger. All it does is manage certificates for you. It can request certs, it knows when certs are going to expire, and it'll try to renew them for you so you don't get a 2 a.m. call on Christmas morning saying "my server's down". I think we've all been there.

This all led to the novajoin project, or rather the second generation of novajoin. It's two services. One is a REST API, which handles the POSTs from Nova.
All the REST service does is add a host to IPA. It generates a host name based on the instance name plus the IPA realm, or domain name, and then it generates a random one-time password. It adds that to IPA and then sends the password and the host name back through the metadata.

The other service is a notification listener. Whenever something interesting happens in OpenStack, it generates an AMQP notification, like "I'm creating a host", "the host is being added", "the host is done being added", and similarly with deletes. So when we see that an instance has been deleted, we try to delete it out of IPA as well, and that's going to revoke all its certificates too. I also have a limited amount of support for floating IP assignment. The last time I tried it, it worked, but things are always changing in Neutron.

I recognize that not everyone is going to want to enroll every single instance in IPA, so I added a trigger. If you set the property ipa_enroll=True when you launch the instance, then it'll enroll as a client. But that's kind of manual, and you have to remember to do it. So you can also set metadata in an image, and if you set this property in the image itself, then every instance booted from that image will enroll in IPA.

There's another requirement for this to work. Right now you can't pass in the IPA server to use, so it has to be discoverable via DNS. It's probably not a big deal, depending on what your DNS situation is. And you don't have to use IPA's DNS. It just makes it nice to get the reverse entries added automatically.

The minimum requirement for an image is the IPA client package, obviously. It's available in most distributions, so I don't think that's going to be a problem. And for config drive you need a cloud-init newer than a certain version, because before that it's broken and can't read config drives.

We also have a cloud-init script. This script first installs the package we need.
It fetches the metadata out of Nova, parses the JSON, and then just passes that information to ipa-client-install. ipa-client-install uses DNS discovery to find your IPA server, the domain and the realm, and does the enrollment for you. So it looks pretty simple. You just start the instance creation, and a few minutes later you have an instance enrolled in IPA.

This is sort of what it looks like. Here's a typical OpenStack deployment. We have a controller, one or more compute nodes, and our FreeIPA server sort of hanging out. Within this controller I've simplified a little bit, because it's just showing the things that I care about. So we have our Nova API, our Glance API, and then the join services sort of bundled together.

Here our user is creating an OpenStack instance and they pass the IPA enrollment property set to true. Nova goes ahead and launches the instance for us, and this may take, you know, two or three minutes. Once it starts to boot, cloud-init fires up, installs the IPA client package, and then makes a curl call to our magical URI. This is going to trigger the REST call. We get a POST with the instance name, a bunch of UUIDs, and the properties, and this is when join kicks in.

The first thing we do is check with the Glance API. We ask, does this image exist at all? Because you never know. If it does, then we fetch the metadata out of the image itself. If either the image has IPA enrollment set to true or the instance property has IPA enrollment set to true, we proceed. Otherwise, we exit right away. Assuming it is true, we generate the host name and the one-time password and pass them on to IPA as a host add. ipa host-add is the command-line equivalent of what we do. Then we return all of that in the instance metadata. This all happens in about half a second.

At this point, the instance parses the output, gets the OTP, and does the enrollment. So now we have a full IPA client instance ready to go.
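The join decision and the client-side enrollment walked through above can be sketched as follows. This is a minimal illustration under stated assumptions, not novajoin's actual code: the function names, the `ipaotp` key, the set of accepted truthy strings, and the `example.test` domain are all hypothetical. The `ipa-client-install` options used (`--unattended`, `--password`, `--hostname`) are real options of that command.

```python
import secrets

# Assumption: which string values count as "true" for the enrollment property.
TRUTHY = {"true", "1", "yes"}


def wants_enrollment(instance_props, image_props, key="ipa_enroll"):
    """Enroll if EITHER the instance property or the image property is true."""
    def is_true(props):
        return str(props.get(key, "")).strip().lower() in TRUTHY
    return is_true(instance_props) or is_true(image_props)


def build_join_data(instance_name, ipa_domain, instance_props, image_props):
    """Return the data handed back through the metadata, or {} if not enrolling."""
    if not wants_enrollment(instance_props, image_props):
        return {}
    hostname = f"{instance_name}.{ipa_domain}".lower()
    otp = secrets.token_urlsafe(24)  # random one-time password
    # A real service would register the host and OTP in IPA at this point
    # (the talk compares this step to `ipa host-add`); omitted in this sketch.
    return {"hostname": hostname, "ipaotp": otp}


def client_install_command(join_data):
    """Build the enrollment command that the first-boot script would run."""
    return [
        "ipa-client-install", "--unattended",
        "--password", join_data["ipaotp"],
        "--hostname", join_data["hostname"],
    ]


if __name__ == "__main__":
    data = build_join_data("web01", "example.test", {"ipa_enroll": "True"}, {})
    print(client_install_command(data))
```

Using an "either one wins" rule mirrors the behaviour described in the talk: an image-level setting cannot be switched off by the user at launch time.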
You can call certmonger to get a certificate or certificates for this thing, and then you're done. You go on your merry way and do whatever you want to do with that instance.

Something to note about config drive. The procedure I described is for when a virtual machine uses the metadata service. In that case the metadata is really not generated until it's needed, which is only when it's called. For config drive, Nova collects the metadata in advance, before launching the instance, puts it in ISO format, and sends it along when the instance gets booted. So in this case the OTP is generated and it's going to sit out there for a while. I don't think that's a vulnerability, because you can't do much with this OTP. It basically allows you to bind once to IPA and do one thing: join. So at worst you could join IPA, which doesn't really get you all that much.

And that's it. When you're done with the instance, you just delete it. You don't have to do anything special. You just call openstack server delete. Our AMQP listener gets a notification that the delete has finished, and we delete the host out of FreeIPA. Deleting a host out of FreeIPA deletes any services you've created for it, revokes any certificates you have for it, and otherwise generally cleans up. We'll also clean up the DNS entries as best we can. Yeah, I went on a speed run here.

I'll take a question.

Does one default override the other?

No. If either one of those is true, it gets enrolled. If you set it to true in an image, it's always going to be true. We don't want the user to override that. Images are cheap.

So this is something we're using right this second. But the idea is to use host classes to enable HBAC. I say HBAC because I can never say host-based access control. IPA has this concept of automember rules. Basically, you set a class when adding a host, and the rules are a set of regular expressions.
And if your host class matches these regular expressions, then the host gets added to a host group. Now you can assign host groups to HBAC rules and host groups to sudo rules. So basically you can lock down this new instance as soon as it comes up. Maybe only certain people can log into it. Or maybe once they log in, they can only run certain sudo commands. Or any number of different things. I have a whitelist of allowed user classes right now, which isn't much and it's not super flexible. But in case you had a host class for, say, always allowing people to sudo, we don't want that to be the default on new instances, so you obviously wouldn't put that in the whitelist. I'm going to make this more flexible in the future.

This is still quite new upstream. We hope to get it into Ocata. If not, then it'll hit Pike. Those are OpenStack releases. It's really close. And that's, yeah, 15 minutes.

What was the thing you were doing with Nova hooks that was so terrible it got deprecated?

Please repeat. Okay, what were we doing with Nova hooks that got them deprecated? They didn't like the idea of Nova hooks at all. I mean, you literally had access to the internal data structures of Nova within your hook. All I was trying to do was inject metadata. You could also list the files that you pushed into the image, and I was just trying to concatenate that file list with another file. All that file did was call ipa-client-install. And like I said, they were truly horrified and didn't like it at all. It was up to me to come up with a replacement, and I didn't come up with this. The guy who came up with it said something rather rude at a conference, and he felt so guilty about it that he came up with this as a replacement. So I love the Nova guys, because they pulled my butt out of the fire.

Anything else? I have a demo. I'm not sure what it would look like up here. It's a recorded thing. It may be too small to read.
I can show that if you want. It's about two minutes. All right, let me see what I can do. I also have to figure out how to get this to play on that screen. It played full screen on my screen. Yeah. All right, I can show it to anyone who's interested.

But basically, like I said, it seems really complicated, but from a user perspective they just pass this one extra property and the instance gets enrolled. It's as easy as it can be. Now the reason we want this is because, like I said, we want TLS everywhere. The previous talk mentioned that he had TLS working on all endpoints. Within Red Hat OpenStack, we have it working on all endpoints except for Rabbit and MySQL. I have to talk to the previous speaker, because he said he had it working with those. And it's always a problem where to get the certs. Right now we pre-generate them and shove them in using Puppet, and like I said, that's kind of a nasty thing. So that's it. All right. Thanks.

Based on the diagram, it sounded like it is a pull from the machine, calling the URL, that triggers the IPA sequence. Is that the case?

Right. In a way. It has to get the metadata in. Yeah, that's basically it.

But you then said that it will sit for some time.

Only with config drive. If you're using config drive, it basically creates a CD-ROM of the data, and that's generated by Nova before it launches the instance. So the metadata flow is slightly different.

Okay. So the OTP is going to be vulnerable to some extent.

Right. But it can be stolen only once, and you can't do much with it. You can't bind and make calls or anything.

Is there any kind of audit trail, or a plan to correlate, just to make sure that the right thing actually happens?

Practically speaking, I think you would notice, because when you went to boot that instance, the join would fail because the OTP had already been used. So you'd go, hey, why did that fail? Now maybe in OpenStack, people are used to things failing.
They would just delete it and do it over. But if you saw enough of those, then people would notice.

Right. So there should be some kind of audit trigger saying, wait a minute, it failed to enroll with the OTP. That's a security issue. That's bad.

Yeah. Yes. I mean, I don't know what I can do from my script to set up any alarms.

You want to talk to Rich so that he's aware of it. I think that's why he has a question too. I think the same thing is probably knocking around in his head.

Yeah. We knew from the very beginning, when we defined this flow, that there is a weakness in it, but it can be mitigated by proper auditing outside.

Right. OK.

So we just need to make sure that the loop is closed, and that when we do auditing, like what Rich is doing with Alvaro and others, this is a triggerable event that can create an alert somewhere. Something that says, well, this is really bad.

Right. Yeah. I think we'd have to augment IPA, because IPA really should trigger that, I think. OK. "An enrollment was requested with a bad OTP." We could log that separately.

Well, the client can also do it.

But you might not be collecting that. You wouldn't be collecting the new instance's logs, I don't think.

Well, we'll be collecting instance logs.

OK. So that's kind of important. Yeah. So I could just log something, I guess.

Right. So that's a matter of logging the message into syslog or whatever, with the journal.

Yeah. I think that, and then letting the logging guys make sure that it's captured centrally.

That's a good idea. Because we have an alert for that kind of thing.

OK. Yeah.

We have to stay in place. So, Steph, your name is not on this presentation, so you're going to do all of your stuff. I don't. Really? Oh, that's... I put your name. Yeah, I guess they're saying there's only one mic.
You guys have a clicker? No, that's something which was not passed from the previous group. I just took over, so... I came to ask what kind of introduction you want. Thank you for having me. I'll be right here. I don't do the... Do the...