I am going to give a talk on a thing called Shade, and I may have promised to wrap, and that might have been a lie. So I'm not actually going to do that. That was really just to get you in the door, and now I'm actually going to sell you on some real estate ventures that I've got going on in Florida that are a really great idea. In any case, before I get started on that, I want to point out a couple of things, because the internet, I guess, is something that people care about. These talks are online. I'm actually delivering them from that URL. So if for whatever reason you want to go home and read them again before you go to bed, you're welcome to. You can also tweet about things, because I think that means that I'm more important in the general world, so that's great. Also, if you love this so much that what you really wanted to do was fork it and make your own version of the presentation, the slides are Creative Commons licensed and in a Git repository that you can clone to your heart's delight. Now, what you also get in that Git repository is the entire source of my website. So I don't really know what you would do with cloning that repo or how that would make your life better, but you're welcome to, and I'm not going to judge you much for doing it. So anyway, that's where all that stuff is. I'm also going to try and not talk about talking about things for as long as I can. So my name is Monty Taylor. I work for a company called International Business Machines, which you may or may not have heard of. We invented, it turns out, the automated traffic signal, which I did not know until about six weeks ago, which is honestly pretty cool. Somebody had to invent that, right? Before that, I guess there were just manual traffic signals and some person had to operate them, and so we put a bunch of people out of work, essentially, by automating their jobs out of existence. So I'm a distinguished engineer there, which means that I stand up and talk to people, which is distressing for anybody who's on the receiving end of that, and I work in the cloud division, which probably should be obvious given where we are, given that this is the OpenStack Summit. You may also have seen me at other OpenStack Summits, because I've been at all of them. I'm on the technical committee and the board of directors, and I also help run the developer infrastructure. So if you've ever pushed a patch into OpenStack, or tried to and failed, that's my fault. Sorry. But it's led us to a wonderful world where it turns out that we, weirdly, are one of the larger end users of OpenStack, and we have developed some thoughts and some opinions and some emotions about what it means to do that. So we're going to talk about that a little bit. And I'm realizing now that the screen is smaller than it was when I was looking at it earlier, so some of these slides might be weird to read. So anyway, we're going to talk about what Shade is, why we did it, and a little bit about getting started with it. And actually, rather than just ranting at people, I'm going to show you code and log output and stuff. It's going to almost be technical, which would be pretty cool. So Shade is a library that I started writing about a year ago as part of the Infra project. It's a Python library to wrap up the business logic, because we've already got libraries that know how to talk to the OpenStack APIs.
But it turns out that some of the operations that are there are difficult, or take many steps, or take a different number of steps depending on where it is that you happen to be doing it. There are a few things that we wanted to do. We wanted a single API that worked across all of the clouds that we have at our disposal. When I started writing this, I had access to three; I now have access to 20 different OpenStack clouds. So I've gotten a little bit obsessive about making sure that that's the case. And I want to hide all the vendor differences. It's awesome that vendors and deployers of OpenStack can make choices, and they can differentiate their businesses, and they can do all of those things. And almost all of those choices make my life hell. So I want to try and do my best to make those go away so that I can use things. It's really important to support multi-cloud. It may have come across in the other two points, but really, I need to be able to write code that runs exactly the same on each of my clouds. And we do this in production, so it's fine. It's not just a lofty goal. I also want it to be simple to use. I'm a really big believer in things having sane defaults. I really hate it when I have to fill out a config file that's this long to get started with something, which is going to make the part where we start off by me showing you the config file that you should set up really amusing, if you're into that sort of irony. But hopefully it'll be simpler to use than other things. It's not pluggable. And it's not going to be pluggable. I think that, from a client library perspective, pluggability is a big mistake. It just accepts the thing. So it's got Designate and Ironic and Trove support in it, and if somebody else shows up and adds support for something else, that's great. The library will just have those things. And I don't care that there are eight library dependencies under those other things. They're Python libraries. It takes like three seconds to install. It's not a big deal. There's an asterisk by that for a reason, but I'll get back to that in a second. I also need it to be efficient at scale. Infra spins up and tears down between 10 and 20,000 VMs across OpenStack clouds a day, so we tend to notice when our usage of the APIs is less than perfect, because usually what happens is we crash the cloud in question, and then they yell at us. So we want to— no, I kicked something, and now it's all blue and you can't see anything. Let's see what I've just done to the fine people. So imagine that there's a slide up here that has words on it. And let's see— how's that? Did it come back? Now it's red. Ah, look at that. This is live technical debugging right there. Turns out: don't kick the thing that's underneath the table. So it should be efficient at scale. And this is the other thing that I think is really important: the API should always be backwards compatible. We just had a deprecation policy talk in the design summit a little while ago, and I put forth the idea that maybe we should never get rid of things out of the OpenStack APIs, ever. Hopefully I'll be successful in convincing people of that. But I have control over this library, so I will not break you. More importantly, I will not break myself. But you get the benefit of me wanting to not break myself, so that's a good thing. So currently, you can get the source code for this library from the normal OpenStack locations where you would normally get things.
Because it's an Infra project, openstack/shade isn't where you're going to get it; it's going to be openstack-infra/shade. It's also published to PyPI. And I'm very happy to announce that we've just released version 1.0. And that's actually a lie, because what we're going to do right now is release version 1.0. Oh, that's my password. Let's not show you that. Isn't this exciting? We have now released version 1.0 of Shade. So from this point forward, we will not break any backwards compatibility with anything. I hope that's exciting to everybody. This is also used in the upcoming release of Ansible 2.0, which just announced its second beta yesterday. We rewrote all of the OpenStack consumption modules in Ansible, and they're all based on the Shade source code. Also, again, because I work on Infra, we have this thing called NodePool, which is the thing that manages all of the build slaves for OpenStack, and we're currently in the process of migrating NodePool to use this. It uses it for image uploads at the moment. We have a giant patch to replace all the rest of its innards with directly using Shade. So why did I write this? We already have Python libraries. You'd think that maybe writing another Python library wouldn't be a thing that we need to do in the OpenStack project, given that OpenStack's in Python, so you'd figure that the support would be pretty good. So there's this thing that Marc Andreessen tweeted a couple of days ago — which actually he didn't say, somebody else did, and I didn't attribute it because I'm a bad person. But I kind of like it, and so I just decided to use it as an excuse to stick it into a talk, because I can. And basically, the idea that I've long been saying in the OpenStack world is: you've got all these people running around telling you that you need to differentiate to make money and to do all of those things. But actually, consolidation and commoditization is where the real profit comes in. So all of that effort doing differentiation, that's a cost on you. It's insane for you to spend energy doing that, when if everybody would just deploy the software in one way, then everybody would win. But we don't do that, because we're not those type of people. What's that? Exactly. So OpenStack, from my perspective, leaks its internal implementation abstractions. To use it, you have to know things about what the deployer decided to do. It also likes to break its APIs. And senselessly: we're going to make a new major version of an API, and we're just going to change the name of a parameter to a different name. But it's the same parameter, and it does the same thing. We changed the name because it's nicer, I guess; we like the way that it looks in the new version. And that's really annoying for end users. The basic concepts are needlessly complex. It's insanely difficult to get a VM with a public IP address on a cloud in a consistent way across clouds. There are a bunch of different ways, and I'll show you that in just a second. libcloud is a pile of garbage. Our client libraries are really designed for server-to-server communication; that's their primary use case. python-novaclient is for things like Glance to be able to talk to Nova. It's not for you to be able to talk to Nova, and you can tell that by trying to use it to talk to Nova, because it'll just punch you in the face. So we had to deal with this, because we're doing massive-scale OpenStack consumption.
And so we figured, as part of the why, why not share that with other people? I can encode that into a library, and the problems that I've solved for myself at the scale that I'm running at hopefully can be useful for you. So I believe that this is what the Python code to upload an image into a cloud and then boot a VM based on that image should look like. It turns out this is, in fact, functional code. So this is what it looks like. But I think that it should be this simple. It shouldn't be any more complicated than that, because that would be insane. This isn't a very complicated thing that it's trying to do. So I think that the existence of Shade is a bug, actually. It's a bug in OpenStack. And I don't know, we're still working on figuring out how to fix that. But this library shouldn't be needed, and the business logic that's contained in it should be basic primitives in the OpenStack API that anybody can use. Because everything that I've had to do in this library, anybody who doesn't want to program in Python has to do as well — literally every single thing that I've done in this library. They're all necessary things. And that's pretty terrible, I think. So back to this thing, the differentiation. What Shade is really about is that I would like to drive towards some profitable homogenization. I'm going to do my best to make the cloud work one way. And if you don't like my way, tough. So with that said, let's talk a little bit about actually using it to do things. Step one, it turns out, in using any cloud is configuration, as much as I hate config files. I wrote a library to do config file management, and that library is called os-client-config. It also had a 1.0 release a while back; I will not be releasing it during this talk. It's a library to basically handle config information for multiple clouds. Like I said, I have 17 different cloud accounts; having openrc shell script files for each of them and managing my connection to them by manipulating environment variables is an insane thing to do. So this is a thing that allows me to describe all of the clouds that I have and use them. It also keeps track of some of the non-discoverable differences in vendors. There are things that you cannot tell, except by trial and error, about how your cloud works, and as much as I would like to fix that in OpenStack long-term, it's not fixed right now. So I've worked around it, and I have some nice YAML files that tell you exactly what it is that you should be able to learn from your cloud — but right now, you just have to learn it from me. It's in use not only in Shade, but also in python-openstackclient. So if you install python-openstackclient, you can run openstack server list and it'll consume config values from that. It also (and I've got patches up for other things) reads not only a config file, but also the normal environment variables. So if you do only have one cloud and you just want to use the environment variables to talk to it, that's fine. No config files needed; you can just do things the normal way. So this is a snippet of a clouds.yaml file. Like I said, it'll read the OS_ environment variables, or you can set up some things here. Right here, I've got a DreamHost cloud account, here's my auth information for it, and I'm referencing a known profile of clouds. DreamHost is a vendor, they have a public cloud, and that public cloud is the one public cloud that they have. So it has known characteristics, and I refer to it here by name.
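Roughly, that kind of entry in clouds.yaml looks something like this — a sketch, with the account name, credentials, and project as placeholders; the profile key is what points at the known DreamHost vendor settings shipped with os-client-config:

```yaml
clouds:
  dreamhost:
    # Known public cloud: the auth_url and other quirks come from the
    # vendor profile that ships with os-client-config.
    profile: dreamhost
    auth:
      username: my-username
      password: XXXXXXXX
      project_name: my-project
```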
This is a Blue Box cloud that the fine folks at Blue Box made for me, and this is not a known public cloud, so I've indicated the auth URL directly. You'll notice there's no auth URL in the other one, because it turns out that DreamHost's auth URL is the same for everybody, so you don't need to put it into your config file. The only things you need to put in your config file are your username and password, because that's sort of what you would need. You'll also notice that there are different collections of things. So this is UnitedStack, which is based in China. It turns out that they have problems with their SSL certs, which is a little bit sad, so I've told it that we're not going to bother verifying their SSL certs, because they're broken. They use Keystone v3 auth. And down here you'll see that we're listing a project and user domain as well. So one of the reasons that we decided to go with YAML for this is that I do need some nested structure, which wouldn't really work well in a normal INI-type file, and part of that is because, although I don't have plugins, Keystone does have auth plugins, and I can't get around that. So this is where we put the specific auth parameters that are different between the different clouds. That's sort of the next slide here. Because of the pluggable authentication, there are different things, different ways in which the cloud works. We default to auto-detecting. So if you don't list any sort of auth type for your connection, that's fine; you give it a username and password and it's like, okay, I know what it is that you want to do. But if it's something that it can't auto-detect, you can also explicitly tell it "do this" and it'll do its best. If you use this with python-openstackclient, just on the command line, you can do this based on information that was in the config: openstack, tell it the named cloud from the config file, the region that you want to connect to, and then the thing you wanted to do. That's a little bit too much typing for me; I find that really annoying. So what I do myself, and you're welcome to do this or not, is I have a little shell function that sets two environment variables, and I just then say "use rax DFW", which then shows me in my prompt that the environment variables are set. So I know what it is that I'm going to talk to, and then I can just type "openstack server list" the whole time and it kind of works. There's a blog post I've got on my blog about doing this. It's not really a blog, it's really just more a collection of static HTML, but I'm not cool enough to have an actual blog. So there's a command that comes with Shade called shade-inventory. If you've ever used Ansible, you'll know that there's a possibility of a dynamic inventory for finding what your servers are, and that's a really cool thing, and I found myself using the Ansible inventory feature and dumping it out to a cache file so that I could sort of introspect the information that was across my clouds. So I went ahead and put the logic for doing all of that into Shade itself, and there's a specific command that'll get installed when you install Shade. If you just type it — it's got some command line options — it will give you a list of all of the servers across all of the clouds that you have configured, and it'll fill in some initial information that OpenStack won't necessarily show you.
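Before getting to that inventory output, here is a sketch of what those other two entries might look like in the same hypothetical clouds.yaml — the URLs, region, and domain names are placeholders, with key names as I understand os-client-config's format:

```yaml
clouds:
  bluebox:
    # Not a known public cloud, so the auth_url is spelled out directly.
    auth:
      auth_url: https://bluebox.example.com:5000/v2.0
      username: my-username
      password: XXXXXXXX
      project_name: my-project
    region_name: RegionOne
  unitedstack:
    auth:
      auth_url: https://identity.example.com/v3
      username: my-username
      password: XXXXXXXX
      project_name: my-project
      # Keystone v3 auth needs the domains listed as well.
      user_domain_name: Default
      project_domain_name: Default
    identity_api_version: '3'
    # Their SSL certs are broken, so skip verification.
    verify: false
```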
An example of that extra information: right here I've normalized the availability zone to az, so if you have availability zones in your cloud, it'll give you a nice field name, as opposed to the OpenStack parameter name, which is OS-EXT-AZ:availability_zone — and as lovely a name as that is, I thought the other one would be a little bit nicer to normalize to. We also fill in some additional information down here, so you'll see the flavor, in addition to having an ID, also has a name. We went and asked the cloud what the name of the flavor with ID 100 is, because it turns out "ID 100" is not very descriptive to me, but "standard x-small" is a little bit better. There's not an image name here, because this cloud likes to change and replace their images, so the image with this UUID does not exist in this cloud anymore, but that's fine. Because it's originally driven by some Ansible work, you'll see this interface IP here. Depending on how you had that cloud configured, there are public and private IP addresses that you might be interested in. There are also IPv4 and IPv6. But what you really care about is: what address do I use to connect to this darn thing? That's interface_ip. It turns out that getting that piece of information is one of the hardest things to get in all of OpenStack, at least in a consistent way across clouds. So you get all the other things, and you can see it sort of spits out here; you can get this in YAML or JSON format, or really whatever it is that you want. If we scroll down here to this other one, which is in a different cloud that doesn't change its images out, you'll see that it was able to figure out that this node booted on an Ubuntu 14.04 LTS image, which is very exciting, I'm sure, to everybody. So the fundamental building block is a cloud region, and this is sort of like a newton-meter if you're into physics — except that it's nothing like that at all, except it's a compound word — but essentially the basic unit in Shade is that of a region of a cloud. You create an object that refers to a specific region of one cloud, because all of your API interactions with the cloud are going to be directed towards a region of that cloud. So each region of a cloud is basically a distinct top-level object, and underneath that, everything is parameters. You can, again, get to this through environment variables. If you're only using environment variables, there is a nice named cloud for you named "envvars". It's probably not what you decided to name the cloud, but that's tough; it's just what it's named. So the combination of that cloud and OS_REGION_NAME will get you a cloud, as will the name and region from your cloud config. So the absolute simplest way to get a cloud object in Shade is this: it's two lines, you get a fully functional cloud, it's very exciting — import shade and then shade.openstack_cloud(). This sort of assumes that you've got some environment variables set somewhere, or that you've got a config file, and if you've got a config file, it's just going to pick the first cloud region it can find. If you have a config file with multiple cloud regions defined in it, you probably don't want just the first cloud that it happens to find, unless you're doing something very strange. So you can specify a little bit more if you'd like to get more specific, and you can give it a named cloud and a named region to more clearly specify which cloud you would like to talk to.
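In code, the two forms look something like this — a sketch, where 'mycloud' and the region name are placeholders for whatever is in your clouds.yaml:

```python
import shade

# Simplest possible thing: pick up clouds.yaml or the OS_* environment
# variables and use the first cloud region found.
cloud = shade.openstack_cloud()

# Or name the cloud and region explicitly.
cloud = shade.openstack_cloud(cloud='mycloud', region_name='RegionOne')
```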
If you want to do more complicated things than that, you can directly construct an os-client-config object yourself: you can get a cloud config object from it, you can pass in argparse arguments to override configuration things, you can pass in other key-value arguments — basically do all the things you would expect to in a more fully featured application or library that's going to consume this. This is a little bit too much typing for me for simple things, so I usually do the other one, but this is how you get to the real meat and potatoes, and then you pass in the cloud config object. If you don't pass a cloud config object into the Shade constructor, it will construct a cloud config object for you, because that's the way that that works. We use standard Python logging, which may not sound like a thing that should be brought up, but I'm going to. There is a helper method inside of Shade, so if you don't really feel like dealing with Python logging configuration files and setting that up — like if you're just doing a simple script — you can just say, hey Shade, set up simple logging for me that's going to stick things onto standard out, and turn debug on or off. If you want to do more complicated things than that, it's just standard Python logging configuration, so you can do all of the things that you would normally do. We also do things in this simple logging helper function to fix the broken logging in the Python libraries. So the things that spew garbage onto your screen, or put warnings out because no handlers were found for the keystone auth base logger — we suppress those for you. We also suppress the warnings about Rackspace's broken SSL certs, and urllib3's obsessive-compulsive disorder of vomiting stupid warnings that you can't do anything about onto the screen. We turn off all of those for you, because no one ever really needs to see them. We do a thing which is the most evil thing in the world, and I'm sorry, but there's really nothing I could do about it. Well, I could have done something different, because I wrote it, but I decided not to. We hide all of the underlying client exceptions and we raise new ones. This is the worst programming practice in the history of mankind, and everyone who's already done this in OpenStack before me should be shot. I should be shot. I'm a bad person for doing this. But the thing is that exceptions are part of the interface. The underlying — I'm currently building this on top of the Python client libraries, and they're all terrible. I'm going to replace them at some point in time with something else. Maybe the Python OpenStack SDK, maybe direct REST calls with requests, I don't know. I don't want to bubble these exceptions up into the interface so that you depend on catching them, or, more importantly, so that you have to know that for this operation you just did, on this cloud it's a Nova operation and on this other cloud it's a Neutron operation — because that makes your exception handling code for every call that you make about this long, and that's a little bit ridiculous. So I'm sorry for hiding exceptions. In any case, putting all of those things together into a very simple script that actually, in fact, does something: this is a script that will upload an image to a cloud and then boot a server based on it. We're going to set up some logging here. We're going to create a cloud — this is on VEXXHOST, which is a lovely public cloud based out of Canada. We're going to create an image in Glance by uploading the file ubuntu-trusty.qcow.
I'm leaving how I got that file as a topic for later; if you'd like to find me over beer, I will be more than happy to tell you about Disk Image Builder, but it's really not necessary here. I'm also telling it to please wait for that to be done, because I'm going to boot on the next line. If I'm not waiting for that operation to be done, it's going to be a very, very short attempt to boot a server off that image. I'm going to find a flavor that has at least 512 megs of RAM. I don't really care what the flavor is, I just want that one. Whatever it's called — I'm sure it's called one, or a hundred, or x-small, or whatever — it's really not interesting to me. And then I'm going to boot a server, and again I'm going to wait. I'm also telling it to please automatically find an IP for me, because thinking about the logic of getting an IP in OpenStack makes me want to stab my eyes out, so it'll do it for me and I don't have to worry about it. This is the snippet from the debug logging of having done that, so you can see all of the things it does to do those things. I first checked to see if the image that I'm trying to upload exists, because we actually do some checksumming so that if you ask me to do this more than once, I'm not going to upload the image if it's already in the cloud with the same content, because that would be kind of a waste of time. In this particular case, I was uploading a very small image, so it only took 1.5 seconds. Then we're going to do the image create, the upload straight to Glance, it turns out. Then we're going to check again to make sure it's there, and find a flavor. Then we're going to go through a sequence of creating a server. We have to get the server again after creating it, because the metadata that you get out of a create call is not useful; you have to get it again. And the poll list here — we're actually using server list, which, weirdly enough, is an optimization that we do from NodePool, where we're spinning up thousands of nodes at a time. Just getting the list from the server and then iterating over the list in Python to see which ones are active turns out to be more efficient on the cloud involved. So this is going to sit here in a poll loop — look at that poll loop — waiting for it to be done. I trimmed a couple of the "waiting five seconds" lines off, and then we're going to finally succeed, and then we're going to do some introspection of the networking stack. In this case, on VEXXHOST — VEXXHOST is a very nice cloud and does not require you to get a floating IP for your server — we're done. No more things involved. We got a public IP on first boot, which is how all of it should work, but it doesn't. So, a couple of problems that we've solved in this code. One of them is the image API version. There are a few ways you can upload images into clouds. One of them is the v1 API with the PUT interface. This is what HP, Catalyst IT, DataCentred, and Internap all use to allow you to upload images. The vast majority of clouds use v2 PUT — I'm not even going to read them all off, but you can see that it's more than the first one. So that's the main thing, and then there are two clouds out there that use the v2 tasks interface. All three of these are different interfaces for uploading images to the cloud, and I think that's a problem. So you'll remember this from the VEXXHOST example. This is the sequence of operations that I had to do to upload this image into this cloud.
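Before looking at how the same call behaves on a task-interface cloud, here is roughly what the script just walked through looks like — a sketch, assuming a clouds.yaml entry named vexxhost, with the image and server names as placeholders and the argument names as I understand Shade's API:

```python
import shade

# Send shade's logging to stdout, with debug output like the snippets above.
shade.simple_logging(debug=True)

cloud = shade.openstack_cloud(cloud='vexxhost')

try:
    # Upload the image; the checksumming means re-running this is a no-op
    # if the same content is already there. Wait, since we boot from it next.
    image = cloud.create_image(
        'ubuntu-trusty', filename='ubuntu-trusty.qcow', wait=True)

    # Any flavor with at least 512 megs of RAM will do.
    flavor = cloud.get_flavor_by_ram(512)

    # Boot, wait for it to be active, and auto-find an IP to reach it on.
    server = cloud.create_server(
        'my-server', image=image, flavor=flavor, wait=True, auto_ip=True)
    print(server['interface_ip'])
except shade.OpenStackCloudException as e:
    # Whatever the underlying client raised, shade re-raises as its own type.
    print(e)
```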
If I go to a different cloud, say Rackspace, that requires the task interface, the API call is the same in Shade. This is it: create image, here's the file name, go. This is the sequence of operations that it has to do on the backend that you don't have to know about. So in this case it's also going to check the Glance image list, and it's going to see that the image is not there. Then it's going to go to Swift, and it's going to do some introspection to see if the object is there in Swift. In this case it has found that the object is, in fact, not there, so it's going to upload to Swift — and in this case you'll see that it doesn't say "manager ran", because it's slightly different: we're using SwiftService, which knows how to split out multiple threads and upload image chunks in parallel on the backend. Then we will have finished the object create, which shows up here, which is weird, but that's because of the threading. Sorry about that, but that's just the way that it is. We're going to check it again, and then we're going to run the Glance image task create, and then we're going to poll on image task get to see when the image task is done importing the image, and then finally that'll be successful and we will consider that we've finished. So thank you to everyone who made incompatible APIs. But it makes something fun to talk about, right? So there's another problem: there are five different ways to get a public IP on your VM in OpenStack. Your cloud can have externally routable IPs from Neutron. This is my favorite model myself, but it turns out I'm not the ruler of the world. RunAbove and OVH, from France, have that model. Your cloud can have externally routable IPs from Neutron and also optionally support private tenant networks, if you're into that sort of thing. That's VEXXHOST. Your cloud can give you a private tenant network by default and require you to go through a floating IP to get a public IP address. HP and DreamHost are the public clouds that go with that model. Your cloud can have private tenant networking provided by Nova Network and require floating IPs for external routing. Auro, it turns out, of Canada, is a cloud that does that. There are a lot of clouds in Canada, by the way — I'm not sure if everybody knew that; there are like five public clouds in Canada. And finally, your cloud can have externally routable IPs from Neutron but not expose the Neutron API to you. And that's Rackspace. So again, we'll go back to sort of the same kind of code we've been looking at. You'll see here we're doing — this is on a cloud that has the floating IP requirement, in this case HP. We'll do the create server call, which looks identical to the other create server calls that we did, because that's kind of the point of all of this. This is the sequence of things that it did in the background. It created the server and did those things. Then it did the network list, which is how it starts to introspect the qualities of the networks in Neutron. It'll then figure out: oh, I don't have a public network that this server was connected to. So it'll then look for floating IPs to see if there are any available. In this case it found one, so it updated the floating IP and attached it to the server, and then polled the server — because it turns out that floating IP attaching takes a while, and so you have to wait for the floating IP to actually attach after you attach it to the server.
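The calling side of that HP example is the same shape as before — only the cloud name changes. A sketch, with the cloud name, region, and image name as placeholders:

```python
import shade

cloud = shade.openstack_cloud(cloud='hp', region_name='region-b.geo-1')

# Same call as on VEXXHOST; shade notices this cloud needs a floating IP
# and does the network introspection and attach behind the scenes.
server = cloud.create_server(
    'my-server',
    image=cloud.get_image('ubuntu-trusty'),
    flavor=cloud.get_flavor_by_ram(512),
    wait=True, auto_ip=True)
```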
There's another thing you might want to do, which is that you might not want to reuse the floating IPs — because, amusingly enough, as much as everybody likes to talk about how clouds are for ephemeral workloads, they've also become obsessed with these floating IP things. The thing about floating IPs is that they're actually only good for long-lived pet servers where you want to reuse the IP on subsequent long-lived incarnations of that server. So there's sort of an impedance mismatch there. But if you are actually doing an ephemeral workload, you don't want to reuse the floating IP on your ephemeral node after you've deleted it and created it again. So in this case, when I'm booting this, I say: please don't reuse any of the floating IPs out there, because this is an ephemeral node that I'm booting up. So glad that that was difficult. So we're going to do that introspection. Again, we're going to look for floating IPs. We're going to do a different thing here in the floating IP create. And actually that's the wrong — nope, I lied to you. So that's the wrong thing. What we actually should do there — and I don't know why I'm showing you the same slide twice — is the floating IP create with the port of the server in it, so that it creates it and attaches it to the server in a single call, rather than a create and then a subsequent update of the floating IP, which is slightly more efficient. Although what would be really great is if the server get had the Neutron port ID attached to the IP address that it returns to you in the server record, so that I didn't have to then go back to the cloud, get a list of ports, and look for the port that has the IP address of the fixed IP of the server that I got. Because I'm pretty sure that to get the IP address on the Nova side, it had to ask Neutron for the port. So at some point, Nova has already asked Neutron for this port information, and it just wasn't nice enough to put it into the server record to give back to me. And they wonder why things have problems scaling. So, some advanced topics — and I promise not all of these will be me ranting about floating IPs, although I promise I can rant about floating IPs all night long. You've seen all of these log lines: manager, whatever, ran task in so many seconds. On the back end, Shade has an implementation that we call the task manager. This actually came out of the NodePool work. Every single API operation that we've got is encapsulated into a task object, and there is a task manager that runs them. The default task manager in Shade is basically a no-op task manager: it just runs the API call that you requested it to make. This is what most of you want most of the time, and you don't have to know anything about it. Although it is nice, because it gives us a nice place to put logging wraps around each one of them, so we consistently log every call that we make without having to remember to do it. There are other things, NodePool being one of them, where we need to client-side throttle our connections to the cloud. We know what our quotas are. We know what our API limits are. And it turns out that it's not particularly useful to an application that's trying to spin up a bunch of VMs all the time to spin up and spin up and then start getting errors — because once we start getting errors, if we're not rate-limiting ourselves, we're now just going to slam the API really hard, because it's not returning us things and we're going to try again, because we have VMs that we want to get.
So in NodePool — NodePool has a threaded task manager that it passes into the Shade constructor, which makes sure that there is one and only one API call going on to the cloud at a given time, no matter how many threads we have running inside of the application. It also keeps track of timing, so if we know that this cloud can only take one API call, or 10 API calls, a second, it'll keep the timing and make sure that it waits until it's appropriate to make an API call again. But the nice part about that is that your programming interface to it is the same: you just make calls, and it handles all the thread safety issues for you, which is kind of neat. We also have caching built into this, which is another thing, because it turns out that talking to clouds is expensive. There are sort of two layers of caching going on. We have dogpile.cache built in, and that defaults to no cache. Again, sort of similar to the task manager, the default null cache does nothing. It turns out the default memory cache in dogpile leaks, so if the default was to just do some nice friendly caching in memory, you would quickly hate me, because I would destroy your servers, and that's not friendly. So we default to the null cache. We've got support in the config, in the clouds.yaml file, for expressing cache settings that you might want to use across your clouds. This gets especially important if you have, say, eight or ten clouds and you're doing lots of operations across them — being able to define that in there. So you can pass in dogpile.cache settings: cache classes, expiration times, things of that nature. And for the calls where it makes sense to cache, things like list images — well, list images we will aggressively cache, and if you create an image we'll appropriately invalidate the image cache inside of Shade. So all of those things should just work and be more efficient for you. There are a few more things, and I'll get to those in a second. Also on this one: DBM, it turns out, doesn't leak. It's a decent backend. It's really good for your local machine. It's not really good as a shared thing if you've got multiple processes running — the DBM driver is not really great for that — but just for doing local operations it's a nice thing, and it also persists across script invocations. So if you're just doing lots of little small scripts, you can still get the benefit of the caching layer. There's another thing, which we're sort of just starting to roll in. We have one specifically, which is the server list. I mentioned earlier we use server list as a way to poll for readiness of a server, and that's because what we do in NodePool is we fire off a thread for every single server that we want to create at that moment in time — and we might have 500 servers we need to create right now. So we'll have a thread that's sitting there polling, waiting for the thing. If each of those were hitting the cloud with a get call, it would kill the cloud. It has killed the cloud before; we've crashed clouds that way. So what we do here is we actually have a mutex-protected server list on the inside, and an expiration time on that. You can set this to zero and it'll just happily poll immediately, or you can tune it up, and all of the server get poll calls also know how to participate in that. This is a general cache; the only one that we're doing anything with right now is server.
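A sketch of what those cache settings can look like at the top of clouds.yaml — the backend, filename, and numbers here are just examples, not recommendations:

```yaml
cache:
  # Which dogpile.cache backend to use; the default is the null cache.
  class: dogpile.cache.dbm
  arguments:
    filename: ~/.cache/openstack/shade.dbm
  # General dogpile expiration for cached calls like list images.
  expiration_time: 3600
  # Per-resource expirations for the newer resource cache; only
  # server does anything with this right now, as described above.
  expiration:
    server: 5
```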
But it's structured so that, for any of the resources that you're doing something with, you could associate a particular API semantic. So at some point, once this is fleshed out a little bit more than just server, you should be able to say: hey, for images or for flavors, you know what, cache those until the end of time, because they really never change. For images, same thing: we've got full cache invalidation, and that's a fine cycle for us in the way that we're doing it. But for servers, or for something else that some other things may be interacting with — Neutron port list is another example of this — the ports get updated any time you're doing operations on the servers or the networks. So you can't just cache the port list results themselves, because they're going to get invalidated by other actions you don't know about. Same thing with the server list: if you just do the normal dogpile thing, you're waiting for the status of the server to change, and so you don't know when to invalidate your cache, and that's kind of bad. So in this case we do the polling thing, and all the things will just sort of spin on that. Finally — and it's possible I might have gotten through these slides in the appropriate amount of time, which is weird — I mentioned earlier that this is in the new Ansible 2.0, which will be released very soon as well. This is pretty much the Ansible playbook version of the script that I've been showing you. You'll notice that we have a named cloud referred to here. We're going to upload an Ubuntu image to the cloud, and then we're going to boot a server based on that image, which is kind of fun. There are a couple of differences here, in that we don't have an explicit flavor call, because it's an Ansible playbook, not a Python library, and in this case, from a user interface perspective, you really only care about a flavor when you're booting a server. So we've got it automatically looking that up for you. But it's the same sort of semantics, and both of those have wait equals true. All of these things with the wait equals true setting will happily just fire and forget if you tell them wait equals false. The Ansible module, I believe, also defaults to auto_ip equals true because, well, I like to type as little YAML as I can. This is also sort of an example of using some of the multi-cloud support. I have a set of clouds that I want to make sure that my key pair is on. I have my key, it's the same key, and I want to use it on all of my clouds. So this is a little Ansible snippet that will use Shade to, for each of these clouds, make sure that my key is on it. And it's really nice, because I don't have to have really complicated things with passing in loops of auth dicts to do this — which would be the ugliest playbook, with really long things and auth stuff all over it, or whatever. So in any case, that's basically what I've got. That's Shade; it is out; you may use it. I promise to not break you, although I can't promise that your cloud won't, but I'll do my best to fix it if it does. And I think I'm a couple of minutes early, so if there are any questions, I'm happy to take them. Oh, there's a question. Oh, you're just going to follow me. Yeah, so that's a real good question. So the question, for anybody who couldn't hear — Spencer asked: this is great for Python, but what about other languages? I have two sides to that answer.
I've started chatting with the Gophercloud people at Rackspace about that Go client library — not to do all of these things necessarily, but at least to support the clouds.yaml configuration file, because that's at least, it's YAML, right? Everybody should be able to support that. I'm going to try and get hold of Adrian with jclouds and some of the other folks, probably the Ruby fog people, and see if at least we can get that supported. I've also got a spec up to add clouds.yaml exporting support to Horizon, so that you can just go grab a YAML snippet. Greg wrote some code that we need to get fully integrated into python-openstackclient so that you could take a YAML file snippet, or an openrc shell script snippet, and say: please add this to my local clouds.yaml — an "import cloud config" kind of thing. So that's one side of the answer; that's easy to do multi-language. The problem is that there's just a crap ton of logic in here to deal with all of the differences in the different clouds and the vendors. Now, I've got indications of those in the vendor files in os-client-config — when you install it, it will install a bunch of YAML files onto your system that contain the description of each of the clouds that's out there — so those could be consumed by other languages as well. And if people start doing that, they'll probably split those files out into their own thing, so you don't have to install a Python library to get config files for a Ruby library, for instance, because that would be kind of rude. So that's one thing. But all of the logic, then, even to consume those settings, would have to be replicated in each of the things. So how we get from a sane library interface to not needing all of that logic is some of the work of DefCore and some of the work that we're doing with the other things. But ultimately, I think that the best way to get that interoperability is to drive OpenStack itself to not need complicated business logic at the library level. So, yes? Maybe it is part of this, but the question was kind of: what is the compatibility with the other projects — we saw Glance and Nova and Neutron and Nova Network, but what about the other projects? Yep. It's basically open to any OpenStack project. I've added support for the projects that I use, or that I want to use, or that people have sent in a pull request to Ansible for and said, hey, I want to add support for this project. And I'm like, oh, well, funny story: we need to go add support to Shade first, because we do not accept any OpenStack modules into Ansible that don't talk to OpenStack through Shade. So I'm like, oh, hey, you want support for — oh, great, let me tell you about the fun we're about to have. And that usually goes pretty well. But it's basically as people have an interest. If somebody comes up and says, hey, I really want to add, you know, Ceilometer support — I haven't added that because I don't have any clouds that give me a Ceilometer API, so that isn't a thing that I've personally needed to mess with — but it's all open. We are pretty obsessive about making sure that there are functional tests; all of the Shade patches go through live DevStack functional testing. We spin up three different DevStacks: one with Keystone v2 enabled, because v3 is now the default; one with Nova Network; and one with Neutron — to make sure that we're testing against as many configurations as we can approximate with DevStack.
There are some public cloud configurations that it would be exceedingly hard for us to approximate with DevStack because, you know, it's their deployment, but we do our best there. So we try and get functional tests for those as well as unit tests — but we actually prefer functional tests, because it's hard, okay. Yeah. No. So the Blue Box cloud that I mentioned there is one of the clouds that I currently have. It's a sort of managed private cloud that I was given. We also have — Shrews and I got accounts on a cloud from MetaCloud, which is also a managed private cloud kind of thing. And yeah, we want to make sure that this supports all of those. Supporting private clouds is harder for us, because we don't have them, whereas it's easy to go out and stick my AmEx on a public cloud, get an account, spin up a server occasionally, and make sure that things work. But I don't have a data center of hardware to run RDO or Fuel or whatever on. But we definitely want to support it. Yeah. Yeah. Would absolutely. Yeah, I would absolutely love that. So my goal is that both Shade and Ansible, and anything else that uses Shade, should be able to work on any OpenStack cloud that's out there. And in general, I would hope that — well, it turns out that a lot of times when I get my hands on a new cloud, I don't have to do any work. As much as I like to pick on differences, there are actually only three or four different ways that the clouds differ; I've found most of the ways that the clouds are different at this point. So it's kind of nice when you go in and — like the MetaCloud folks, we spun it up, and the same with the Blue Box one — I was like, yep, everything just works. It's great. No need to add new support for things. But yeah, if for whatever reason there was something about a Fuel-deployed cloud that we weren't picking up on right, or where we'd made an assumption about an interface that wasn't a valid assumption, that would definitely be grounds for a fix. Mm-hmm. Yeah. Yeah, so the OpenStack inventory plugin in Ansible uses the same code as that shade-inventory command. The main difference is that the shade-inventory command does not create groups; the Ansible inventory plugin also creates groups. So in the YAML output, the main list of servers is sort of shifted over under a grouping, and there's a whole bunch of — like, it puts everything in the same AZ in one group, and everything in a region in a group, and it creates as many useful groups as it can. When you're just running this on the command line, that's not really particularly useful if you're not running it in the Ansible context, so this runs without the group creation logic and just gives you the list of servers. Other than that, the fundamental code that it's running inside of Shade is the same, and the additional information that it introspects about each server is the same as the Ansible inventory stuff. So it's more for, like: hey, I kind of like that extra information I get from Ansible, but I'm just sort of poking around on the shell right now. It's a sort of good way of informational poking for me, if it's useful to other people. Neat. Cool. Oh, one more. Yeah, so this is the thing where — one of the reasons we default to null, and sort of would need you to opt in to doing it.
If you have multiple actors changing the image list, if you have multiple actors that are changing the image list, for instance, but they're all using Shade to do it, then you can use one of the shared caching backends for dogpile like memcached or redis and then you would actually be able to have a shared cache that would do that. If you, on the other hand, have another actor that's changing the image list that isn't associated with that same caching infrastructure, then yeah, you're gonna get false negatives. The cache hit and miss is gonna be sort of off. So that's one of the reasons we can't just assume that it's always right, but if all of your things are going through that, then it should do the right things and if it doesn't, it's a bug and it means that our testing of the cache invalidation flow is buggy and we should fix it. Anybody else? What's that? Beer! Yes, excellent. Beer or whatever else it is that you like to drink if you are gluten intolerant. So cool, thank you very much.