All right, I'm going to get started here. Hopefully it won't be too loud out there. So my name is George Reese. I'm here to talk about the OpenStack APIs, both from the perspective of what exists out there with the OpenStack APIs and how you approach them as a developer, and to get into some critiques of the APIs as well. I'm also going to close out the conversation by touching on the whole OpenStack versus EC2 APIs thing that people seem to like to fight about a lot. So my background is, first off, I was the former CTO of enStratius, which was acquired by Dell about six months ago. And in my role over the last six years of building out enStratius, I built out the open source Java cloud abstraction API Dasein Cloud, which basically provides a single Java interface that talks to all the different clouds out there. And there's support for two dozen different clouds. So in terms of working with cloud APIs, whether they're RESTful, whether they're SOAP, whether they pretend to be RESTful, I've done a heck of a lot of it. In addition to that, I'm very opinionated on the whole REST API thing and eventually got so sick of the state of REST APIs that I wrote a book on it. It's five bucks on Amazon. It's an e-book. But if you go to the Dell booth, they have little USB sticks that have the PDF of the book on them, so you can get it for free here. When I say I'm a general critic of the OpenStack APIs, I mean that not in the pejorative sense. I'm not out there throwing tomatoes at the OpenStack developers; I'm actually trying to help improve the state of the OpenStack APIs. And as I get into some of the more critical comments in this talk, it's important to keep in mind that I do think the OpenStack APIs are some of the best-architected APIs in the cloud computing world. I can't think of anything that's a close second on API design. Now, that's a very low bar.
There are some really bad APIs out there in the cloud world, not the least of which is Amazon's APIs. But there's a lot more to an API than being elegantly designed, and I'm going to talk a lot to that particular issue as we go into this. And as a critic of the APIs, my objective is to improve those APIs. When we were enStratius, we were not aligned with anybody, so even now at Dell, I just throw tomatoes randomly. I criticize everybody equally out there in the API space. With Dell, we're obviously behind OpenStack, and so obviously that criticism comes with love. But to understand any criticisms I have, you sort of have to agree with this basic premise that I hold around RESTful APIs. First and foremost, they need to expose the underlying functionality of the system they're supporting. So in the case of OpenStack, that means anything you should want to be able to do as a third party, you should be able to achieve through the APIs alone. And you should not have OpenStack developers sitting in judgment over your use cases and saying, well, you shouldn't want to do that. But more than anything else, and this is super important, the API should be based on abstractions that hide the implementation of the underlying thing that you're trying to access. And any changes to that underlying thing should never, ever, ever result in breaking client code. Now, a lot of us may be used to coming from worlds like Java APIs or .NET APIs or C libraries or whatever, where you version things and then you say, OK, I'm deprecating this and moving on with life. That model doesn't work with REST APIs. And to put a fine point on it: with Amazon, I've never had code that I've written break because of an EC2 or S3 or whatever upgrade. I have code that I wrote in 2007 that works today against EC2. I doubt I have code from 2011 that still works against OpenStack.
So I'll start off talking about how the OpenStack APIs are structured before I get into some of the nitty gritty. The first thing to keep in mind is that it's not the OpenStack API. There is no OpenStack API, just like there's no Amazon API. Instead, and this is a good thing, it is a suite of APIs: an API for accessing Keystone, Nova, Cinder, all those different things. It allows the APIs and the services they represent to vary independently. And if clients are written in a proper way, they should be tolerant of those variances. And again, you've seen it successfully done with Amazon. You look at competing products out there beyond Amazon and OpenStack, and a lot of them have a monolithic API, and that creates a heck of a lot of problems as you try to scale the product and scale the API. It is one of the most faithful sets of APIs out there in the cloud computing space with respect to RESTful principles. And what I mean by that is that it's fully HTTP based, using the HTTP verbs in the way they're specified in the HTTP specification. There's not a lot of making up custom HTTP status codes. CloudStack, to pick on somebody that irritates the hell out of me on this particular topic: if you submit something wrong, you'll get a 531 or something like that. So not only is that a completely made-up error code, it's not even in the right HTTP status class for bad user submission data. Another thing that I really like about the way OpenStack does things is that it supports both JSON and XML. And that is not without a huge amount of fighting on occasion on my part on the mailing list. There's definitely a bias among the developers to support JSON. If any of y'all have read my REST book, I have a whole section on JSON versus XML. And my philosophy is that you should support both, because when you pick one or the other, you start judging people's use cases.
Because it turns out that with XML in enterprises, people have a heck of a lot of tooling built around XML. And so when you go in with your API that's aimed at the enterprise and say, all those tools you've already got, well, you need to re-tool them for JSON, that doesn't work so well with the enterprise. But on the same note, if you go to a more modern set of developers and say, oh yeah, you have to start parsing XML again, they'll throw you out. So OpenStack does a great thing by supporting both. And the whole API suite gets started with Keystone. No matter what a given install looks like, and there's a big asterisk that's going to come a little bit later on this, in general any install you have is going to start with Keystone, and you need to interact with Keystone to do just about anything. Certainly authenticate, but also, if you're going to build a client in a proper way that is going to discover an infrastructure, you need to understand the service catalog nature of Keystone. As a service catalog, Keystone lets you write a client that will talk to two different OpenStack environments and automatically discover the differences between those two environments. Because I promise you, there are not two OpenStack environments on Earth that are alike. And so you need that programmatic discoverability of your OpenStack infrastructure. Contrast that with something like the EC2 APIs that don't have a service catalog: unless you build special voodoo logic into your code, you don't know the difference between a Eucalyptus environment, a true EC2 environment, and an OpenStack environment with the EC2 APIs. You have to write this voodoo code and then cache that information about what cloud you're actually dealing with. With Keystone, no voodoo, you just discover it and do whatever you need to do.
Then, of course, Keystone is an identity and authentication service that acts as a secure token service. Now, I don't like secure tokens for APIs of this nature, but that is the least of my criticisms of the OpenStack APIs. So once you've authenticated with Keystone, you can then go out to any of the other services. By then you've generally queried the service catalog and you know what services are supported in this OpenStack environment. And they run the gamut from the standard Swift, Nova, Glance, Cinder, and Neutron. Actually, that's probably not the best way to put Nova. It's Nova plus extensions, and you don't really get to discover the extensions. You get to discover that there's Nova, and you have absolutely no clue whether there are any extensions in there or not. And that's an artifact of the way the OpenStack reality was pre-Keystone. Then at the bottom there are custom services. That's really important. In an OpenStack install, you can go and stick in your own whatever-as-a-service, register it with Keystone, and any client will see, oh, there's this whatever-as-a-service thing in there. And if it knows how to interact with whatever-as-a-service, it'll do so. If it doesn't, it'll just ignore that service and deal with the things it understands. So I'm going into a little bit of detail here on how the Keystone authentication process actually works, because that's really important to understanding everything else that I'm going to go into here. First, authentication is really simple. You POST a JSON or XML payload that includes the username, password, and authentication model to Keystone. And then Keystone's going to give you back a nice little token. You can then use that token when you're making calls against any service. And the service then checks the token with Keystone to verify that you are actually supposed to be doing what you're doing.
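To make that concrete, here's a minimal sketch of that authentication round trip in Python. The payload shape follows the Keystone v2 password flow described here; the username, password, and tenant values are placeholders, and this only shows the payload construction and response parsing, not the HTTP call itself.

```python
import json

# Build the Keystone v2 password-authentication payload described above.
# The credential values are placeholders, not real accounts.
def build_auth_payload(username, password, tenant_name=None):
    payload = {"auth": {"passwordCredentials": {
        "username": username,
        "password": password,
    }}}
    if tenant_name is not None:  # some installs want this, some don't
        payload["auth"]["tenantName"] = tenant_name
    return json.dumps(payload)

# Pull the token and service catalog out of a Keystone v2 response body.
def parse_auth_response(body_json):
    access = json.loads(body_json)["access"]
    return access["token"]["id"], access.get("serviceCatalog", [])
```

You would POST that payload to the Keystone tokens endpoint (typically /v2.0/tokens) and feed the response body to parse_auth_response to get the token you attach to every subsequent call.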
So going back to the token thing, I don't like these token-based approaches because when you're dealing with clouds that are installed behind the firewall, people very often don't install properly signed certificates. So you end up not really being able to trust the communication model. The virtue of the way EC2 does authentication is that you can actually run EC2 over plain text and you'll have a secure interaction with the server. You don't need SSL. With OpenStack, you not only need SSL, but you need SSL with a trusted SSL certificate. So that means either signed by a trusted third party, or self-signed with the trust pushed out to the clients. The problem with REST APIs is that pushing the trust for the self-signed certificate is not very scalable. So in general, I like my APIs to be able to run over plain text because of the realities of what happens behind the firewall, but it is a very minor nit in the scheme of things. Once you have the token, you need to cache that token. And then, as I said earlier, you will use it. When you're fetching servers from Nova, you'll include it in the X-Auth-Token header, same with any other service out there. But with the one caveat that you always need to be prepared to re-authenticate, because it is valid for any OpenStack service to say, you need to authenticate again, because the token's expired or somebody's gone in there and manually forced the issue or whatever. So your logic not only needs to be able to handle a proper 200, 202, or whatever response, but it also needs to be able to say, oh, I need to re-authenticate and then re-issue whatever query or POST or whatever I was doing. So, the things that suck here. First, there's no standard payload. Real quick, you notice here I've got this auth RAX-KSKEY:apiKeyCredentials element. That is one of the ways in which Rackspace allows you to authenticate with their cloud. On the other hand, HP has a different mechanism.
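Here's a hedged sketch of that re-authentication discipline. `do_request` and `authenticate` are hypothetical callables standing in for your HTTP layer and your Keystone login; the point is just the shape of the retry logic.

```python
# Sketch of the re-authentication discipline described above: send the
# cached token as X-Auth-Token, and if the service answers 401, get a
# fresh token and retry the original call exactly once.
def call_with_reauth(do_request, authenticate, token):
    status, body = do_request({"X-Auth-Token": token})
    if status == 401:              # token expired or manually revoked
        token = authenticate()     # re-authenticate against Keystone
        status, body = do_request({"X-Auth-Token": token})
    return status, body, token     # hand back the (possibly new) token
```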
Other installs have yet another mechanism for authenticating. And the one that particularly bothers me is username and password, because you are compounding what I consider the weakness of a secure token service with the weakness of username and password credentials. And that is sort of the standard within OpenStack. So you've got all the problems associated with username/password authentication. But as a client, I really can't know when I go into an environment what kind of authentication that environment expects. So I end up having to write some voodoo logic to say, is this HP, is this Rackspace, is this some sort of custom environment? Oops, I've been installed in yet another environment I haven't encountered before, so I need to write some new logic to deal with that. And then another thing is that sometimes you need a tenant ID and sometimes you don't. And sometimes you need a tenant name instead of a tenant ID. And again, there's no magic way to deal with that. I get people within the OpenStack development community who insist, no, it's always one way or the other, but just between HP and Rackspace, it's different. The other thing is Keystone itself. If all you care about is, let's say, Essex or Grizzly and beyond, then this slide probably doesn't matter that much to you. Dasein Cloud and the enStratius multi-cloud manager care about Bexar, Cactus, Diablo, Essex, Folsom, Grizzly, Havana, and Icehouse. And so we actually have to deal with environments that don't have Keystone. That's a minor problem. It means that code that was written against Bexar no longer works when Keystone's in place. So that's breaking one of my primary rules. But that's not the biggest problem. The biggest problem is there's no way to tell, without doing a lot of bad things, or not bad, but not good things, which I alluded to at the bottom there, what kind of environment I'm dealing with.
And it's not even as simple as saying, OK, do I have Diablo or do I have Essex? I've got Folsom environments that don't have Keystone in front of them, because during Essex and Folsom, people screwed around with their configurations in really horrible ways and actually mixed Essex, Folsom, and Diablo code into one sort of monster OpenStack environment. So my trick to programmatically discover this stuff is: if the endpoint ends with 1.0 or 1.1, try the Nova authentication first, then if that fails, try Keystone. Otherwise, I try Keystone first, then Nova. And it's assumed to be a failed authentication if both don't work. And ideally, you're caching tokens, so you don't have to worry about doing that over and over and over again. Yeah? What's that? The question was, are we expecting any changes in 3.0? I'm not aware of any, but I haven't tried it yet. So, versioning. How the version negotiation works is, and the good thing is, services are versioned independently, just like they are in the EC2 APIs. And so when you go in and query the service catalog, you get all the versions of the Nova APIs that are supported. And then you can pick an endpoint that supports the version of the API you understand as a client. And that's a very good and useful way to do version negotiation. You then make the calls against that endpoint, and it responds in a manner consistent with the way that version of that service worked. And then you can use things the way you expect them. Now, the reality is that it doesn't work that cleanly. It's actually horrid. One of the things I'll say on Twitter a lot is that OpenStack isn't even compatible with itself. And what I mean by that is that ideally, Keystone should be standardized across installations, and it should give you the authority on everything that's going on in that environment. As I mentioned, older versions of Keystone are a mess, including the ability to determine whether or not you even have Keystone.
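That discovery trick can be sketched as a tiny bit of Python. This is my heuristic, not anything blessed by OpenStack: the endpoint path decides which authentication style to attempt first.

```python
# Heuristic from the talk: legacy (pre-Keystone) Nova endpoints tend to
# end in v1.0 or v1.1, so try Nova-style auth first there; everywhere
# else, try Keystone first and fall back to Nova-style auth. If both
# attempts fail, treat it as a failed authentication.
def auth_attempt_order(endpoint):
    if endpoint.rstrip("/").endswith(("v1.0", "v1.1")):
        return ["nova-legacy", "keystone"]
    return ["keystone", "nova-legacy"]
```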
And the biggest problem, actually, is the issue that the same service can have different identities within a Keystone catalog. A lot of that has to do with the way new services get built out in the OpenStack development process, and I'll talk more about that when I get into Neutron and Cinder in a little bit. What's that? Keystone is its own independent service and can be installed, as far as I understand it, on its own node or on a node with other things. I'm not an OpenStack architecture person, though. But it is where all services get registered and where all authentication happens. So let's talk about core services, extensions, and custom services now. Supposedly, OpenStack supports a core set of APIs with the ability to easily integrate APIs in support of non-core services. This is not at all the way things end up working in the wild. It has its core set of APIs, and you have the non-core services. A real example of whatever-as-a-service is something like DNS-as-a-service, like Rackspace Cloud DNS, or database-as-a-service, like the HP database-as-a-service. But when you try to actually build a system that can navigate an OpenStack environment without having to know a heck of a lot ahead of time about that environment, and then survive upgrades as time goes on, it's just not possible. It hasn't been possible, at least through Havana. So here's an example of a core API. We've got Nova, and don't worry if you can't read the actual code here. I called out a couple of important things. You've got the catalog entry. So when you authenticate, you get your service catalog. And if you're going against Rackspace, like this one is here, you'll get a catalog entry for cloud servers. And this is one that supports version 1.0 of the Nova APIs.
And so at that moment, if I've got a piece of Dasein Cloud code or some Python code that's talking to OpenStack or whatever, I can start listing servers, provisioning servers, all that sort of stuff, just based on that catalog entry. And the key thing there is the type, compute. I know, because I have a type of compute, that this is Nova that I'm talking to, essentially. Then I'll go out to Nova and make this request/response. Notice that the endpoint here is /servers/detail. Don't get hung up on the /detail bit. That's a little bit of a bizarreness that goes back to the Rackspace API days. One of the things I don't like. But the bottom line is you've got the servers endpoint that you can go and grab server information from. And then in the payload that I get back from Rackspace is a servers data element. So here's a non-core API example: Cloud DNS. In this case, my type is rax:dns. Now, if my code doesn't deal with DNS-as-a-service, then I just ignore this catalog entry. If my code is designed to work with HP's DNS-as-a-service, I just ignore this entry. If, with Dasein Cloud for example, I need to work with both, I know that since this is rax:dns, the APIs I'm going to use to talk for DNS purposes are going to be the Rackspace DNS-as-a-service ones and not the HP ones. So again, request/response. I go against /domains, and I get a data element back there that's domains, for the list of domains that are in there. So here's where things start to get ugly: Nova volumes. And this stuff really irritated the heck out of me when the whole Nova volumes to Cinder migration was occurring. You'll notice here, first off, there's no catalog entry. That means that my code cannot know ahead of time whether or not I am dealing with Nova volumes. I can make the assumption that if I've got Cinder in the service catalog, then I probably don't need to worry about Nova volumes.
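The catalog-driven dispatch described above looks roughly like this in Python. The catalog structure mirrors what Keystone v2 hands back; `find_endpoint` is an illustrative name, not a real client-library function.

```python
# Walk the service catalog, pick out the endpoint for a service type we
# understand (e.g. "compute" for Nova, "rax:dns" for Rackspace Cloud
# DNS), and silently ignore any type we don't recognize.
def find_endpoint(service_catalog, wanted_type):
    for entry in service_catalog:
        if entry.get("type") == wanted_type:
            endpoints = entry.get("endpoints", [])
            if endpoints:
                return endpoints[0]["publicURL"]
    return None  # type not in this environment, or not understood
```

With the compute URL in hand, listing servers is just a GET against `<publicURL>/servers/detail` with the X-Auth-Token header attached; an unknown type like some vendor's whatever-as-a-service simply returns nothing and gets skipped.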
But other than that one assumption, if I don't see Cinder in the service catalog, I could either be dealing with an environment that supports Nova volumes, or I could be dealing with an environment that just doesn't have Nova volumes turned on, or it could be one that predates Nova volumes. I don't know which. And the endpoint here is os-volumes, not volumes. The response still has volumes, however, as the data element. Now, real quick, on to Cinder. Cinder is a core API, so it has its catalog entry. And by the way, the request is /volumes, not /os-volumes. So first and foremost, the problem here is that the APIs exposed the fact of how volumes were being treated under the covers, Nova volumes versus Cinder. You don't care about that as a developer. You should not care. The APIs between Nova volumes and Cinder are actually very close to identical. That's the even more maddening thing. All you have to do is essentially change os-volumes to volumes, and voilà, you've got code that's working against Cinder. But again, because of the way the implementation has been exposed, you have to go and discover that on your own. And if you were in an environment that upgraded from Nova volumes to Cinder, your code probably broke unless you did some bizarre pinging for invalid endpoints to determine what was going on. And so that's one problem. Things are much worse with Neutron. I've referred to it time and time again as Franken-Quantum. That's because with Cinder, either you have Cinder, you have Nova volumes, or you have nothing. With Neutron, you've got nothing, potentially. You've got Nova networks, potentially. You've got some nonsense that Rackspace put out there. And then you've got standard, straight-out Neutron. So to start off with, here's an example of the most innocuous code to deal with that in Dasein Cloud. This is where we determine what endpoint we're going against.
So if we're Quantum, we're going against /networks, the standard core API thing. If it's Rackspace, we're going against /os-networksv2. And if we're going against Nova networks, it's /os-networks. It turns out that to make this all work programmatically, without us having to do a lot of ahead-of-time configuration, there's a lot more ugly code that I decided not to expose here, code that actually navigates the service catalog, then starts hitting endpoints and guessing which one it might be, and eventually figures it out. So that's the bulk of the structure of the OpenStack APIs, the good and the bad along with it. Before I go into the API war on the EC2 versus OpenStack stuff, the important thing is that these aren't problems inherent in the API. It's just stuff inherent in the way we roll out changes and manage change within the OpenStack community that creates these problems. There is no need for these problems ever to arise, and so we can address them without having to re-architect an API. The same can't be said for the CloudStack APIs. To fix everything that's wrong with the CloudStack APIs, you have to go in and re-architect the entire set of CloudStack APIs. And with the vCloud APIs, you just want to blow them up. So let's start off with facts. The facts about the AWS APIs, and the most important one up here is that AWS, regardless of anything else you want to say about those APIs, has a huge ecosystem of code built around those APIs. The reality out there in the real world is that most people who are doing any type of cloud computing are in some way or another dealing with AWS, and are starting to build internal tools around the way they're using EC2, in addition to all the public open source and commercial tools that are built around it. As I mentioned earlier, AWS has never, to my knowledge, broken people's correctly written code.
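Going back to the volumes and networking mess for a second, the endpoint selection walked through earlier reduces to roughly this. It's a simplified sketch of the idea, not the actual Dasein Cloud code; the variant labels are my own.

```python
# The three networking variants from the talk map to three different
# resource paths; the variant labels here are mine, not OpenStack's.
NETWORK_RESOURCE = {
    "neutron": "/networks",          # standalone Quantum/Neutron
    "rackspace": "/os-networksv2",   # Rackspace's variant
    "nova-network": "/os-networks",  # networking via a Nova extension
}

# Cinder is a core service with its own catalog entry and uses /volumes;
# nova-volumes has no catalog entry and exposes /os-volumes instead.
def volume_resource(cinder_in_catalog):
    return "/volumes" if cinder_in_catalog else "/os-volumes"
```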
And actually, AWS is really smart about doing things that even deal with people's improperly written code. I'll get to an example down at the bottom there. Another interesting thing about the S3 APIs in particular is that just about every cloud storage solution out there has Amazon S3 API support. So if you've got a tool that's written against S3, you can talk to a number of cloud storage vendors, both for private cloud and public cloud storage. Some things that start to cause problems, though. First off, and this isn't necessarily a bad thing, the AWS API is very opinionated on how it models cloud. When it comes to somebody else who's trying to model cloud and solve cloud from a different perspective, though, that opinionated viewpoint becomes a problem. It's not a problem for S3, because cloud storage is cloud storage and that's that. It's not a problem for Eucalyptus, because Eucalyptus is trying to model cloud the way Amazon models cloud. It is a problem for OpenStack, because OpenStack is not trying to model cloud computing the way Amazon does, and so when you look at OpenStack from an EC2 perspective, you are limiting the perspective that people can get of what's in an OpenStack environment. It's also a god-awful API. It's not a RESTful API, and they don't even call it that. It's the EC2 query API, or if you're really up for punishment, you can use the SOAP APIs.
It's also complex to mimic in a transparent manner. Going back to the point I made earlier about not breaking code, a good example is that a lot of clients out there actually violate the EC2 documentation and present time formats in ways that aren't documented in the EC2 APIs, and Amazon actually honors those invalid date formats. Then somebody else comes along and implements an EC2-compatible API that is perfect according to the documentation, and these clients suddenly don't work against it, because it's expecting date timestamps in the way that Amazon documented them, but the ecosystem is using different ones. Also, Amazon does a terrible job with resource identifiers across the board, and that terrible way of doing it ends up leaking into people who try to build compatible APIs, especially ones that are designed to have globally unique identifiers, like OpenStack. The OpenStack APIs, on the other hand, are elegant and largely consistent with each other. Even aside from everything I've said, the OpenStack APIs really are very consistent. The OpenStack APIs have a decent level of ecosystem support, but as much as we like to talk about all the stuff we're doing against OpenStack in this community, it's dwarfed in comparison to what EC2 has out there. On the downside, as I mentioned earlier, OpenStack is constantly breaking existing code. But the API is controlled by the OpenStack community, which gives us the power to properly model OpenStack concepts. And here's where this becomes a real problem: a lot of people say, well, Amazon's done everything the right way, so why don't we follow that model? That starts to fall apart when it comes to networking. EC2 networking is horribly modeled, and that's reflected in the VPC APIs and the other APIs for interacting with networking. We do not want Neutron to look like the EC2 networking APIs. So the control we have lets us model networking better and provide programmatic access appropriate to the OpenStack model.
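To illustrate that date-format problem: a truly compatible implementation has to accept the timestamp variants clients actually send, not just the documented one. The variant list below is illustrative, not an exhaustive catalog of what Amazon tolerates.

```python
from datetime import datetime

# Timestamp variants a tolerant EC2-compatible endpoint might accept.
# Only the first is the documented ISO 8601 form; the others stand in
# for the undocumented variants real clients send.
TIMESTAMP_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%fZ",  # fractional seconds, UTC marker
    "%Y-%m-%dT%H:%M:%SZ",     # whole seconds, UTC marker
    "%Y-%m-%dT%H:%M:%S",      # no timezone marker at all
]

def parse_timestamp(value):
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized timestamp: %r" % value)
```

A by-the-book implementation accepts only the first format; clients sending the other two work fine against Amazon and then break against the strict clone.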
So from my perspective, we actually need to support both APIs, and that's a lot of work, especially because of all the nuances of Amazon supporting things that aren't documented in those APIs. But it means that we can take an OpenStack environment and put it into a company that is using AWS and needs to do a private cloud infrastructure, and their existing tools can start leveraging OpenStack. Their new tools may use the OpenStack APIs. Their existing tools may not be able to get access to some of the superior ways that OpenStack is doing networking, whatever, but they can leverage those existing tools to get started. And the biggest problem the OpenStack community has is that people have a hard time getting started. But the OpenStack APIs have to be there. We cannot just defer our modeling to Amazon, with Neutron being the best example. Anywhere we are innovating on how cloud should be abstracted, we need control over the end API model, and the OpenStack APIs give us that. So I've got about two minutes, 20 seconds for questions. Yeah. So anybody with that token now becomes able to do anything that that user is able to do within any OpenStack service. So if you were to put that communication channel over non-SSL, it would be easy to sniff that token and then start doing nefarious things. Whereas Amazon does request signing, so you never pass the secret across the wire. OpenStack requires HTTPS with trusted certificates. So yeah, the trusted certificates aren't there for the authentication process; they're to trust the channel that the authentication is occurring over. Well, I mean, that's why, so the question is, is this going to get better or worse with scale? If we don't address the problem, it'll get worse. Even for the clouds that we're interacting with at this time, the inability to move the same code from cloud to cloud to cloud without a bunch of configuration is problematic. If we had to support 200 arbitrary clouds, then it becomes a real problem.
And that's why we have to address this compatibility issue now. Okay, 30 seconds. Yeah. The question was: is there anything you can support with OpenStack that you can't support in Amazon? Well, I can't think of a good example of that, but you cannot describe the Neutron networking model using the EC2 APIs, at least not in a natural way. And similarly, you could not express Amazon's networking using the Neutron APIs. That would be my best example, but it's kind of weak. I'm out of time, but you can ask questions afterwards. Thanks.