Hello, everyone. My name is Yoram Warnweb, and with me is my colleague, Ran Ziv. We're both from a company called GigaSpaces, the creator of Cloudify, a TOSCA-based orchestrator for the cloud. Today we're going to talk about APIs, and specifically how to develop against the OpenStack API, drawing primarily on our experience as developers of an orchestrator that makes extensive use of OpenStack, has put it through quite a few challenges, and has taught us a few things we'd like to share.

In terms of the structure of the session: I'm going to start with a short introduction to the OpenStack API and how, from a user perspective, to interact with OpenStack, mainly programmatically but also in general. Then we'll look at how we use the APIs in Cloudify, our orchestrator, and how we represent the API functionality and OpenStack resources there. After that we'll go over some of the quirks and pitfalls we've experienced with the OpenStack APIs; we want to share that information and hopefully help you avoid some of them. Lastly, we'll talk about testing your application in an OpenStack environment and the challenges of different versions, different distributions, and so on.

When we talk about interacting with OpenStack, we're talking primarily about two things. One is sending commands and actions to OpenStack — operations that change the state of your OpenStack environment, for example booting a VM, creating a network, or deleting a router. The second is collecting information: getting a list of all the servers for a certain user, getting a list of networks, checking that the server you just booted really started and is in the active state, and so on.

If you look at how interaction with OpenStack is layered, at the core are the OpenStack RESTful API endpoints. Each service — Nova, Neutron, and so on — provides a RESTful endpoint that you can interact with. Above that you have the different SDKs, the client libraries for working with OpenStack. The primary ones, officially supported by OpenStack, are the Python client libraries, and in addition there are SDKs for many other languages: C, C++, Erlang, Go, Microsoft .NET, Java, and many more. What the SDKs really do is provide an object model and classes that give you an easy way to send the RESTful API calls programmatically and get the responses back as objects and programmable entities. Then we have Horizon — the web front end of OpenStack, for those who don't know it — which doesn't go through the client libraries but talks directly to the OpenStack REST APIs. And we have the OpenStack CLI tools: each client library has its own CLI tool, and recently there is also the unified OpenStack CLI, which gives you a single CLI for the most common services.

So when we look at the RESTful API endpoints, there are a few things to take into account and be aware of. First, there is versioning.
Each endpoint supports multiple versions: there is the version that is current for the release, but older versions will most likely be supported as well. That's very important when it comes to upgrading OpenStack versus upgrading the applications you've developed against it. Sometimes you want to upgrade OpenStack without worrying about how every application you've written will behave after the upgrade; by specifying the version you work with, the API is supposed to maintain the same contract even after OpenStack moves to a new release.

Next, there are the different interfaces: public, internal, and admin. If you're writing a user application that boots a VM or creates a network — things done from the user perspective — you're most likely going to work with the public interface of the endpoint. The internal interface is what the internal components of OpenStack themselves use; all interaction inside OpenStack goes either through the message queue or through the REST API endpoints. The admin interface exposes, in some cases, a superset of calls that serve the OpenStack admin rather than the normal user. The interfaces are sometimes separated because you want to give different access levels and protect them with a firewall: the public one may be available outside the firewall, while the admin and internal ones may sit behind it and only be exposed to internal users running within the firewall boundaries.

The last point here is about format. Currently the most common format — and going forward the only one — is JSON inside the HTTP calls. There used to be an XML format as well, but it is deprecated and in the newest releases it isn't even available.

Looking at some example endpoints, you'll see that every service you're familiar with has an API endpoint representation, and the slide lists the current version for each of them: the core services such as Identity (Keystone), Compute (Nova), the Image service (Glance), and so on. There's also a partial list — not everything — to give you an idea of some of the additional projects and services, each of which has its own API endpoint. By the way, to get all the API endpoints you go to Keystone: when you authenticate with Keystone, you get back the catalog of all the available endpoints and can start from there, without needing intimate knowledge of which port each endpoint listens on.

When we talk about developing for the API, there's also debugging: how to check how a certain endpoint behaves and try out the operation you want to incorporate into your application — first from the CLI, later in sample code, and then maybe in your application — and how to debug it.
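Before getting into debugging, here is roughly what that Keystone-based endpoint discovery looks like in code — a minimal sketch using the keystoneauth1 library; the URL and credentials are placeholders, not values from the demo:

    from keystoneauth1 import identity, session

    # Authenticate against Keystone; the token response carries the service
    # catalog, so per-service endpoints can be discovered rather than hard-coded.
    auth = identity.Password(auth_url='http://controller:5000/v3',
                             username='demo', password='secret',
                             project_name='demo',
                             user_domain_id='default',
                             project_domain_id='default')
    sess = session.Session(auth=auth)

    print(sess.get_endpoint(service_type='compute', interface='public'))
    print(sess.get_endpoint(service_type='network', interface='public'))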
So, for debugging: the CLI gives you the --debug flag, which produces a lot of output about what the CLI did and which REST API calls it actually triggered in order to perform, for example, an 'openstack server list', a 'nova boot', or whatever other CLI command you run. With the Python SDK, once you write your code in Python, you can add logging so that you see exactly what each client library is doing — the same thing you saw from the CLI — and you can compare how the behavior looked on the CLI with how it looked in your application, find differences, and figure out why things went wrong if they did. Then there are the VM logs: just because a VM booted doesn't mean it is accessible and that you can interact with it. By accessing the VM's console log, either through Horizon, the CLI, or the Python SDK, you can see how the boot process went and find out whether something failed along the way. In addition, sometimes the information coming back from the RESTful API calls doesn't give you enough data to understand what really went wrong. In those cases you'll also want to look at the back end of OpenStack to see how the OpenStack services behaved and responded to your API call, which means getting access to the OpenStack service logs. If you're a regular user, you often won't have access to those logs on your production, or even test, environment, and therefore I also recommend getting familiar with DevStack. It's a small, development-sized OpenStack with a similar API, and you can see how it responds to your application's calls.

So let's demo a little of what I just talked about. This is the OpenStack command line — the new unified OpenStack CLI tool — and I've added --debug to a 'server list'. I'm running locally inside a Vagrant box running Linux, and inside that Linux there's a DevStack. All the OpenStack credentials are in environment variables, so they're not explicitly stated. When I hit the command — let's scroll up — you can see there's quite a bit of information coming from the command line. You can see things like which version of the API was contacted, and which parameters were used. But maybe the most important piece is this one, and in some cases you'll see several of these: these are the actual RESTful API calls made by the Python client library that the CLI uses underneath, translated into curl commands you can copy and paste. If I just remove the -i, which would include the response headers, and run it like this, I've triggered the same call the CLI made: it did a GET, and you can see the response coming back from OpenStack. You can also pretty-print it — one second — okay, so here is the JSON that came back from the API endpoint, formatted a bit more nicely.

We can see the same thing in the Python code. In the Python code here, I'm basically starting a Keystone client, authenticating, and providing the token to the Nova client.
Then, before actually triggering the server-list command down here, I add the logging that makes sure I get the same information we saw from the CLI: that's logging at the DEBUG level, plus http_log_debug set to true, which makes sure all the RESTful API calls and their responses get logged. So if I run it — okay — you see the same information the CLI showed. The last thing I want to show from the command line is that I can also check on a particular instance I have, called test: what I'm triggering here is a console-log-show command, which shows me all the output from the VM's boot. I can see there, for example, if it did not get an IP from DHCP, or anything else that might have caused the VM not to be accessible. You can also enable the debugging with an environment variable. The command line used in the demo was the OpenStack CLI; this slide also shows the nova client, whose functionality is now included in the openstack command, and the console-log command that gets you the log we've just seen. The last thing I wanted to show here is DevStack: when you set up DevStack — you just git clone DevStack and configure the localrc file — if you want to debug it, add this logging configuration alongside the credentials, and it will log all the services' output to log files. And since we're going to show some examples with Cloudify, I'll hand over to Ran.

Right, so as was just mentioned, we're going to run through the way we've exposed the OpenStack APIs in Cloudify, which means I first have to spend one slide on Cloudify itself. So what is Cloudify? It's an open source, pure-play orchestrator. It lets you deploy, orchestrate, and manage TOSCA-based applications. If you're not familiar with TOSCA, it's an open standard that lets you describe your application topology — its components, their lifecycle events, and their relationships. Beyond the topology, it also lets you describe workflows that help you manage your application's day-one and day-two operations: healing, scaling, monitoring, things like that. You write it in YAML, or if you want you can use Cloudify Composer, which is a nifty UI where you can drag and drop components, connect them to one another, and have the TOSCA blueprint auto-generated for you. Beyond that, Cloudify has a very modular architecture, and that's going to be relevant later on, because the way we support OpenStack in Cloudify is through an OpenStack plugin — you can write a plugin for pretty much any environment or tool, and you get quite a few of them out of the box as well.

So Cloudify supports OpenStack by means of the OpenStack plugin, and the plugin was designed with a few things in mind. We want the plugin to support any use case and not impose restrictions on the user, whatever that use case is. While doing so, it should also be simple, offering ease of use and syntactic sugaring where possible, and it has to be robust in dealing with cloud errors, because those are very common.
Beyond that, the last thing we kept in mind was to abstract as much as possible, so that things which are common to other cloud environments as well can be expressed in a common way. The plugin exposes resources through types, and for each type the more prominent parameters — the ones that are almost always used — are exposed explicitly by the plugin, while the rest are still available, but only through direct override. In addition, you can also configure the OpenStack clients that are used to make these calls against OpenStack.

Let's run through a really short example. Here you can see how you would define a subnet. CIDR, for example, is a parameter required by OpenStack, and because of that Cloudify also exposes it directly here, so you have to input it. On the other hand, what you don't see here is the IP version, which is also required by OpenStack, but Cloudify simply provides a sensible default of IP version 4, so you don't have to input it. DNS name servers, for example, are not required by OpenStack, but they're very often configured — at least that's what we found — so we expose them as well. Enable DHCP, on the other hand, is less commonly set, so we didn't expose it explicitly, but if you want to override it, this is an example of how you would do that. And this is another example, for a server. All I wanted to show here is that the image and flavor are passed by name, whereas in the OpenStack API you're supposed to use an ID; one of the things an orchestrator can do is this kind of syntactic sugaring, translating the name to an ID for you.
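This isn't the plugin's code, just a rough sketch of what that name-to-ID sugaring amounts to with the Python Nova client; the "nova" client object and all resource names below are made up:

    # Assumes an already-authenticated novaclient instance called "nova".
    flavor = nova.flavors.find(name='m1.small')      # look the flavor up by name
    image = nova.images.find(name='ubuntu-14.04')    # image lookup on older novaclients;
                                                     # newer ones use nova.glance.find_image()
    server = nova.servers.create(name='web-1',
                                 image=image.id,
                                 flavor=flavor.id)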
Okay, some considerations you want to keep in mind when writing an orchestrator. For example, some server operations take a while to finish. We're talking both about the time it takes a VM to boot and cloud-init to start up, and then, if you've configured your server with a password, the time it takes to retrieve it from the metadata service and for the SSH daemon to start up, if you have operations to run against the machine. In general an orchestrator should have mechanisms to support asynchronous operations, and in the OpenStack plugin we have some of them. The same is true for volumes: when you create, attach, or detach them, you again need some sort of handling for asynchronous operations. Beyond that, volumes often have a specific set of operations that are commonly needed to make them usable — formatting, creating a file system, mounting — and in this case too the plugin takes care of these sorts of things out of the box.

Something else you might want to consider when exposing the OpenStack API in an orchestrator: with security groups, you always have a default security group in OpenStack, and the default security group might not be what the user wants on a given VM, but it always gets attached unless explicitly stated otherwise. So if a user has other security groups they want connected to the VM, the orchestrator must be aware of this at VM creation time; it can't create the VM and only later connect the security groups, because then the VM will also carry the default security group, and that might not be acceptable — it may have rules the user isn't interested in. One other thing to keep in mind is that when you create a new security group, it gets permissive egress rules by default, which let out all IPv4 and IPv6 traffic. That's OpenStack's default, but it may not be what every user expects, so what we did is expose a parameter whose mere presence, we think, draws the user's attention to the fact that these default rules exist — but we deferred to OpenStack's better judgment on that one and set the parameter's default to false.

The last thing I want to show here: we mentioned before that you can also configure the parameters for the OpenStack clients that are used to make the calls against OpenStack, and this is an example of how you would do that. Again, it's just another way of making sure we don't place any restrictions on the user, whatever the use case is.

Next we're going to go over some quirks and pitfalls in the OpenStack API. Sometimes not all of the APIs are as intuitive as you would like them to be, whether due to historic or legacy reasons or just bugs, whether already acknowledged as such or not, and an orchestrator can definitely help with handling this sort of thing. So we're going to run through a bunch of them.

The first one I want to talk about: a network in OpenStack can have more than a single subnet, but when you create a server or a port you only have to define which network it's going to sit on, not the subnet — the subnet is going to be chosen arbitrarily. So if you want to place a server on a specific subnet, the way we do it is to
create a port. When you create a port, beyond passing the network ID you can also define which subnet it's going to sit on, by using the fixed IPs parameter — which might not be all that intuitive. Once you have that, you can connect the server to the port, and then you have a server on a specific subnet.

Next, key pairs. They're the only resource that's managed on a per-user basis rather than a per-tenant one, and that can lead to funny behavior sometimes. For example, with a Heat stack: if a stack on one tenant was created by one user, another user might not be able to delete it, because if the stack also includes a key pair that belongs to the first user, the second user won't have access to it and the delete will fail. In general it just breaks isolation, because an action on one tenant can affect another. That's just something to keep in mind.

Next, floating IPs. In OpenStack they must first be allocated and then attached to a server's port. However, there isn't actually any validation, when you do that, that the IP isn't already attached somewhere else. So if you create a server, then create another one, attach the IP to the first, and then try to attach it to the second, the first one just gets disconnected. This also brings up race-condition scenarios an orchestrator should keep in mind: if you check that an IP isn't attached and then try to attach it, it doesn't mean it's necessarily going to work, because it may have been attached to something else between the time you checked and the time you attached it.

Onward: the Nova API for adding a security group to a server is not thread safe. If you try to connect two security groups at the same time to the same server, sometimes only one of them gets connected, and not necessarily both. The way we solved it in our orchestrator is to verify, after adding a security group, that it really got connected, and otherwise retry. The Neutron API for adding a security group to a port is thread safe, but it still isn't concurrency friendly, because as opposed to the server API for adding a security group, here you don't only mention the new security group you want to add — you have to declare the full list of security groups. So if another client connected a new security group to the VM, you also have to be aware of it when you set your own.

Another one is ICMP rules in security groups. Security group rules are usually defined over a port or a range of ports, but ICMP rules don't have this sort of association to a port; instead they have types and codes. We wanted to create a simple rule to allow ping, with its ICMP type and code, and there's no dedicated API for that. Apparently what you're supposed to do is use the port range min and port range max parameters of a Neutron security group rule, which get translated into the ICMP type and code. So this is an example of how you would create a rule that allows ping from anywhere.
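This isn't the exact slide, but a sketch of what such a rule looks like through the Python Neutron client, assuming an authenticated client called "neutron" and an existing security group ID; echo request — an incoming ping — is ICMP type 8, code 0:

    neutron.create_security_group_rule({'security_group_rule': {
        'security_group_id': sg_id,        # placeholder for an existing group's ID
        'direction': 'ingress',
        'ethertype': 'IPv4',
        'protocol': 'icmp',
        'port_range_min': 8,               # reused as the ICMP type (8 = echo request)
        'port_range_max': 0,               # reused as the ICMP code
        'remote_ip_prefix': '0.0.0.0/0',   # allow ping from anywhere
    }})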
The next one is about adding a security group to a server. Back in the day you had nova-network, and while you can have many security groups with the same name, if you want to add a nova-network security group to a server you can only do it by name, not by ID. So if you have multiple security groups with the same name, the request is ambiguous and the API call fails. That's a problem in its own right, and also a problem for an orchestrator, because the code for the server that tries to attach a security group suddenly has to be aware of whether this is a Neutron security group, in which case it can do it by ID, or a nova-network security group.

The next one is about ports. This bug has actually been fixed since the Kilo release, but it's relevant if anyone is still on Kilo or earlier: if you created a port explicitly, attached it to a VM, and then took the VM down and terminated it, the port would get deleted too. From an orchestrator's point of view that's problematic, because it breaks the abstraction — one resource's lifecycle affects another — so the orchestrator had to be aware of it. Since the fix, only a port that was created implicitly, by creating a server and connecting it directly to a network, gets deleted when the server is deleted.

The next one is about Keystone roles. They are assigned per tenant, but when it comes to the admin role, if you set any user as an admin on any tenant, that user immediately becomes an admin on all tenants. There's no such thing as an admin of a specific project — you're an admin across all projects. That's problematic security-wise, and there's also the fact that, as we noticed, if you list resources using an admin account on one tenant, you can sometimes see resources from another tenant. The way we ran into both of these issues is that we had a test where we wanted the test user to be an admin on one project, and suddenly, at the cleanup stage, it listed resources across other tenants as well — and deleted them too. So that's just something to be aware of.
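A minimal sketch of the kind of defensive check this pushes you toward in cleanup code — not our actual test code; the "neutron" client and the tenant ID are placeholders:

    own_tenant_id = 'TENANT-THE-TESTS-RUN-UNDER'
    for net in neutron.list_networks()['networks']:
        # With an admin-role user, the listing may include other tenants' networks,
        # so only ever delete resources that belong to the test's own tenant.
        if net.get('tenant_id') == own_tenant_id:
            neutron.delete_network(net['id'])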
Okay, the next section is about how we test, both our OpenStack plugin specifically and against the OpenStack APIs in general.

We separate our tests into three categories. There are unit tests; for those you can use the standard mock library if you're using Python, or custom mocks, and there's also the Mimic project, for those of you who are familiar with it — a Rackspace project that mimics some of the OpenStack services, so you get really good mocks out of the box. Beyond that you've got integration tests, which test the plugin's operations against a real OpenStack deployment, but each one only checks a specific operation — for example creating a server, or creating a security group. And then you have system tests, which check the plugin end to end — creating a server, creating a security group, and connecting them together — and we also use the OpenStack plugin to run most of Cloudify's own end-to-end tests on OpenStack environments.

Here's a code sample, for creating a volume. What I wanted to show is that in this case you have a bunch of Cloudify code both before and after the create-volume call made through the Cinder client; those sections are probably something you're going to cover with normal unit tests. You can also see that we actually inject the Cinder client using a decorator up top, so it's really easy to mock it and make sure the unit test never communicates with any OpenStack deployment (there's a rough sketch of this pattern a bit further down). Then you'd have an integration test that checks this specific method, including the call to the Cinder client, and then a system test that exercises it together with the creation and orchestration of other OpenStack resources.

Okay, regarding our test environments: both our integration tests and our system tests run in parallel over multiple OpenStack tenants. The reasons are, first of all, that it gives you pretty good isolation, and it's really easy to clean up. There's a tool called ospurge — a really nice one — that makes it easy to delete an entire environment, a whole OpenStack tenant. For those of you who haven't tried it: OpenStack actually imposes really tough restrictions on the order in which you can delete things; sometimes it won't let you remove an interface from a router because there's a floating IP somewhere that needs the connection through that router, all kinds of weird things like that, and ospurge can help you with that. In our case the tenants are pre-existing. That lets you set up an environment where multiple tests run serially, take a resource snapshot before and after each test, and clean up only the delta — the new resources created by that test. And if you wanted the tests to create and delete their tenants automatically, they would still require cleanup anyway, so there isn't much benefit to that: you can't actually delete a tenant without first cleaning up its resources.

Moreover, on testing environments, we actually took our tenants and spread them across multiple OpenStack deployments. The major reason is simply that clouds sometimes have problems — VMs starting in an error state, connectivity issues, maintenance, the list goes on — so we just don't want to rely on a single environment.
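Going back to the client-injection idea from the unit-test discussion above, here is a rough sketch of that pattern — not the actual Cloudify code, and all names are made up:

    import functools
    from unittest import mock

    def build_cinder_client():
        # Placeholder: a real implementation would return an authenticated Cinder client.
        raise NotImplementedError

    def with_cinder_client(func):
        # Hypothetical decorator that injects a Cinder client into an operation.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if 'cinder_client' not in kwargs:
                kwargs['cinder_client'] = build_cinder_client()
            return func(*args, **kwargs)
        return wrapper

    @with_cinder_client
    def create_volume(size, cinder_client=None):
        return cinder_client.volumes.create(size=size)

    def test_create_volume():
        # The injected client is replaced by a mock, so the test never talks
        # to a real OpenStack deployment.
        fake = mock.Mock()
        create_volume(1, cinder_client=fake)
        fake.volumes.create.assert_called_once_with(size=1)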
Beyond that, spreading the tenants across deployments is also a really quick and easy way to test against multiple OpenStack versions and distributions.

One more thing about cleaning up your environment when running tests the way I described: key pairs are an exception when cleaning up resources because, again, they're per user and not per tenant. The way we handled it is that each test cleans up its own key pairs, as opposed to having the infrastructure clean them up along with all the other resources. But if a test was stopped abruptly, there might still be leftover key pairs for that user on OpenStack, so we also have an independent process that runs when the tests are inactive and cleans up those leftovers.

Okay, for the last couple of slides I'm going to give it back to Yoram. Thanks, Yoram?

So what I wanted to show here, very quickly because we're getting close to the end of the session, is how to handle the different version and distribution tests. For versions there's DevStack, which I think is a good tool; it doesn't replace the need for a real OpenStack environment, but it's very easy to do developer testing on DevStack and very easy to change versions on it. Another tool I recommend looking at, for changing distributions and even changing versions and distributions together, is Ravello. Ravello basically lets you run virtualization on top of virtualization, with blueprints that bring up an OpenStack environment in a matter of a couple of minutes. You can choose which hypervisors it virtualizes — KVM or others, if your OpenStack deployment runs on different hypervisors — and you can have different versions and distributions in different blueprints and use that for easy testing. We also use it a lot after the development process: if you want to check against different customer environments and try to mimic a customer environment, this is a good option for sandboxing OpenStack in a way that gives you full control and full access and is still very easy to start and stop quickly.

So that's the end of what we wanted to deliver. Now we'll open the floor for questions. Any questions? Okay, so thank you very much.