Okay, good morning everyone. Actually, good afternoon, sorry; I'm a little bit jet-lagged. Thanks a lot for making it to this session. I know it's at the very end of the very last day, and if you're like me you're probably getting tired as well, so thanks a lot for making it. I hope everybody had a good time at the OpenStack Summit.

Right now I'd like to talk about what it takes to make an application cloud-aware, and use some of the experience we've had building applications for our customers and for ourselves to illustrate what that means, and how we can leverage different tools to build such cloud-aware applications.

First, a little bit about myself. My name is Sebastian. I'm the founder of the Scalr open source project, which is a cloud management platform, a CMP, built specifically for the enterprise, so it can create and enforce the controls IT needs to be happy while still enabling DevOps to be productive. My colleague Thomas is here with me; he built the CloudBench library that we'll be talking about in just a moment.

All right. I'd like to start out with a story, the story of CloudBench, which is an open source project, and use the process of how we built this application to illustrate the problems we ran into. (We've got a blinking screen right there.)

This whole thing started with a few benchmarks back in, I think, February of this year, 2013. We had some customers asking us whether Google Compute Engine was actually much faster than Amazon, or whether they were on par, or what cloud they should choose based on pure performance criteria, and we didn't really know what to answer. So, as a good scientist would do, we went out and started benchmarking things. We ran a series of fairly short benchmarks to compare instance
instantiation times, the speed of the storage systems like EBS or Google Compute Engine's persistent disks, and so on. We ran all those benchmarks and published a paper, submitted to GigaOm, that got many, many hits and tons of comments. The overall conclusion, if we can read it while the screen's not blinking, was that Google Compute Engine isn't just fast, it's Google fast. Of course Google loved that quote and has been using it in a lot of their sales materials. But if there was one party that wasn't very happy about this, it was Amazon, as you can imagine: in many of the tests we saw a five-to-one ratio of improvement for Google over Amazon.

Every time you publish a benchmark, and I'm sure you've seen this if you've ever done benchmarks in your life, there's always some guy out there who's going to criticize your methodology: "Hey, you didn't run enough tests." "I can't reproduce these results." "Why aren't the actual numbers open? Why are you just sharing a summary?" So we had a fair amount of criticism, and a lot of people were encouraged to do their own benchmarks. But again, Amazon was by far the most upset by these results.

So we set out to professionalize our benchmarks and to build a benchmarking tool that would let you test the performance of your cloud, and let the community run those benchmarks themselves, to independently assess, based on their workload, which cloud they should choose for performance reasons alone. We built a disk benchmarking library called CloudBench that does a couple of things, and we'll get into that in just a moment, but mostly there's a client and there's a server, and in the process it provisions all the resources you need, runs a lot of tests, and reports them all to a centralized service. This was open-sourced a couple of months ago. Is that correct?
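As an aside, to make that client/server split concrete: here is a hedged sketch of the kind of result record a benchmarking client might assemble and send to a central reporting server. All of the field names and the structure are invented for illustration; this is not CloudBench's actual wire format.

```python
import json

def build_report(platform, instance_type, disks, results):
    """Assemble an illustrative result payload for a reporting server.

    The field names are hypothetical; the point is that each record
    carries enough context (platform, instance type, attached disks)
    for results from different clouds to be compared centrally later.
    """
    return json.dumps({
        "platform": platform,            # e.g. "ec2", "gce", "openstack"
        "instance_type": instance_type,  # discovered via the metadata API
        "disk_count": len(disks),
        "disks": disks,
        "results": results,              # benchmark name -> measurement
    })

# A client on a (hypothetical) GCE instance with two persistent disks:
report = build_report(
    "gce", "n1-standard-1",
    disks=["/dev/sdb", "/dev/sdc"],
    results={"seq_write_mb_s": 120.0},
)
```

The server side would then just aggregate these records per platform, which is what makes cross-cloud comparison possible at all.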
And you can find it at github.com/scalr/cloudbench.

All right, so let me talk a little bit about what we understand as a cloud-aware application. A cloud-aware application is an application that integrates with the cloud platform it's running on, and in our case we were using that for configuration purposes. What that means is that when we were deploying a benchmark in a particular environment, say on Amazon or Google, it would have to automatically discover where it was running, and based on that, provision a certain number of volumes, a certain amount of capacity, and all the parameters you're running your benchmark and your tests with. So when it initializes, it needs to identify the platform it's running on: location, instance type, a whole bunch of things like that. It then uses all that data to report aggregates and send them to a reporting platform.

But there are actually a lot of different use cases for cloud-aware applications, not just that one: any autonomous bootstrapping process, an application you want to be portable, an application you want to be able to move from one cloud to another, cloud bursting, or hybrid clouds. Having applications that are aware of where they're being deployed will help you achieve that portability. Some examples are when your application needs to hit the user data APIs or the metadata APIs to get its configuration data, and in some cases you may need to hit the actual cloud APIs for data that's not available through the metadata.

Now, this can be really, really challenging. If you're running on a single cloud, say you're just benchmarking OpenStack, you can probably do it with relative ease, compared to trying to achieve a true benchmark across
multiple different clouds. And if you think about it, OpenStack, CloudStack, EC2, Google Compute Engine, even Rackspace, which has an API that's slightly different from OpenStack's, each of them has different, often inconsistent, responses, and that makes it very hard to build such a system.

So we identified two ways of building these cloud-aware systems. Obviously there's the cloud-native way, and then there's another way that abstracts away some of those differences. But let's talk about the native way first.

The native way is to write different code for every cloud you're deploying on. It's a bit like mobile development: you're going to build an iOS application and an Android application, you're going to have very little in common between the two platforms, so you're going to have to write a lot of code and maintain that code separately, and that's a fair amount of work. Once your application initializes, that code needs to identify where it's running, so it needs to hit the metadata server, and then it needs to hit your cloud's APIs to get everything else. You also need to work out which block of code to execute to begin with: if you're deploying on EC2 and your application uses the wrong block of code, making all its calls thinking it's on OpenStack, you're going to be in a world of pain.

So that's what CloudBench does. Thomas, do you want to elaborate a little on the native way? Yeah, so the general idea is that we want the application to hit the APIs and figure out which cloud it's running on. It's going to try the metadata APIs of multiple clouds and conclude from there: "Oh well, if I'm able to talk to the OpenStack API, I'm probably running on OpenStack." At that point it's going to call the actual OpenStack API to figure out what kind of
volumes are attached to me, since it's a disk benchmarking library. So the library is going to ask the OpenStack API: "Hey, can you tell me what volumes are attached to me? Where are they attached? What's their size?" It's going to run the benchmarks on those, and then it will report the results to the reporting server: "Well, I was running with two disks, and these are my results."

So this is what it looks like. At a high level the logic is the same across all the clouds, but the implementation is actually very different. Thomas? Yeah, so that's the general logic. Again, it's just a snippet; the goal isn't for you to work off of this. But the idea is that we've got the same logic everywhere: I'm going to ask my cloud what the attachments are, and I'm going to ask it if my volumes are persistent. Of course, the format of the attachments is going to depend on the cloud itself. Then we follow the same logic and check every volume, check if that volume is acceptable. That's basically just about checking that the volume is actually attached, that we didn't get an error, and also, for safety reasons, that we weren't told "please don't benchmark this one, because we know it's the local volume"; there's no way we're going to benchmark that one. Then it actually runs the benchmark, checks the file names, and moves on.

So this logic will be the same, but the actual implementation is going to be different, because if you look at that cloud attachments method, it looks simple, but actually it's not: each cloud is going to have its own logic. This one is Rackspace, and Rackspace will use Rackspace's library to connect. But of course, what if you're going to connect to GCE?
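To give a flavor of what that looks like, here is a minimal, hypothetical Python sketch of the pattern: probe metadata endpoints to identify the platform, then keep the benchmarking logic cloud-agnostic on top of a per-cloud adapter. Every name, class, and path below is an assumption made for illustration; none of it is CloudBench's actual API.

```python
# Step 1: figure out which cloud we're on by probing metadata endpoints.
# Probe the more distinctive paths first and the EC2-style path last:
# EC2's metadata format is widely cloned, so an EC2-compatible cloud
# would otherwise misidentify itself as EC2. (Real GCE probing also
# requires a "Metadata-Flavor: Google" header, omitted here for brevity.)
PROBES = [
    ("gce", "/computeMetadata/v1/instance/id"),
    ("openstack", "/openstack/latest/meta_data.json"),
    ("ec2", "/latest/meta-data/instance-id"),
]

def detect_platform(probe):
    """Return the first platform whose probe answers, else 'unknown'.

    `probe(path) -> bool` is injected so the logic can run off-cloud.
    """
    for platform, path in PROBES:
        if probe(path):
            return platform
    return "unknown"

# Step 2: shared benchmarking logic over a per-cloud adapter.
class Cloud:
    """Per-cloud adapter: only the discovery calls differ."""
    def attachments(self):           # each subclass calls its own cloud API
        raise NotImplementedError
    def is_persistent(self, volume):
        raise NotImplementedError

def volumes_to_benchmark(cloud, excluded=()):
    """Cloud-agnostic: keep attached, persistent, non-excluded volumes."""
    usable = []
    for vol in cloud.attachments():
        if vol["device"] in excluded:
            continue                 # e.g. the local/root volume
        if not cloud.is_persistent(vol):
            continue                 # skip ephemeral disks
        usable.append(vol)
    return usable

class FakeOpenStack(Cloud):
    """Stand-in for an adapter that would call the real OpenStack API."""
    def attachments(self):
        return [{"device": "/dev/vda", "size_gb": 20},
                {"device": "/dev/vdb", "size_gb": 100}]
    def is_persistent(self, vol):
        return vol["device"] != "/dev/vda"
```

The point of the sketch is that `volumes_to_benchmark` never changes, while every cloud needs its own `Cloud` subclass, which is exactly the maintenance burden we're describing.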
Well, then we use GCE's library to connect. And that means that eventually we need all these blocks of code, one per cloud we want to support. The problem is that figuring out which platform you're on isn't actually that easy. What we did, and maybe there's another way, but this is the one we chose, is simply try to hit the various metadata servers and see if they respond.

Now, the first problem we had is that EC2 came first, so there are lots of compatibility APIs. If you're on OpenStack and you hit the metadata API, it might actually be EC2-compatible; same on CloudStack. An example of that: say you're running on Cloudscaling, which prides itself on having fully compatible EC2 APIs. That means if you take your cloud-aware application and put it on Cloudscaling's distribution of OpenStack, you might ask your environment whether you're on EC2, and it will answer that you are, but you're not. And that's going to bring all sorts of issues, because in your benchmark this record will show up as EC2 when you weren't on EC2, which is not what you want. Worse, when you then call the APIs, the calling conventions won't be correct, and you'll crash the whole thing.

And then there are problems worse than this: sometimes there's no metadata API at all. If you're on Rackspace, you have to go through that Xen agent: you can't hit an actual metadata API itself, you have to hit the local agent, which gives you the same answers. That means you have yet another branch of code to handle that specific corner case. So it ends up looking like this: again, the logic looks simple, but there's that piece you have to implement for every single cloud you intend to support.

So, to summarize: when building a cloud-aware application, building it the native
way has some advantages and some disadvantages. The pros here are obvious: you have a lot more control, you can do exactly what you want, and no external tooling is required. Actually, there's one last thing we forgot to mention: if you're going to build such an application, this is probably the first approach that comes to mind. You'll think, "oh well, I'll just implement one thing for every cloud," and it looks very easy; you only find out later that it's not. Yeah, kind of like when you're building an application and your requirements change, and then you need to rearchitect things.

And there are a lot of disadvantages to building cloud-aware applications against the native APIs. Like we just talked about, it's actually not easy at all to figure out what platform you're running on, because of those EC2 compatibility layers. That leads to a lot of complexity, a lot of testing, and a lot of extra code that you have to write, maintain, and troubleshoot.

Yeah, so as Sebastian was saying, that means more code. And at that point what we thought was: well, we might as well use one of the multi-cloud libraries, libcloud for example. We'd just use that, it would abstract away all these clouds, we'd call the first API we find, and it would just work. Now, there are a few problems with that. The first one is that even if you use one library that does everything, you still need to figure out which cloud you're running on, if only to find the actual hostname for the API, so that problem isn't solved. The other problem, which isn't solved either, is that these libraries are usually restricted to the functionality shared across clouds, and there are cases where we want to benchmark things that are very specific. For example, take Rackspace: we run our benchmarks on a specific
set of volumes, and we want to compare that with, say, EC2. But then EC2 people will come to us and say, "hey, you didn't benchmark EBS-optimized instances, so you must be hiding something." That means we have to be able to tell, from within an instance, whether it's EBS-optimized, and these multi-cloud libraries may not always expose all the pieces of information we need. Which is why we ended up having one implementation per cloud rather than one built on a shared library.

And finally, there's the obvious problem of having to manage the credentials you hold across all the different clouds. Of course, we did this in a sandboxed test environment, so it was pretty easy for us to include the keys and all the credentials. But if you're going to build production-ready applications that are portable across multiple clouds, then key management becomes quite an issue. That's something we didn't have to touch with CloudBench, but for production workloads it's something you very much have to care about and manage: you don't want to be packaging every single server with every single key you have.

So the conclusion of doing this the native way is that it was just not very scalable from a developer's perspective, in terms of the amount of time we were spending. Every additional cloud we added took more time than the previous one, and it was generally very painful. It's not scalable in the number of clouds and platforms you run on, and it's not scalable in the number of applications you have. We only wrote CloudBench, but if we were to build lots of hybrid-cloud or cloud-bursting applications, we'd have to re-implement this every single time, and there would be a lot of wasted time and effort. So we started thinking: what if we could delegate the responsibility of building that multi-cloud abstraction to another
layer? Yes, so our objective there was this: we figured there are still some bits of logic, remember, where one common library isn't going to cut it, because you need to identify very specific things. But if you look at the entire thing CloudBench does, it starts by provisioning those volumes, and it also produces the reports. What we figured we could do is take at least one part of this, the volume provisioning, and delegate it to another code base, which is what we did: we delegated it to another app. The reason being, well, we're the company working on that app, so we were interested in using it.

So when leveraging our own tooling to build our own multi-cloud abstraction, it became much easier to come up with the primitives to make this happen, and this is what it looked like in the interface. What you have here is a certain amount of storage that you've defined at design time, prior to run time. What you can see here on the left, or probably can't see, is that we decided that this particular instance, regardless of what cloud it's running on, is going to have two volumes with the ext3 file system and a hundred gigabytes of persistent disk each, and they're both going to be attached to the device automatically.

By the way, this is actually Scalr; it's built by the three good folks right there in the back. The idea, and what they actually built, is this app that we can use to take those common pieces of functionality and abstract them, so that by the time our server starts running, by the time our app starts running, the volumes are already there. We don't have to take care of "how do I actually attach those volumes?"
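To make the "volumes are already there when the app starts" idea concrete, here is a hedged sketch: a design-time spec, in a format invented for this example (not Scalr's actual format), mirroring the two 100 GB ext3 volumes just described. At run time the application simply reads this instead of calling any cloud API.

```python
import json

# Hypothetical design-time spec, of the kind a management layer would act
# on before the instance boots: two 100 GB ext3 persistent volumes,
# attached and mounted automatically. The schema is illustrative only.
VOLUME_SPEC = json.loads("""
{
  "volumes": [
    {"mount": "/bench/a", "fs": "ext3", "size_gb": 100},
    {"mount": "/bench/b", "fs": "ext3", "size_gb": 100}
  ]
}
""")

def mounts_to_benchmark(spec):
    """At run time the app provisions nothing: it reads the spec and
    trusts that the layer already attached and mounted the disks."""
    return [v["mount"] for v in spec["volumes"]]
```

The design choice is that the spec is identical on every cloud; only the layer that fulfills it knows about EBS versus persistent disks versus Cinder.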
And what they also did is keep the option that, if you're running on a specific cloud, you still get cloud-specific extras. In this case, for EBS, we could get a Provisioned IOPS volume, even though there's no equivalent on another platform. Of course, as we'll see later, this isn't the ultimate solution for all cases, and we'll conclude with when you should use each approach. But before we do that, let's look at the pros of doing it this way.

Yep, so when you're doing this with a multi-cloud abstraction layer, there's a lot more code you can reuse across the different applications you're deploying. Remember when we said that building it the native way wasn't scalable in the number of clouds and the number of applications? Having this layer between the cloud and your application lets you reuse a lot more code and just be more productive overall. You also have a single code base to manage all those integrations and all those deployments, which reduces the amount of effort and the number of errors. And finally, when you have to write tests for all this, as a corollary of having a lot fewer cases, fewer if statements and parameters, you also have to write a lot less tests.
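That test-surface argument can be put in rough numbers. The counts below are made up purely for illustration: testing every app natively against every cloud multiplies, while testing each app against one abstraction layer, and the layer against each cloud, merely adds.

```python
# Made-up counts, just to illustrate the scaling argument:
apps, clouds = 5, 4

# Native: every app carries its own per-cloud code, so each app must be
# tested against each cloud.
native_test_surfaces = apps * clouds      # 5 * 4 = 20 combinations

# With an abstraction layer: each app is tested once against the layer's
# API, and the layer is tested once against each cloud.
layered_test_surfaces = apps + clouds     # 5 + 4 = 9 integrations
```

Adding a sixth app costs four more test surfaces the native way, but only one with the layer.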
Yeah, and if I can touch on this: we actually both had to write tests, because we did it two ways, as Sebastian mentioned: the one where we run the benchmarks natively, and the one where we delegate provisioning the volumes. The big difference is that if you do everything the native way, you end up having to multiply every piece of logic you have by every cloud you use. Whereas if you do it the delegated way, the key advantage is that these guys only have to write their tests against every cloud they support, and you only have to write tests for your logic against their API, so you don't end up multiplying: you just sum up the integrations, and that makes it more scalable when you have to implement this for multiple apps and multiple platforms.

There are still cons, though. The obvious disadvantage is that you actually need that layer. Depending on the size of your company, you can decide to build that layer yourself, and the amount of resources you want to put into building it is a function of how many clouds you want to deploy on and how many applications you want to be portable across those clouds. So there's the classic build-versus-buy decision. But in our case, there's Scalr being open source.
It's a good candidate, given that there's not that cost associated with it.

So what we're trying to say here is this: when you're building applications that you want to be able to cloud-burst, applications that are going to span multiple environments and multiple clouds, for example in private cloud scenarios, you want that abstraction. You're probably going to prefer writing against the native APIs if you're using a single cloud and a single application, meaning if it's a one-off. But if this is an enterprise, corporate-wide initiative to increase the portability of your applications, then you probably want a dedicated layer for cloud abstraction and for facilitating the building of these cloud-aware applications.

If you think about it, the real thing we're trying to say is this: if you're going to do a very small thing on your laptop, you might reach for a bash script, because it's just the fastest way to do it. But as soon as your project grows bigger, as soon as your project spans more machines, bash scripts very quickly become less efficient. That's why we have things like Chef, for example: you just tell Chef, "this is what I want done," and Chef does it, whatever the platform, whatever the specific integrations are. What we're saying is that the same thing exists for cloud, and having tried both ways, the Chef-like way is actually the one that worked best for us, using the abstraction tool, Scalr, that we showed just a bit before.

Yeah, so those are some of the takeaways: generally, if you're going to be building a lot of applications over a lot of clouds, the scalable way is to have that middleware, that middle tier, between your cloud and your application to facilitate building them. That's the first thing. But if you're going to be building a single application on a single cloud, then
of course, the native way is just faster, kind of like what Thomas was saying: writing a bash script is fast, but if you're going to build a large application, it just doesn't scale as well.

So, can I get a quick show of hands: who here is planning to build a hybrid cloud, an application that's going to span two clouds, or cloud bursting? About 20% of you, something like that. Any questions? Is this relevant, does this help, does this resonate with your experience in building these cloud-bursting scenarios? A couple of people. Any questions? Yep, go ahead.

(Audience question, partly inaudible, about staying cloud-agnostic.) Yeah, let me repeat the question. I didn't get your name? Nick. So what Nick is saying is that in this particular application, CloudBench, we need the application to be aware of where it's being deployed, to be able to leverage whatever is offered to it; but what Nick would like, when he's building an application, is not to have to care about the implementation details, and to be able to specify at a high level what he wants his application to look like. Did I get that correct? Yeah. So, repeating once again: he's saying the application itself shouldn't be the one making the API calls; what should be making the API calls is the management layer. Is that what you're saying? Yeah, and that's kind of what our overall assumption is here: you want some middle tier, whether that's a management layer or something built into the application, as in our specific case. You basically want to reduce the amount of effort it takes to achieve portability across those workloads. Does that answer your question?
Actually, yeah: the idea, when we say we wanted to delegate that responsibility, is that in the end what we want is to have those volumes and benchmark them. We started out the old native way: hitting every API, provisioning the disks, and then, well, exactly what you were describing happened to us. We thought: why are we doing it this way? Wouldn't it be simpler if, when those instances launch, everything happens on its own, and then the app starts and everything is already working? And that's exactly the idea: adding that management layer, adding that delegation, is what makes that happen. Thanks. Any further questions? Yes, the gentleman there.

So the question is: what's the current state of the CloudBench benchmarking application? Yeah, so far we are benchmarking all the public clouds out there, and we're now rolling it out to help actual users of the cloud benchmark their own private clouds. For the public clouds we've got support for Rackspace and EC2, and we're going to start with OpenStack for private clouds, taking what we've got for Rackspace and making that available on OpenStack. Now, you might ask: why are we not delegating this part too, removing all that code instead of writing it? The reason is that we want to let people run their benchmarks without needing that delegation layer. We had it, so we'll add support, like we did for Rackspace, for OpenStack inside CloudBench itself, checking things like whether the metadata API looks like OpenStack. Thomas, do you want to show some of the results of those benchmarks, perhaps? Any other questions?

One more. I'm going to repeat the question. So the question is: for a use case where we don't care about the cloud's functionality, and the only thing we care about is raw performance, does that management layer actually help or not? Is that what you're saying?
Probably not. Again, this is a cloud abstraction layer, and if the application only cares about raw performance, you're not going to benefit from having it be portable. That being said, you can definitely run CloudBench and see for yourself which cloud actually yields the best performance. Or if you're using OpenStack, and I certainly hope you are, you can start playing around with different storage backends and different configurations and benchmark each one of those, to easily find which configuration options yield the highest performance.

One more. So the question was: what happens in the future when, for example, the EC2 API compatibility layer on OpenStack is fully functional? Oh, I didn't get that right: all OpenStack-compatible clouds. Well, I'll answer in two ways. One is that abstraction layers generally just yield you more productivity, so I don't see any problem having an additional one if it yields productivity. But for this specific case, the question was: what happens if you've decided to build a hybrid application that's going to span only identical OpenStack clouds, correct? Did I get that right, on the identical part? Well, there are lots of different OpenStacks.
Some of them might have Cinder, some of them might not have Cinder, and so on, so that actually matters quite a bit. But yeah, indeed, if you're going to build an application that spans a lot of identical OpenStack clouds, then the multi-cloud abstraction is not going to help you at all.

One thing I wanted to add: for all the different clouds you're going to use, if you don't control them, and especially if they're public, there's a good chance they're going to try to differentiate, adding new tools, adding a replacement for a certain component. The risk is that if you don't have that abstraction layer and one of them changes a component, then you either have to add an extra if-clause in your code, which is going to make everything ugly, or you're going to have to just stop using that provider. What we'd say is that the abstraction layer is what absorbs the change when a provider's decisions change, so you won't need to change all the apps you've deployed. And that takes us back to what we were saying about this not being scalable in the number of apps and the number of platforms: if you have numerous apps, you can't really afford the risk of having to change all of them just because one of your providers decides to change one subsystem in their OpenStack cloud. Does that make sense?

(Follow-up question, inaudible.) Not so much. Perhaps that's what it's going to move towards.
I would hope so, but that's not what we're seeing right now. If you look at the Rackspace API, for example, there are a lot of things that are just not in OpenStack itself. And I can imagine that if you're building an OpenStack cloud for your own private, internal use, you're going to deploy things differently, perhaps on Cisco UCS or something like that. Sure, it's abstracted away via the API, but your performance might be different. And if your performance is different and you auto-scale from one cloud to another, then your load balancer, if it's sending traffic equally among all your servers, will balance poorly: if a certain class of servers is just higher-performance than the other, you're going to get poor load balancing between the two. So yes, ideally I could see the possibility that that might happen; I just don't see it as very likely.

Any last question? All right, well, thank you very much for your attention, and we'll still be around if you have any questions you'd like to ask us privately. Thank you very much.