Welcome everybody, I'm Brad Topol, this is Tong Li and Mark Voelker, and we're here to talk about the OpenStack Interoperability Challenge and the Interoperability Working Group updates. The adventure continues. So if you look at the interop challenge, what were we looking at? We wanted vendors to demonstrate — this is bigger, I can't read it at half size. You want to step over? See that? That would be brilliant, much easier. So I can actually read this now. We wanted vendors to demonstrate that their clouds work with core applications — basically a holistic approach. In the original interop challenge, what did we want to prove? Well, we wanted to prove that we had interoperability with OpenStack, and one of the things we were getting dinged on by the analysts was: let's see you run some enterprise applications. So we generated reference applications and workloads — these are all publicly available in our repository — we started working with the App Ecosystem Working Group, and as challenges were identified and as we found other people who wanted to participate, we included them as well. So we grew a really nice community. Way back when we first started this out, what did we look at? We looked at Docker Swarm and we looked at a LAMP stack — just some nice applications — and we hit a lot of bumps, right? You can see on the right what we were trying to exercise: a lot of the key RefStack capabilities we wanted to retest and make sure everything was working. All the things you'd expect to see in an enterprise app: VM provisioning, security rule management, volume creation, attaching and detaching, public and private IP addresses, load balancing, network security rules — lots of interesting stuff — and of course software install on top of OpenStack.
If you look at what we proved in phase one, we showed that the cloud ecosystem was interoperable. We had 16 companies take the stage to demonstrate live portability of the workloads, and we got a lot of good press and analyst coverage out of Barcelona when we did this. So when we started phase two, we went to the community for input — that seemed the fairest way to ask what we really needed to focus on for interoperability — and you can see where all the greens are, right? The community came back, and the folks who were interested really wanted to see Kubernetes and really wanted to see NFV. Then we came into phase two and got a lot of requests — a little shock to the team. We were on a very calm roadmap of plain Kubernetes, and very quickly we were asked to jazz it up. Jazz it up meaning: why don't you try a CoreOS image and get out of your comfort zone of an Ubuntu image, and why don't you think about running multi-cloud with CockroachDB on top — wouldn't that be cool? And you know, it's the Foundation's keynote stage, and if you want to be on their keynote stage, you're going to take their input and their direction. So they set the bar very, very high, and I know it caused all of us a little bit of stress. I don't know, Mohamed, you seem pretty calm, nothing stresses you, but some of us were very stressed and kept very busy. Now, in the background there was the NFV work — there was another presentation on that. It didn't make the main stage; there was a separate presentation for it here. But a lot of cool stuff is going on in the NFV space showing interoperability when you take OpenStack and apply it to that domain and those workloads — very cool voice-over-IP demos, calling people, a lot of cool stuff going on in the NFV work as well. Again, how do you do this?
So you've got layered things. You've got CockroachDB — it's a distributed database that gives you full transactional capability. So it's a little different from your traditional NoSQL databases that scale by giving up on transactions and settling for eventual consistency or what have you. It's a different model, pretty interesting. And of course, Kubernetes. The key thing for making sure OpenStack is interoperable is that we've got to find the right tools. There are a lot of different tools out there for automated deploy, and we can all have a theoretical, philosophical conversation about which automated deploy tool is best. But when it's time to get this gentleman up on stage, and this gentleman over here, and these two great folks, and say we're all going to run the same automated deployment — what's going to work on all 16 clouds? Ansible really becomes the answer that works on all the clouds. There are some other options out there that we're happy to talk about, so we can do a little stream of consciousness. Somebody will say, well, why didn't you use OpenStack Heat? Well, you go to your 16 clouds and ask how many people have deployed Heat. Maybe a couple of hands go up — not a lot. And again, when your measurement is that 16 people have to run this on stage, you toss it out. There's also another great tool called Terraform. Terraform is nice if you need a layer to connect to different cloud infrastructures, but it had some issues. On some of the clouds we had, there were multi-version endpoints for their compute services. When Terraform saw that, it kind of freaked out: I don't know what to do — there's version 2, there's version 2.1, right? So we had several clouds that couldn't use Terraform, and that bug had been around in Terraform for quite a while.
The other thing that I know Tong kept explaining to me over and over was: well, I can do this work in Terraform, Brad, but what about the software config on top? I'm going to end up using Ansible for that. So now I've got one tool underneath that doesn't run on all the clouds, and I've still got Ansible on top. If you really liked me, we'd use Ansible for everything — it runs on all the clouds, I can do the one piece this way and then easily put the Ansible on top. I'm sure Tong told me that five, six, seven, eight, nine, ten times. I got it the first time, but it's good to be forced to hear it explained again. So a lot of lessons learned and a lot of choices there, but getting everybody up on stage forces you to make realistic decisions — not the theoretical question of what could be interoperable, but what has to be interoperable. There are some other details here on the NFV side. Tong, do you want to cover some of that real fast, since you were a little more involved in that piece? I don't know too much about NFV myself. We have another session, though. Yeah, there's a whole other session. But with NFV, there were certain components that the NFV folks run, and certain extra tools they use, because you want to make phone calls and things like that. So you've got to get the right stuff set up. It's a little more complicated, but it's very interesting. A lot of great participants — some were from the first challenge and they came back, so we didn't scare them all off. How many participants are in the room? One — green shirt guy? Anybody else? Only two survived, right? So if you look at this group, it's a nice mix. We had room for some new folks, and we had continuing folks — veterans, like the folks you see on the stage here, who just wanted to do this again.
And it was kind of neat: as you get better at what you're doing, it's easier for latecomers to come in and actually participate. That wouldn't have been as possible in phase one, because that was a rougher patch of learning and getting everything worked out. But by phase two, we'd figured out a lot of the best practices, and you'd gone through that first level of triage of getting people able to run. So we had some latecomers — to pick on a few, Platform9 came in really late, and Wind River came in incredibly late and said, hey, you've got that one last spot, we can do this, we want it. And we said, sure, right? So that was kind of fun, getting all the numbers up. Just an overview of the workload. I assume everybody's familiar with Kubernetes? Anybody not familiar with Kubernetes? I know Tom in the back there knows Kubernetes incredibly well. Kubernetes is a nice model. You've got a master node and you've got worker nodes, and you're essentially scheduling these things called pods. Pods contain containers — think Docker, right? So these pods contain containers that are all related to each other, and they can share a mounted volume and share an IP address. And Kubernetes gives you a really nice declarative model: I can say I want eight replicas, and if something fails, Kubernetes will recognize it — if a couple of pods fail, it'll bring your replicas back up to eight. You don't have to do a lot of work; there are a lot of nice built-in features with Kubernetes, like auto-scaling. So if you look at the model here, you've got your master node running in a VM, you've got your other VMs provisioned as the Kubernetes worker nodes with the pods running there, and you can attach your volume storage to the VMs, giving the pods their volume storage. So there's a lot going on there.
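To make the declarative model above concrete, here's a small sketch — not taken from the challenge's repository, and with illustrative names — of a Deployment manifest built as a plain Python dict, plus a toy reconciliation step showing how a controller would bring a failed replica count back up to the desired eight:

```python
def desired_deployment(name, image, replicas=8):
    """Build a Kubernetes Deployment manifest as a plain dict."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

def reconcile(desired, observed_running):
    """Toy reconciliation: how many pods to start (negative = stop)
    so the observed state matches the declared spec."""
    return desired["spec"]["replicas"] - observed_running

manifest = desired_deployment("cockroachdb", "cockroachdb/cockroach")
# Two pods died: a controller would schedule 2 replacements.
print(reconcile(manifest, observed_running=6))  # -> 2
```

The real reconciliation loop lives inside Kubernetes itself; the point is just that you declare the end state and the system converges to it.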
And those are the types of characteristics that we were able to worry about and test. So you can see Kubernetes install and config, the database system deploy, the dependency install. The neat thing here that was a little different is that we used a CoreOS image. So you get Docker for free, but then you lose Python, right? So that's something to think about. When you're dealing with Ubuntu, Python's typically there, depending on the version. You go to CoreOS and it's different: we've got to make sure Python is there, but Docker's already there, so that sped certain things up. I don't know if you wanted to add anything to that — any other wonderful Easter eggs with CoreOS? All right, good. We've got a nice wiki page where we keep all our details, meetings and what have you. I can assure you we're going to take a little break after the summit for a while; we'll assess after that and go from there. Everything's done out in the public. In the workload repositories you can see the Ansible scripts, and you can see how much of it is just the same script plus a few configuration pieces for the endpoint, credentials, etc. There are good best practices here: if you want to deliver things across multiple clouds, you can go look at what's there. We've got bug tracking and those types of things. So, lessons learned — I'm going to hand over to Tong, who was instrumental in making this real. Tong. Thank you, Brad. Because we have now done the interoperability challenge two times, we have learned a lot over the course of a year. In the first phase, we learned a lot of stuff — most of the things on this slide are from the first phase. We were able to do the second workload very quickly based on what we had learned; otherwise it wouldn't have been that fast. So I'll just go through what's on the slide. Ansible really is a nice tool for this type of work; I'm sure a lot of cloud operators use Ansible.
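As a toy illustration of the "same scripts, small per-cloud config" pattern mentioned above for the workload repositories — the key names here are made up, not the actual variables from the repo:

```python
import copy

# Shared sample config checked into the workload repo; every key an
# operator might need to change has a documented placeholder.
TEMPLATE = {
    "auth_url": "https://CLOUD_ENDPOINT:5000/v3",
    "image_name": "coreos",
    "flavor_name": "m1.medium",
    "use_floating_ip": True,
}

def cloud_config(overrides):
    """Copy the sample config and apply one cloud's changes.

    The workload scripts themselves never change; only this merged
    dict differs from cloud to cloud.
    """
    cfg = copy.deepcopy(TEMPLATE)
    cfg.update(overrides)
    return cfg

# A cloud without floating-IP support just flips one flag:
cfg = cloud_config({"auth_url": "https://cloud-a.example.com:5000/v3",
                    "use_floating_ip": False})
print(cfg["use_floating_ip"])  # -> False
print(cfg["image_name"])       # -> coreos (unchanged from template)
```

The same idea shows up later in Tong's lessons learned: operators copy the sample file, tweak a few values, and run identical commands everywhere.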
When you're trying to deploy software and do configuration once your machines are already up and running, it's a very good tool to use. Brad mentioned Terraform. What we found is that Terraform is very good for provisioning on your cloud, regardless of whether it's Amazon or OpenStack or even Google Cloud — you can use Terraform to provision a lot of nodes in parallel very fast. But when it comes to configuration and the software you want to run on those VMs, it's a little bit difficult. So that's the reason we picked Ansible; we did try both in the first phase. The other problem we found by running this on multiple clouds is that the Neutron implementation is very different from cloud to cloud. We found that using Shade can help you a lot, not only on the networking side but also with some other components. I think VMware actually taught us this next lesson. We assumed that when you have a volume attached, it shows up as /dev/vdb, but that's not the case on all clouds. So we added a parameter in the configuration file to allow the workload script to attach the right volume device on the virtual machines you created. Actually, I'm not sure — can OpenStack be configured to make sure the volume device name is always consistent? You can't? Nope. No. So part of that's guest OS behavior, and some of it is attributed to the hypervisor. If you're using libvirt, libvirt actually talks to the guest OS to plumb that in as /dev/vdb. On other hypervisors, it may show up as just a regular vanilla SCSI adapter, like an LSI Logic adapter, so the way it shows up to the guest kernel is a little different. Okay, that's great. These are the sorts of little things you have to actually do to find out — it can be different from cloud to cloud. The other thing we learned: we were trying to use many different operating systems for the VMs, right?
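The volume-device lesson above — don't assume the attached volume is /dev/vdb — might be handled with a config lookup along these lines (a sketch with hypothetical keys, not the challenge's actual code):

```python
def volume_device(cloud_config):
    """Return the device node the attached volume appears as on this cloud.

    Many KVM/libvirt clouds expose the first attached volume as /dev/vdb,
    but other hypervisors may present it as a SCSI device such as
    /dev/sdb, so the default is only a fallback, not an assumption.
    """
    return cloud_config.get("volume_device", "/dev/vdb")

# Two hypothetical cloud configs: one relies on the default,
# the other overrides it for a SCSI-style hypervisor.
kvm_cloud = {"auth_url": "https://kvm.example.com:5000"}
vmware_cloud = {"auth_url": "https://vio.example.com:5000",
                "volume_device": "/dev/sdb"}

print(volume_device(kvm_cloud))     # -> /dev/vdb
print(volume_device(vmware_cloud))  # -> /dev/sdb
```

One configuration key per cloud is all it takes to keep the workload script itself identical everywhere.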
I mean, even for Ubuntu, you have many different versions. In the early versions, when your VM is up and running, your network interface is eth0, but with the newer ones, suddenly the network interface card is not called eth0. So you have to be prepared to figure all this out dynamically. And especially when you're trying to set up some security rules — not the Neutron security rules; say you set up a proxy — then you probably need to know your VM's network card name. So for those sorts of configurations, you can't assume you'll always get the same thing. What's interesting about a couple of those is that they're actually not OpenStack problems, per se. Part of what we were doing with this interoperability challenge was showing hybrid cloud workloads: we took a workload, ran it on a bunch of different clouds, and then made them all work together, right? So when I talk to folks out in the field who have maybe OpenStack in their data center and are also using a little AWS, or a little GCE, or whatever other cloud, in most cases they have those same issues. If I run an Ubuntu 16.04 guest OS image in a cloud somewhere, it might use predictable network interface naming, so now I have ens32 instead of eth0, right? These are problems you'd pretty consistently run into when trying to orchestrate applications across any set of clouds, and what we did with the interoperability challenge was just kind of flush those out in an OpenStack world. But these are actually things that are pretty typical for people writing cross-cloud applications, no matter what clouds they're using. Right — so those lessons we learned are very valuable if you're really developing your applications across various clouds. The other thing we learned is about network virtualization. You know, not all clouds support the floating IP concept.
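The interface-naming issue above (eth0 versus ens32) is the kind of thing you can discover dynamically instead of hard-coding. A rough Python sketch — with the candidate list parameterized so both naming schemes can be simulated off-VM:

```python
import os

def primary_interface(interfaces=None):
    """Pick the first non-loopback NIC instead of assuming 'eth0'.

    On a live Linux VM the candidates come from /sys/class/net; the
    parameter lets us simulate old and new naming schemes here.
    """
    if interfaces is None:
        interfaces = sorted(os.listdir("/sys/class/net"))
    for name in interfaces:
        if name != "lo":
            return name
    return None

# Older Ubuntu images use classic naming:
print(primary_interface(["eth0", "lo"]))   # -> eth0
# Newer images use predictable interface naming:
print(primary_interface(["ens32", "lo"]))  # -> ens32
```

In the actual playbooks this sort of fact would typically come from Ansible's gathered facts rather than hand-rolled code, but the principle — discover, don't assume — is the same.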
So you have to be careful: if you assume that when you develop your application you're going to have a tenant network — well, that's not always the case. That actually reminds me of something we didn't put on the slides, I think. When we created those workloads, we purposely created a template for the configuration file, and we provided examples of what the configuration file should look like, what should be in there, and what kinds of values you should have. Then when we ran this thing on multiple clouds, we just told the operators: for your cloud, you make a copy of this sample configuration file, make a few little changes, and use that configuration file. You don't have to go in there and change any of the workload scripts. So you just create a new configuration file and the workload starts running. I think that was very nice — you don't touch the code, you just make a configuration change, and the commands you run for the different phases of the workload are absolutely identical. I thought that was a very good decision we made. Okay, let's move on. We have more lessons learned. Most of the items here are from our phase two workload. In phase one, we didn't try the CoreOS images; then, as Brad mentioned earlier, the Foundation asked us to try a different OS, and we started using CoreOS — of course, we're still trying to support Ubuntu and other distributions. What I found is that when you use cloud-init to set up the hostname, you have to be very careful. You want to use the fully qualified domain name, with just one little period at the end. If you don't do that, you get different hostnames with different images. That actually gave me a lot of grief, and it created quite a lot of discussion among operators.
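The trailing-period detail in the hostname lesson above can be captured in a small helper. This is an illustrative sketch (the domain and names are made up), showing why a rooted FQDN yields the same hostname regardless of image:

```python
def canonical_hostname(name, domain="example.com"):
    """Return a rooted FQDN (with a trailing dot) for cloud-init.

    Without the trailing period, some images append a DHCP-supplied
    domain and others don't, so the same configuration can yield
    different hostnames on different clouds and images.
    """
    name = name.rstrip(".")
    if "." not in name:
        name = f"{name}.{domain}"
    return name + "."

print(canonical_hostname("k8s-master"))               # -> k8s-master.example.com.
print(canonical_hostname("k8s-master.example.com."))  # -> k8s-master.example.com.
```

Either way the value is written, every guest ends up resolving the same rooted name.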
And funny enough, later I found on an OpenStack mailing list that operators were actually talking about this: when we configure OpenStack, do we use the DHCP domain parameter or not? So these things are very valuable to operators — what is the right thing to do when you configure your cloud? Okay. So then, when you use CoreOS, there's no Python. What are you going to do when you're trying to use Ansible, right? You have to figure out a way, and CoreOS also doesn't have something like the Ubuntu software repository where you can just do an apt install and off it goes — that's not the case. You have to find a different way to bootstrap your VM so that Ansible will start working. Luckily, there's a lot of bootstrap code available in open source; it just so happened that the one I found, contributed by VMware, worked great. All the instructions are also in a document that's part of our workload. Yeah, a couple of these examples are kind of interesting in that we made the workload a little more complicated than it necessarily had to be by supporting multiple different guest operating systems. That was partly just user preference and trying some new things. You could be very prescriptive on the images — we could have picked a standard image for everybody to run. But what we see in the real world is that everybody has their own little spins on different images. A lot of folks have security patching requirements, so they want to run the very latest security patches for Ubuntu 16.04, or maybe they need an older Linux kernel for something they're doing. So these are, again, real-world problems that aren't necessarily specific to OpenStack, but things we put in the interoperable workload because they're things you're going to see in the real world if you try to do cross-cloud workloads. Yeah, thanks, Mark. The other thing we actually did...
I don't know whether other Ansible script developers use this; it took me a while to figure out. In phase one, we created multiple VMs, but we didn't parallelize the process — we just created one VM after the other, using the Ansible OpenStack cloud module. You can imagine that on some clouds, creating a single VM itself takes a lot of time. That's part of the reason the first workload, the LAMP stack, took about 8 to 10 minutes to finish — most of the time was just spent creating VMs. For this second interop workload, we parallelized that. We created virtual hosts, and each virtual host runs exactly the same script to create a VM, so the whole process runs in parallel. Now we can create 20 or 30 VMs in pretty much the same time it takes to create just one. I'm pretty proud of this technique we used for this workload. For how we actually do it, you can take a look at the workload code, and of course you can also talk to me if you're interested. Then our colleagues from China contributed the next patch, which added a profiling/timing plugin to the workload. If you enable it, at the very end of a run the workload produces a very nice report showing the time spent on each task — by default it lists the top 20 tasks that took the most time. It's a very nice tool for figuring out where in the process you're spending most of the time, and you can fine-tune from there: if you find that attaching the volume to the VM takes the most time, you can take a look at it and ask, hey, why is that happening? So it's very nice. You probably know that the workload we did this time is Kubernetes on top of OpenStack, plus CockroachDB. What I find is that Ansible plus OpenStack plus Kubernetes can give you a very nice cloud environment, right?
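The serial-versus-parallel provisioning difference described above can be simulated with a stand-in for the server-create call (`create_server` here is a hypothetical placeholder; the real workload goes through the Ansible OpenStack modules):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def create_server(name, boot_seconds=0.05):
    """Stand-in for an OpenStack server-create call; sleeps to simulate
    waiting on the cloud (real creates can take minutes each)."""
    time.sleep(boot_seconds)
    return {"name": name, "status": "ACTIVE"}

names = [f"k8s-node-{i}" for i in range(10)]

# Phase one style: one VM after another -> total time ~ n * boot time.
start = time.monotonic()
serial = [create_server(n) for n in names]
serial_t = time.monotonic() - start

# Phase two style: each "virtual host" creates its VM concurrently,
# so 10 VMs take roughly as long as one.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(names)) as pool:
    parallel = list(pool.map(create_server, names))
parallel_t = time.monotonic() - start

print(len(parallel), parallel_t < serial_t)  # -> 10 True
```

Because the waiting is I/O-bound (the cloud does the work), fanning the creates out concurrently collapses the wall-clock time, which is exactly the effect Tong describes.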
You get the VMs from OpenStack, you get containers from Kubernetes running on top of that, and Ansible gives you automation, right? It's the tooling you need to really run your cloud. And, you know, we just proved that you can use those open source tools to create a workload that runs on multiple clouds and produces exactly the same result — as we all witnessed yesterday on stage. So I think I'm running out of time. Did we talk about the deliverables already? Pretty much. So let's talk real quick about the RefStack piece of this. The interop challenge is really looking at workload portability: can a real-life set of software be run on multiple clouds, right? There's another piece of the puzzle here, which is consistent API behavior between OpenStack clouds — because OpenStack clouds, it turns out, can be configured in a lot of different ways. So as part of this, we asked all the participants in the challenge to run RefStack against their clouds and verify what APIs they support, what Tempest tests they pass, and whether or not they adhere to the most recent interoperability guidelines from the OpenStack Foundation. That was a good data point for us, because we got information on 15 real-life clouds — public clouds, private managed clouds, and private distributions — and those results have been posted as well. So that's the other piece of the puzzle: looking at consistent API behavior. We have one workload here; it turns out there are a whole lot of other workloads, and some of them use different APIs than the ones we did. So this was kind of the next step in looking at interoperability. For those interested, there are a couple more sessions on the interoperability guidelines coming up — I think the next one is at 2:40 today, if I'm not mistaken — so be sure to tune into those. We'll skip over the rest in the interest of time. You want to talk about China?
I think — yeah, Tong, this one's yours. Okay. Well, this year I have traveled to China a couple of times, and each time I was there, I felt like OpenStack is heating up. A lot of people are really interested; they want to run OpenStack for their company, enterprise, or government. A lot of interest. That's part of the reason, with Mafei — he's sitting right there — and his help, we created this OpenStack Interop Challenge China chapter this past February. We have people from 10 different companies joining the chapter. We took the very first workload, ran it on those clouds in China, and then showed it at the April Beijing Global Open Source Summit. You can see those pictures — that was very cool. Oh, you want to talk about this one? Yeah, I'd like to take this one. I just want to give some credit to a lot of people who put a lot of effort into this. Obviously, Tong was the rover, making sure everybody got answers to any questions they had; he helped them out. Then when the CockroachDB surprise came out, he had to go figure out some CockroachDB. Obviously Mark as well, and all the others — Mark was instrumental, as he was in the first phase, making sure things worked and good decisions got made, being a liaison to the RefStack side, making sure the overall view was consistent between the workload, the challenge, and what else was going on in interoperability, and leading the working group. A lot of good faces and names here — a lot of folks did it twice, like Egley and Roman, and new folks like Muhammad stepping up, right? It was quite a thing. So thanks to everybody who got involved — a lot of extended effort that people put in to make this happen. That's it for... Mark, this is your turn. Yeah, so we wanted to throw in a couple of updates about the Interoperability Working Group, which is the group that produces the interoperability guidelines I mentioned earlier.
Like I said, there's a full session on that later today and a couple more tomorrow, but a couple of quick blurbs. We asked the participants to run the RefStack suite against their clouds. Most of them were looking at the 2017.01 guideline, which was approved by the board of directors earlier this year. We roll these guidelines out on a six-month cadence, so the next one is due out in August this year, and we're working on the scoring of new capabilities for that right now. We're also looking at a couple of interesting things. One of the workloads we talked about today was the NFV workload. Traditionally, what we've done in the Interoperability Working Group is focus on core capabilities for almost all OpenStack clouds. What we're seeing now is that OpenStack is such a flexible system that it's finding a lot of interesting use cases and niches that it turns out to be a really good match for — one of which, as an example, is NFV. So we're looking at creating new programs for those verticals, because the way an NFV cloud is built and behaves may be a little different from, say, a general-purpose compute cloud, right? There may be things like PCI pass-through, NUMA-aware scheduling, maybe advanced data plane performance, other things like that. So we're looking at some new programs to create interoperability for those. We're also looking at add-on programs. If you look at what's in the interoperability guidelines today, it's really the components that are super widely deployed across most OpenStack public clouds, distributions, and private managed solutions — what people are actually using, even if they roll their own, right? It turns out there are components that are less used, and interoperability for those components is still super important to the people who do run them. So if 16% of the population is running Designate, for example, they really do care about interoperability for Designate.
So it's kind of one of those things where we're looking at creating add-on programs for components so that there's interoperability for those individual projects as well, rather than just the core set. And then, as we talked about a little today, part of the interoperability challenge this time was embracing a larger ecosystem — working with folks like CoreOS, working with folks like CockroachDB, working with the Kubernetes community. We're also looking at doing that with some of these vertical programs and add-on programs as well. I know we've had some discussions with folks, and those are continuing now. I think we're pretty close to out of time, but hopefully we left a minute or two for questions if there are any. And there are mics here. Four minutes. Rocky? I don't know, we'll have to see. I know someone you may know, Annie Lai, has some ideas, so I promised to meet with her and see what we're looking at. So we're going to take a breather, unless you all have suggestions off the top of your head. There has been a lot of interest in NFV workloads, and we didn't get a chance to show it on stage because we only had about 10 minutes — when Edward Snowden is the next act, it turns out people just kind of give you the hook. So maybe we'll do something in that realm. NFV is especially popular right now in Asia-Pac, and we're going to Sydney for the next summit. So nothing's decided, but it's something we've already done some legwork for, and it would certainly be geographically pertinent. We'll see. Monty had a session yesterday about pulling deployer configuration specifics out of the cloud, you know, at a static URL. I'm just curious how much of that information would be useful to automatically pull into the configs that you had — things like "we do floating IPs, we don't do such-and-such." I'm just curious how much overlap there is there.
Yeah, so discovery has been kind of a hot topic for a while now, because like I said, one of the strengths of OpenStack is that it's a very flexible system — you can adapt it to a whole lot of things. We make the analogy — it's maybe a little overused — to the Linux kernel. Linux started off as kind of a PC operating system, then it morphed into a data center operating system, but it's also in seat-back entertainment units and it's on my phone. The Linuxes you see there are different, even though they're all derived from a common code base — there are a few things that have been changed in between, right? So as we see OpenStack moving into these more niche places as well, discoverability of how a cloud acts has been a very hot topic with the interop group, as well as with folks like Monty, for a while now. Part of what we're doing here is we chose to use Shade and Ansible, and a lot of that is actually reflected there: Shade will make some queries into the cloud to pull some of that information itself. When it gets down to things like IP addresses, or what images are available, or some of the other finer-grained details, we'll have to wait and see when those APIs come to fruition, if they do. It's certainly the kind of thing that makes sense for tools to use. And then, too, we want to see when that's picked up by the tools ecosystem, because that's something we can offer to, say, the Terraform folks or the Ansible folks or the Chef folks or the Puppet folks or the Kubernetes folks — whatever is going to ride on top of OpenStack. Once those things are available, they'll actually have to consume them, and there will be a couple of iterations on that. But discoverability is certainly the kind of thing we would like to do in the future.
It also raises a bit of a concern for some cloud operators about how much information they want to expose through something like that. So those are some of the things we'll have to work through. Anybody else? Sure. Yeah, and if you actually look at the Ansible variables files that we use for these, it's like 20 lines or something — that's all it is today. It could be better, certainly, but it's actually not that bad. Yeah, and it contains some of the Kubernetes configuration parameters as well, so that makes it longer. Okay. I think we are about out of time. So thanks for coming. Thank you.