All right, we are live, I think. So welcome, everybody, to almost the last talk of the summit. Congratulations to those of you that made it this far and are still standing upright. We're going to talk a little bit about interoperability and the mechanics of how we go about determining what goes into OpenStack's Interoperability Guidelines.

So let's start off with a little background about why we care about all this in the first place. If you've seen some of the material published by the TC, you know that OpenStack aims to be a ubiquitous open source cloud computing platform that will meet the needs of public and private clouds regardless of size, yada, yada, yada, right? Inherent in that ambition is the promise of some level of interoperability. If we're using this thing to build lots of different clouds, and lots of different kinds of clouds, part of the promise there is that we'll have some level of interoperability between them, and that's one of the benefits you get by using it. So hopefully code that you write against one set of capabilities, APIs, et cetera, works in many different places, right?

Well, as it turns out, clouds that on some level are the same, that are built from the same code, can wind up acting and looking very different from one another. OpenStack has a whole lot of nerd knobs. This stat is a little out of date now, but as of late 2015 there were over 4,600 config options in just the TC-approved release projects, so not counting some of the ones that aren't as widely deployed but are still important to some people. And there were over 1,000 policy.json settings, and that's just the ones that ship in the defaults; in your own packaging you can add more. So obviously there's a whole lot of flexibility in the way you control your cloud. And then it turns out people also put things in front of their clouds and on top of their clouds: firewalls, load balancers, API gateways, all kinds of stuff that goes between you and a cloud. And the mechanics of what's under the hood can obviously change behaviors as well. Different hypervisors support different image formats. Different storage has different capabilities. Different SDN platforms have different capabilities. And of course there are many different kinds of workloads that use different clouds in different ways as well. At the end of the day, what developers want is to write code that works against all of that.

So let's talk about the vendor perspective first. Most people that are using OpenStack clouds today are getting them from a vendor, whether it's a distribution, a managed service, or even a public cloud. So vendors increasingly need to be aware of the interoperability concern, and it's something we've seen starting to crop up in some of the talks we've had with consumers of OpenStack as well. First of all, it's good for the users of the products. It helps promote the product, and it helps applications get developed on the platform. Fostering an ecosystem around OpenStack is probably the most important thing we can do for OpenStack in terms of its traction in the rest of the world. And by the way, it's now required if you want to call your product OpenStack.

This is what we do not want developers to have to write, ever. If OpenStack clouds have no level of interoperability, this is what you wind up doing: just to get a list of images, you might have to go through a five- or six-branch-deep if-else chain to figure out what works on your given cloud, something like the sketch below.
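(A hypothetical illustration of that anti-pattern; the vendor names, API versions, and quirks here are all invented.)

```python
# Hypothetical example of what nobody should have to write: per-cloud special
# cases just to list images. Vendor names and quirks are invented.
def list_images(cloud):
    if cloud.vendor == "cloud_a":
        # Still runs the older image API behind a custom gateway.
        return cloud.get("/v1/images").json()["images"]
    elif cloud.vendor == "cloud_b" and cloud.api_version >= (2, 1):
        # Upgraded early, but an API gateway rewrites the response key.
        return cloud.get("/v2/images").json()["image_list"]
    elif cloud.vendor == "cloud_c":
        # Fronts everything with its own load balancer and paginates oddly.
        images, marker = [], None
        while True:
            page = cloud.get("/v2/images", params={"marker": marker}).json()
            images.extend(page["images"])
            marker = page.get("next_marker")
            if marker is None:
                return images
    else:
        raise RuntimeError("no idea how to list images on %r" % cloud.vendor)
```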
And keep in mind that that may change over time as well, as new API versions roll out, as different vendors move faster than others in adopting new releases, et cetera, et cetera. This is a bad developer experience, and it's what we do not want.

So what's in a guideline? We mentioned interoperability guidelines; that's kind of the premise of the talk here. Interoperability guidelines are produced by the Interoperability Working Group, of which I'm the co-chair. They list the capabilities that products calling themselves OpenStack have to support, the tests they must pass in order to prove it, and a list of designated sections of the OpenStack code that they must use to provide those capabilities. So basically there are three important facets here. One is a list of things that a cloud has to do that end users can use. Two is a list of tests they have to pass so that they prove they do it in the same way as others. And third is a list of code that they have to use to back each capability, so that users know they're actually getting OpenStack code and not some strange Java fork of OpenStack that implements the API. There are also a few other things in the guidelines: a list of exceptions, things that we found problems with, and things that might be required in the future, so that you can start looking ahead to what you may be able to consume in an interoperable way down the road.

These guidelines roll out every six months. They are offset from OpenStack releases by a couple of months, so that each guideline covers three and a half releases. Basically, because of that offset, when we roll out a new guideline, we will approve it for the current head of master, and that will become a new release before the next guideline rolls out. The newest two can be used by vendors if they want a logo or trademark agreement from the OpenStack Foundation.

So when we talk about what's in the interoperability guidelines, we also have to talk about what's not in them. Some things you will never find in interoperability guidelines are stuff that end users don't see or can't use. These are primarily focused on end users of clouds, because that's the purpose they were created for. So you're not gonna see admin-only APIs in there, you're not gonna see RPC APIs, you're not gonna see database schema, you're not gonna see HR guidelines. These are things that users can't see and don't consume directly. Obviously that stuff supports what they're doing, but these are not things that consumers of a cloud would touch, so you won't find those in the guidelines today.

You also won't find stuff that's intentionally made pluggable by OpenStack. So for example, you will never see an OpenStack guideline that requires KVM, or requires Open vSwitch, or requires some particular storage platform. Those are all things that OpenStack intentionally abstracts from the user. So when we look at capabilities, if we find out that, gosh, of the 80 Cinder drivers out there this capability is only supported by three of them, it's probably never gonna make it into a guideline. It needs to be stuff that actually works across whatever's under the hood, because, again, that's not directly exposed to users. You also won't find stuff that doesn't have tests. Remember from earlier: it's great to say that you support these things, but we also need a test to ensure that the behavior actually works across clouds. And you also usually won't find stuff that's being deprecated.
Occasionally what we find is that stuff that is on the path to deprecation stays there for a long time, to give the sort of tooling ecosystem around OpenStack a chance to adapt to whatever's coming next. So those may stick around in the guidelines temporarily, but they'll generally have some kind of flag or disclaimer in there, a warning about that.

So how do we decide what goes into these things? At the end of the day, it boils down to three important things. First, there's a list of 10 guiding principles. Then there's a list of 12 core criteria. And then there's this giant list of tests. So all the tests that you see run in the OpenStack gates are potentially fair play for this. We look primarily at what's in Tempest. We've also had some interest in using in-tree project tests that have a Tempest plug-in interface, but primarily we look at Tempest first. And I'll also mention that new capabilities have to go through an advisory phase. Basically, the first time they go into one of these guidelines that roll out every six months, they're marked as advisory, and that's kind of a signal to the rest of the community: hey, if you're a vendor or an OpenStack deployer or an OpenStack end user, this is something that's gonna be required in the future. So if, for example, you are making an OpenStack-powered product, this is something you're gonna be required to expose to your users at some point in the future. If you're a consumer of clouds, this is something you can count on being interoperable in the near future.

The most important of those is probably the interop criteria, so let's take a look at these. Like I say, there are 12 criteria. They're kind of lumped into four different buckets here, and you can see in the circle in the center there what they're sort of attempting to prove. So for example, one of the things that we want is capabilities that show proven usage, so we're not requiring stuff that nobody ever touches or cares about, right? There are three criteria related to that bucket. One is that it is widely deployed; for that, we look at things like what products support it, and whether we get user survey feedback from people saying these are things they use. We also look at whether it's used by tools: things like SDKs, say jclouds or Fog, or maybe cloud providers for things that you would put on top of OpenStack, like, say, the cloud provider for Kubernetes. We also look at what's used by clients. If a new API is introduced in OpenStack and it doesn't have support in the OpenStack client and it doesn't have a Horizon interface, so the only way to get to it is directly through the API, it's probably not ready to go into an interoperability guideline. Maybe somewhere down the road when that gets done.

So you can kind of see the universe of things here. Basically, we're looking for things that have proven usage in the real world, align well with the technical direction OpenStack is going in, take a systems view of the whole cloud, and play well with others. We'll spend a little bit of time on the systems-view bucket, because that's the one that tends to need a little explanation. Fundamentally, we're looking for things that are foundational. What we mean by foundational is the sort of core capabilities that everybody's gonna want, need, or use; and when I say everybody, sometimes I'm talking about other pieces of OpenStack code. So for example, at a very high level of abstraction:
It's awfully hard to use Nova if you don't have the ability to get an image into your cloud somehow, because then you can't actually boot anything. So there's a lot of stuff built in around basic operations like creating images and creating servers, those kinds of things. We also look for things that are both atomic and proximate. Atomicity has to do with having a single, small piece of functionality rather than a bunch of operations lumped into one. And proximity has to do with what other things go along with it. For example, it wouldn't make a lot of sense if we put create-resource into the guidelines and didn't also have delete and read and update. Those things are proximate to one another: if you have one, it kind of makes sense to have them all.

So how do these things get written, and how do we decide what goes in there? Phase one is we start talking to each other. We'll talk about the timeline for all this in a minute, but usually the first thing we do when we start looking at capabilities for a new guideline is go talk to the technical community. Typically the way that works is the Interoperability Working Group kind of hands out assignments: we have volunteers who sign up to work with, say, the Nova community or the Glance community or whoever it is. They'll go talk to the PTLs and the developers and other folks from those development teams and say, hey, look, what do you think is important? What's missing from the guidelines today? What do you think has been introduced in the last few cycles that people are really, fundamentally gonna wanna use in the future, that we need to consider as a sort of core capability for interoperability? And that's where the discussion starts. We also generally talk to end users. A lot of us happen to work for vendors, so we talk to people that are using our stuff in the field as well. We try to get as much perspective on this as we can. So phase one is really all about talking to each other.

Phase two is we actually start writing things down. The way this works is that we create patches in Gerrit. The Interoperability Working Group works with the same set of tools that most projects in OpenStack do. And I thought we might just swap over to a browser real quick and take a look at what that looks like. So this is a patch that went into the Interoperability Working Group a few months ago, and you can see it's pretty much the standard Gerrit stuff, right? We have approvals, we have plus-twos, we have plus-ones, we have a long string of discussions. You can see in this case the Nova PTL chimed in on this patch.

What it starts with is a simple text file. So we have a text file, and you'll see over on the right-hand side, in the green there, there are columns of numbers, and they're binary, ones and zeros. If you look further up the text file, those column headers are the 12 criteria we just talked about. So basically somebody will go in and put in their opinion of: yes, this is proximate; yes, this is atomic; yes, this is widely deployed; no, it is not foundational; blah, blah, blah, right? And then we have a little script that tallies up all the scores at the end of the day (there's a little sketch of the idea below), and if a capability meets some minimum threshold that we determine for each cycle, it's gonna wind up going into the guideline. In this case you'll see one here was marked with an 82, which is high enough to get in. And we'll see the next file; oh, this is just a CSV view of the same thing.
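(To make the tallying idea concrete, here's a minimal sketch of it. This is not the working group's actual script; the criterion names are just a few of the ones from this talk, and the weights and threshold are invented for illustration.)

```python
# Minimal sketch of the scoring tally -- not the working group's actual
# script. Criterion names are a few of the ones from this talk; the weights
# and the threshold are invented for illustration.
WEIGHTS = {
    "widely_deployed": 14, "used_by_tools": 7, "used_by_clients": 7,
    "future_direction": 12, "stable": 12, "foundational": 14,
    "atomic": 8, "proximate": 8,
}
THRESHOLD = 70  # the real cutoff is decided each cycle

def tally(votes):
    """votes maps criterion name -> 1 or 0, like the scoring file's columns."""
    return sum(WEIGHTS[name] * vote for name, vote in votes.items())

votes = dict.fromkeys(WEIGHTS, 1)  # all ones...
votes["atomic"] = 0                # ...except one criterion judged unmet
print(tally(votes), tally(votes) >= THRESHOLD)  # e.g. "74 True"
```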
Back in that file, we'll see the advisory status here. So basically we have three things that we looked at, talked about, sorted out with the community, and decided should go into our next interoperability guideline. First they go through an advisory phase, so that's why they're added here. And then toward the bottom you'll see there's actually one that we dropped. For those of you that are familiar with Keystone, you know that they've long since moved on from v2 of their API, and that's a v2 API capability.

If we look a bit further down, we should see tests associated with each of those. So we'll see things like listing flavors, and these are actually pointers to Tempest tests. Every test in Tempest actually has an idempotent ID associated with it. It's kind of like a UUID, just a unique identifier, so if the test changes names or changes locations, we can still figure out where it is; there's a small example of what that looks like below.

All right. So once we've written it down, we've agreed to it, we've put it through scoring, the community's had a chance to give feedback on it, and we've decided to add something in. Once all that is done for all the projects and we're getting toward the end of that six-month cycle, we bring it up to the board of directors. I've kind of glossed over the governance model here a little bit, but the interop working group was actually created by the board of directors of the OpenStack Foundation, and that is ultimately who has to approve everything that we do. That's a little different from a lot of the development projects in OpenStack, but the reason is that the board of directors is who controls the trademark. So if you want to use the OpenStack logo, if you want a legal agreement giving you the right to license that logo and put it on your product or call your product OpenStack, you have to talk to the OpenStack Foundation. This is one way the OpenStack Foundation has put that governance process back into the community's hands: anybody can come participate in the interop working group, and ultimately whatever we send up has to be approved by the board.

So let's talk a little bit more about tests. There are a few requirements for the tests we have today. First of all, they have to be under TC governance. So generally, for the core projects today, we don't accept tests that, you know, some vendor wrote and runs on the side. And that's primarily, again, because we want the testing to be in the hands of the community. If the community decides that a test isn't useful or isn't testing a capability in the way that it should, it's probably gonna get kicked out of Tempest, right? And if you look through our interoperability guidelines, most of the capabilities have multiple tests associated with them, so we can add to and remove from those over time.

All tests today are in Tempest. This is kind of per the TC's request: the TC gave us some guidance a while back that said anything that qualifies for the future-direction criteria should be in Tempest. I'll mention briefly that we are considering some new programs for vertical spaces and less widely deployed projects that may or may not have that stipulation on them, but those are a topic for a whole other talk, so we'll gloss over them for now.

They must also work in all the releases covered by a guideline. Remember I mentioned that a guideline generally covers at least three releases; that means the tests must function against clouds running any of those three releases.
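(Since I mentioned idempotent IDs a minute ago, here's roughly what one looks like in a Tempest test. The test body is simplified and the UUID is made up, but the decorator is the real mechanism.)

```python
# Sketch of a Tempest-style test carrying an idempotent ID. The UUID below
# is made up for illustration; in Tempest each test has its own, so a
# guideline can keep pointing at the test even if it's renamed or moved.
from tempest.api.compute import base
from tempest.lib import decorators


class FlavorsSmokeTest(base.BaseV2ComputeTest):

    @decorators.idempotent_id('12345678-90ab-cdef-1234-567890abcdef')
    def test_list_flavors(self):
        # List flavors as a regular (non-admin) user and sanity-check it.
        flavors = self.flavors_client.list_flavors()['flavors']
        self.assertNotEmpty(flavors)
```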
Tests working across all the releases covered by a guideline also means, of course, that whatever capability we're testing has to be present in all three releases. Part of that goes to the fact that one of our criteria is that we're looking for things that are stable over time. If an API is removed after six months of being in the real world, it's probably not something we should require everybody to run. That never actually happens anymore, by the way. It's not 2011.

Typically we have folks run these via a RefStack wrapper. There's another OpenStack project called RefStack, which is a sort of thin wrapper around Tempest. It will run the tests for you and then report them upstream to a server somewhere. So you can actually go out to refstack.openstack.org and see which guidelines you meet, and we'll look at that now.

So, a brief look at what RefStack looks like. This is the public RefStack server. It is an OpenStack project, so you can go get the code and spin one up internally inside your firewall if you want, but this is the public one run by the OpenStack Foundation on its infrastructure. What we're looking at right here is what's in a guideline. And again, you can see kind of the same things we saw earlier, just in a more human-readable format as opposed to the JSON. This tells you a little bit about what's in those guidelines: you can see each of the capabilities listed, each one will have a set of tests associated with it, and then you can see which criteria it stacked up against, and also which project it's related to. This is a view of an actual test run that somebody ran with RefStack. I happened to run this one, so I can speak a little bit about it. What you can see here, there are a couple of dropdowns for the guideline version; 2017.01 is the most recent guideline. And we can see right here there's a nice green "yes" that says this cloud passed all the required tests. And then you can actually go into each individual test and see those as well.

Okay. So the next kind of natural question is: what if we get it wrong? What if it turns out we thought something was really widely deployed, and then we got a whole lot of feedback after it became required that, oh, hey, we don't pass that test because we don't ship that in our product, or it turns out we don't do that thing in a bunch of roll-your-own clouds either? There are always gonna be circuit breakers and feedback loops in any process in OpenStack, and this is one of the ways we do that here. Tests can be flagged at any time, which means as soon as that guideline goes out and is required of vendors, if we get feedback that it's not a good test, we can act on it. Maybe it doesn't test things the way we thought it did. Maybe it tests a different API than we thought it did. Maybe a test has been deprecated and removed from Tempest at some point. Whatever the reason is, if we get feedback about that, we can actually insert a flag into the guideline at any time without going back to the board for re-approval. The flag marks the test as not required, and then we can decide that test's fate in the future. It may be that, frankly, the capability was just a bad choice and we should never have chosen it in the first place, in which case we'll drop it out of the next guideline. In other cases it turns out, well, that test actually had a bug in Tempest, right? So nobody's gonna be able to pass it, and it's not that they don't support the capability; it's just a buggy test.
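(For a sense of what a flag amounts to in the guideline file, here's a rough sketch written as a Python dict. The schema has evolved across guideline versions, so the field names, the test path, and the ID here are illustrative, not authoritative.)

```python
# Rough sketch of a flagged test entry in a guideline. The schema has
# changed across guideline versions, so treat these field names, the test
# path, and the ID as illustrative only.
flagged_entry = {
    "tempest.api.compute.flavors.test_flavors.FlavorsTest.test_list_flavors": {
        "idempotent_id": "id-12345678-90ab-cdef-1234-567890abcdef",
        "flagged": {
            "reason": "Test unexpectedly requires admin credentials; "
                      "bug filed against Tempest, flag to be dropped "
                      "once the fix lands.",
        },
    },
}
```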
When it is just a buggy test, that bug may get fixed in the future, and at that point we'll drop the flag and it'll go back to being required. So there are a couple of bullets there talking about reasons we might flag a test.

So let's talk about timelines. It's a little bit loose, but the timeline usually looks something like this. Three months before a summit, there's kind of a preliminary draft. It's usually a carbon copy of the last guideline with a few numbers and names changed, and that's our starting point. Two months before a summit, we'll start talking about new capabilities, going through that process of deciding who on the Interoperability Working Group is gonna work with Nova this time around, or whatever it is, dividing and conquering so we can get all the projects covered. About a month out, we'll start scoring capabilities and actually posting those patches up to Gerrit, and by summit we'll typically have a solid draft. If you look at the repository right now, you'll see a file called next.json. Almost all the patches we have outstanding for that have now landed, so you've got a pretty good picture of what the next interoperability guideline is probably going to look like.

So now that we've got a solid draft out there, we'll have vendors who are interested go start testing. You can consume that next.json file anytime you want (there's a little sketch of that below), but now is kind of a good time to go, because we've actually got everything in there that we think is gonna make it into the next guideline. And that's, again, an additional feedback loop. We start letting vendors run this and see: oh gosh, that test has a bug, or this isn't a thing we support, maybe we should get on that and fix it. Those kinds of things. Two months after summit, we usually do the test flagging. Flagging can actually happen at any time, but this is the point where we stop and take a look and make sure we don't have any outstanding requests before we send something up to the board of directors. And then finally, three months after summit, we'll have a vote by the board of directors that hopefully ends in approval, if it's not too controversial. Just as a note at the bottom of the slide: if you look back in time, 2015 was a little bit weird in that we had a really accelerated schedule. That was kind of the origin of this whole process, so we moved fast and had a couple of guidelines in six months rather than just one.

So the next thing folks want to know is: why isn't this thing that is really important to me in those interoperability guidelines? Why can't I depend on it to be there across all the different clouds? There's not really an easy answer; it's a little dependent on each individual thing. You remember the requirements we laid out earlier? It could be that it just didn't meet the criteria. It may be super important to you, but it may not be applicable to, you know, 60 to 80 percent of the rest of the clouds out there, in which case it's not a thing we would feel comfortable requiring of everyone. It may just not have been scored in time. It turns out there's a whole lot of research that goes into the scoring, usually. I know I have spent a whole lot of time digging around in things like jclouds and Fog and Kubernetes, trying to figure out how they're using clouds and what capabilities they're using. So that actually does take some time.
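(That sketch of consuming the draft guideline, for the curious. The URL is a placeholder for wherever the interop repo serves next.json, and the key layout is simplified, so check the actual schema in the repo before relying on it.)

```python
# Quick sketch of pulling the draft guideline and listing what it asks for.
# The URL is a placeholder, and the key layout here is simplified -- check
# the actual schema in the interop repo before relying on it.
import json
import urllib.request

NEXT_JSON_URL = "https://example.org/interop/next.json"  # placeholder URL

with urllib.request.urlopen(NEXT_JSON_URL) as resp:
    guideline = json.load(resp)

# Assuming a layout where components list their required and advisory
# capabilities, and capabilities map to their Tempest tests.
for component, caps in sorted(guideline.get("components", {}).items()):
    for status in ("required", "advisory"):
        for cap_name in caps.get(status, []):
            cap = guideline.get("capabilities", {}).get(cap_name, {})
            tests = cap.get("tests", {})
            print(f"{component:>10} {status:>9} {cap_name} ({len(tests)} tests)")
```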
Beyond that research, there's also a lot of time spent conversing with the rest of the community. It may be that it was an admin-only capability; again, we're focused on the things that consumers of clouds can use, so if it's only available to administrators, it's probably not a thing that's gonna get in there. And kind of an important corollary to that: there are some Tempest tests that test capabilities end users typically can use, but that are written in such a way that they use admin credentials to do so. There's no way an end user of, say, a public cloud could actually run that same test and get it to pass, because they don't have admin access to the cloud. So in those cases we can't include the test. What we can do is start a feedback loop there: file a bug on the test and say, look, this uses admin credentials for no good reason, let's fix it. And that has happened more than once.

It may be that the project itself wasn't widely deployed. People ask about Designate a lot lately. As of the last user survey, I think Designate is deployed in about 16 percent of production and non-production OpenStack clouds. Again, important to some people, and we do have some new add-on programs coming that will help guarantee some interoperability for them, but probably not applicable for the whole rest of the universe. It may be that some things don't have tests, or maybe it didn't score highly enough across all the releases covered in a guideline. Again, we're looking at at least three releases in each one of these guidelines; it may be that in the earliest of those releases something didn't have good client support yet, so people running that version of the cloud may not have good support for it. And sometimes it's a matter of nobody's brought it up. You know, we're a bunch of users in the community, and we try to have a good read on what things people are using, but sometimes there are gaps. So again, there's a lot of feedback built in there, and we just hope the feedback comes.

All right, so if I'm an OpenStack developer and I've got a really cool feature, how do I get it into those guidelines? There's a blog post you should read; the link's at the bottom there. But to summarize it, what we need that capability to do is meet all the criteria we use to determine what goes into the guidelines. So you can do things like document it really well and help foster adoption of it. You know, if the feature is completely undocumented, there's a good chance nobody's ever gonna use it. You can help foster adoption by doing things like helping other people use it via an SDK: maybe you can submit a patch to, say, jclouds, or work on Kubernetes or some other thing that uses clouds. If lots of tools are depending on it, it makes it easier for us to include. We've already talked about tests. And then, of course, you also need to be a little bit patient. Again, this has to be in three releases, so if you have a shiny new object that just went into the latest release, it's gonna be a while.

So, feedback loops. Again, lots of feedback loops built into the scoring cycles. Anybody can come comment on our patches in Gerrit; you can throw in your minus-one or plus-one. The board of directors also gets a chance to offer their feedback. And most of this originates with talking with the rest of the community, talking with the PTLs and the development groups to see what they think is core.
And then even after something becomes advisory or required, there's another feedback loop built in there, where vendors can ask for flags if we find out there are problems. So, plenty of ways to do that. It's also important to mention that one of the things we use when we're looking at some of these criteria is the user survey, and we also look at RefStack to see what test results are out there and what clouds actually support. So things you can do: if you're running a cloud, answer the user survey. That way we'll know that a lot of people are running Designate or whatever project it is, right? And if you submit your test results via RefStack, then we can also see, oh, there's a cloud that supports all these other capabilities we haven't even thought about yet. And if it turns out there are a lot of those, maybe that's a thing we should think about. And of course, you can also buy me a beer and bend my ear about it anytime.

Quick notes about RefStack, because we're running a little low on time. It's actually not very hard to make RefStack work; there are some instructions, and there's a link in the slides. Basically, you go download the refstack-client, run a setup script, configure Tempest for your cloud, point it at your cloud, and it executes the tests and uploads the results to RefStack. There are now capabilities in RefStack where you can upload results anonymously, or associated with your user ID so that only you can see them, and you can make them public later. One thing we've been encouraging people to do is run not just the tests that are required in the current guideline, but run all the tests. Frankly, if you're gonna take the time to run a couple hundred tests, what's a couple hundred more? It's maybe an extra hour of your time, and you're probably going to have a cup of coffee in that time anyway. The more data we have about what's out there and what other tests clouds pass, the better decisions we can make about scoring later on.

If you don't pass all the tests, it's important not to panic. A lot of times we figure out that some of the problems people have are environmental. Maybe there's a timeout because the storage was slow on their particular test bed. Turns out test beds a lot of times are not built with first-class gear; they're built with, like, the server you found in the closet, you know? So sometimes it's purely environmental. Sometimes we find bugs in tests. Sometimes we find bugs in OpenStack. There are plenty of reasons why tests may fail, and we actually have people who will come help you with that. One of them is the interoperability engineer at the OpenStack Foundation, whose email address is right here on this slide. So don't panic.

A few links where you can learn more. Our two most recent board-approved guidelines are at the top there, as well as the next guideline draft, and links to all the stuff I showed you in the slides are there as well, including how to submit patches if you're interested in adding capabilities to future interoperability guidelines.

So that's it. Do what I do: hold on tight, pretend it's a plan, and hopefully we'll have some interoperable clouds in the near future. And that's all I've got. Any questions? I think we have about one minute. All right, in that case, thanks for coming. Have a safe flight home.