OK, so good morning, or good afternoon, everyone. We're just going to take a second here to introduce ourselves and introduce the talk in general. Hopefully everyone is here for "When One Cloud is Not Enough: An Overview of Sites, Regions, Edges, Distributed Clouds, and More." Sorry about the long title, but there's a lot of different terminology, and people often use different words to describe the same thing, so that's how we ended up with it. The main point is that if you're deploying more than one cloud, you have a lot of decisions to make. What are those decisions, and how do you make them? Personally, when I go back home to Edmonton, Alberta, Canada, I've got some thinking to do about how to do tens, or even 40 or 50, clouds for the same organization. So that's what we're here to do today: give you an overview of that information.

My name is Curtis Collicutt. I'm an OpenStack architect, so to speak. I'm working in a working group within the OpenStack community called the OpenStack Operators Telecom/NFV Working Group, and we're working to help people like myself do their day-to-day work, not as a telecom operator but as an OpenStack operator, so you don't get too confused with those terminologies either. And I work at a company called Interdynamix, which is out of Edmonton in Canada.

Hello. I'm Chaoyi Huang. I'm the current PTL of the OpenStack Tricircle project, and I'm also the PTL of the OPNFV Multisite project.

Good afternoon, everyone. My name is Adrien Lebre. I am the chair of the new Massively Distributed Clouds Working Group, which deals with fog and edge computing challenges. I'm also the chair of the Discovery initiative, an open science initiative that tries to revise OpenStack in order to make it fog- and edge-computing compliant.

So as Curtis introduced previously, the goal of our session is to try to clarify the major pieces of the OpenStack ecosystem when we talk about multiple-location clouds. These are concerns and challenges we share between the NFV Working Group and the Fog/Edge Massively Distributed Clouds Working Group; we have this common interest in multi-site, multi-location deployments. The goal of the talk is to give an overview, if possible an exhaustive one, of all the solutions and all the building blocks that are available in the OpenStack ecosystem to operate such a multi-site deployment. Then we'll focus on the OPNFV side, I will give a short overview of what we are doing right now in the Fog/Edge Massively Distributed Clouds Working Group, and finally Curtis will conclude this talk by giving the takeaway message.

So when we started to prepare this presentation, we tried to identify keywords that have some link with this notion of having several clouds, with the goal of operating them through a single OpenStack. We put down some words we had in mind, and even this morning I heard new words that I could add to that slide: this morning we had a great talk from the eBay folks presenting their TessMaster solution, and we also had the talk from Verizon. So we could keep adding new words to this slide, and the goal of our talk is to try to clarify and simplify all this. To this end, we propose to use three use cases as a common thread: a simple one, where you want to provide a vCPE solution for the NFV world.
Then some use cases around fog and edge infrastructure. And finally, what we call cloud federation, which is something that is quite useful for academics. The goal of our presentation is to identify the building blocks that can cover these different use cases. Mostly, the idea is to check how you should deploy the main OpenStack components across the different locations in order to be able to orchestrate, operate, supervise, and use such distributed clouds.

That's the whole picture when you present the architecture of OpenStack. What you should consider is that it's not just five components that have to be distributed across different locations; it's a whole set of mechanisms, a set of daemon processes and databases that you have to deploy across the different locations in order to operate your clouds. The goal of our talk, once again, is to give you some premises, some hints about what you can do.

As I mentioned, there are different solutions to cover these three use cases. We decided to divide them into two classes: the first one is the solutions that are purely OpenStack-based, and the second one is the other solutions you can find, for example, on the OPNFV side. We are going to start from a simple and naive solution, where you put all the control plane in one site and only deploy compute nodes remotely, and these compute nodes can run VMs, containers, or bare metal. Then we are going to increase the complexity by introducing the different building blocks.

To facilitate the discussion, we define here the key elements we are going to use during the talk. I guess everyone is familiar with the DC: a DC for us is one facility that delivers compute, storage, or network resources in one location. This DC can be a micro or nano DC, which means just a couple of servers, or it can be a mega DC, where you can find thousands of servers. When you deploy DCs across different geographical locations, obviously you have wide area network connections between them. As I mentioned, you can have a fog or edge computing infrastructure. And to simplify the architecture of OpenStack, we are not going to dive into details: we are just going to group all the services that are necessary to run an OpenStack infrastructure under the name "Services", which includes Nova, Neutron, and so on, and we are only going to make a distinction for Keystone.

So as I mentioned, the first way to operate a multi-site deployment is to keep all the control plane in one centralized site and deploy only the compute nodes remotely. This covers, for example, the vCPE use case for the NFV world. The idea is to check how you can orchestrate such an infrastructure. It's quite simple: you deploy OpenStack as you are used to deploying it in one location, and the RabbitMQ, for example, is shared across all locations. If you want, you can segregate this infrastructure by using availability zones and host aggregates. So once again, in the first DC, the master DC, you will find Keystone, all the usual services, Neutron, Nova and so on, and the APIs; and on the remote sites you will find only the compute nodes. This deployment is quite simple, it's the usual one, and obviously it has a lot of cons. For example, what about the network?
What about the impact of latency? What happens when my compute nodes talk to my remote services, and vice versa? What are the bandwidth constraints? What about Neutron: do you need to deploy DVR? What about Cinder: what happens if I want to attach a remote volume? Does it mean all my traffic will go through the wide area network links? What about high availability? What about security management? So you have a lot of cons, and maybe the first one is scalability. The idea will be to check how you can segregate your cloud in order to deal with the scalability issue. Curtis?

OK, thanks, Adrien. Before I get too far into this, I just wanted to say thanks, everybody, for coming. And also, it's a pleasure to work with Adrien and Joe (Chaoyi), and to work with people from around the world, from China and France. It's a real pleasure to be able to do this, and I think that's a big strength of the OpenStack community.

Another thing I wanted to point out is that we're not necessarily prescribing any solutions today. We're just giving you an overview of all the potential pieces, the things you could do if you wanted to deploy more than one cloud, which is what I would imagine most of the people in this room will end up doing in some fashion. I also want to mention that some of the diagrams are very simplistic, at least my diagrams are, because in some cases we tried to keep them as simple as possible. So they're not necessarily representative of what you would do in a real production deployment; they're just abstractions you can take a look at. Another point I wanted to make is that we want to do a lot of different things with OpenStack: we want to be able to deploy very small clouds, and we also want to be able to manage very large clouds. There's a lot of work to do if we're going to deploy a couple of hypervisors in 10,000 places versus 10,000 hypervisors in one place; those are pretty different use cases, but we're trying to support them all.

So Nova cells is a method of segregation, at least that's the term we're using in this discussion: a way to break down some of the failure domains and also make things more scalable at the same time. Nova has had this concept of cells for quite a while, and recently version two has been created, which is actually going to be the default for every OpenStack deployment in Ocata and above. What this does is essentially allow you to move the Nova database out of the centralized API layer. So you can have a centralized Nova API database, and then you can create a cell made up of a smaller number of compute nodes and move that cell's database into the cell itself. You can do the same thing with the messaging queue, and that allows you to reduce the size of the failure domain and make things more scalable. It also adds a couple of additional features: you can use it as a grouping mechanism, and it's helpful when you're deploying more hypervisors, because you can add a cell, test it out, and then, once you're completely sure it works, bring it into the entire system.

So this is an example of what services would be running in a particular cell. In these diagrams, I use little smiley faces for the end user. In this example, we just have the one endpoint, but we can have multiple cells.
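As a minimal sketch of what that looks like operationally (the per-cell pieces are enumerated next), each cell brings its own message queue and database, registered with the central API database. Every name, URL, and credential below is a placeholder, and this assumes the standard nova-manage cell_v2 create_cell workflow from the Nova documentation; it only prints the commands an operator would run by hand.

```python
# Illustrative sketch only, not a production layout: each cell gets its own
# RabbitMQ and MySQL endpoints; hostnames, passwords, and names are placeholders.
cells = {
    "cell-edge-1": {
        "transport_url": "rabbit://nova:secret@mq.edge1.example.com:5672/",
        "database_connection": "mysql+pymysql://nova:secret@db.edge1.example.com/nova_cell1",
    },
    "cell-edge-2": {
        "transport_url": "rabbit://nova:secret@mq.edge2.example.com:5672/",
        "database_connection": "mysql+pymysql://nova:secret@db.edge2.example.com/nova_cell2",
    },
}

# Print the registration commands rather than running them, to keep the sketch
# side-effect free; an operator would execute these on the API node.
for name, conf in cells.items():
    print(
        "nova-manage cell_v2 create_cell"
        f" --name {name}"
        f" --transport-url {conf['transport_url']}"
        f" --database_connection {conf['database_connection']}"
    )
```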
In those cells, we have the message queue, the actual nova-compute and nova-conductor services, the Nova database, and then the hypervisors themselves. So that gives you an idea of what services are running where. As I mentioned, there are some pros and cons with this. We can reduce the size of failure domains, it's helpful for scaling Nova, and when used with other tools that are becoming available in OpenStack, it becomes a very powerful piece of technology. However, it's also kind of new, at least V2, and I'm only talking about V2 here, but it's being worked on a lot. And because it's the default, it's going to get a lot of use, right? That was one of the problems the developers mentioned with version one: not everybody had used it. One of the things we're going to come back to in this presentation is that we want to be able to use the same code paths and configuration that everybody else uses; the more you stick to the mainline stuff, the more able you are to upgrade and maintain your systems over time. So it's great that Cells V2 is coming out and is going to be the default, so that we all use it. It's just a great example of how to segregate, scale, and separate out important pieces. And when you use it with technologies such as what I believe are currently called routed provider networks, you can really start to see how you can segregate your network and your compute nodes, and it becomes very powerful. There are also several large users of version one, such as CERN and NeCTAR, among others.

As we progress through the presentation, we're going to go through different technologies and methodologies. We talked about a baseline deployment where you centralize everything and you're willing to put compute nodes and other pieces across the WAN. Then we talked about cells, and now we're going to talk about regions, which is sort of where my background is, because previously I worked to deploy a public cloud in Canada, and regions made a lot of sense for me at the time: with a public cloud you might only have one or two or maybe three regions, and regions make a lot of sense when that's the kind of number you're looking at. Generally speaking, what you do is share the Keystone database. You can also share a couple of other pieces of OpenStack, like Glance and Horizon, but generally speaking it's Keystone. You share that across each of your regions, usually over some sort of private, secure, wide area network. But there are some limitations with that: it basically just gives you authentication and authorization, but it doesn't generally give you quotas and SSH keys and images and all these extra pieces of functionality that you need in order to run two or three regions, let alone 40 clouds.

So this is the start of some of the diagrams I'm going to use to discuss this. The first option is to simply have one Keystone and put it in one place, and all of the other clouds and data centers you have go over the WAN and talk back to that centralized Keystone. So this is a potential deployment methodology that you could use.
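To make that shared-Keystone picture concrete, here is a minimal sketch assuming the openstacksdk client library, with a placeholder auth URL, credentials, and region names: one set of credentials authenticates against the single central Keystone, and region_name selects which regional cloud's services you actually talk to.

```python
import openstack

# Hypothetical central Keystone shared by all regions; the URL, credentials,
# and region names below are placeholders for illustration only.
AUTH_URL = "https://keystone.central.example.com:5000/v3"

for region in ("region-one", "region-two", "region-three"):
    conn = openstack.connect(
        auth_url=AUTH_URL,
        username="demo",
        password="secret",
        project_name="demo",
        user_domain_name="Default",
        project_domain_name="Default",
        region_name=region,  # same identity source, different regional endpoints
    )
    # List servers in this region to show the per-region service catalog in use.
    print(region, [server.name for server in conn.compute.servers()])
```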
With that first option, you could do all of your creates and updates against the central Keystone, and only do reads at the endpoints in the other data centers, not updates. As I found out while we were researching this presentation, one way to accomplish that would be to change some of the policy JSON files in the other clouds to say: you're not allowed to do creates and updates here, but you can do them against the centralized Keystone. So this is one methodology. I should also note that I'm not showing any HA capability; you might be running a cluster of three MySQL Galera instances to manage this Keystone, but I'm just not showing that right now.

Another possibility is to still have that primary, centralized Keystone, but to have secondary databases in the other two clouds that are asynchronously replicated to the other regions. You still do all of your creates and updates in the centralized region, and you can do reads in the other ones, but you have a secondary copy of the centralized MySQL database asynchronously replicated to the other regions. This is another model, and in some cases it might be the best one if this is the kind of thing you want to do, because it still gives you high availability of the Keystone database in each region, but you don't have the shared, fully synchronous cluster across all regions, which can be brittle (maybe brittle's not the right word). This is something the OPNFV Multisite team looked at: is there a recommendation for deploying multiple clouds with this particular methodology?

And then the one that's kind of near and dear to my heart, because this is what I would have done in a public cloud, is to have a shared database cluster across all regions. Historically this is the most common regional deployment: a Keystone MySQL Galera cluster that is shared across all of your regions and is synchronous, so you can read and write to all of them.

So, some pros and cons. What this gets you, no matter which of those ways you deploy it, is shared authentication and authorization across all these clouds, and from my standpoint, with my history of doing public clouds, it looks a lot like what you would expect a public cloud to look like: different regions, different endpoints, and that same shared authentication and authorization. But in order to do this you have to make some decisions about your architecture. Do you want to do centralized? Do you want to do asynchronous replication? Are you going to do the more commonly used synchronous method with a large Galera cluster? How many data centers do you have, two or three? Are you going to use a Galera arbitrator? There are some technical decisions to be made here. You also typically need some kind of secure, private wide area network over which you can perform all these database operations in a secure manner. However, and this is where we started to get into the real problem: when I was working on a public cloud, two or three regions was fine; that's probably as many as we needed, even in a country as big as Canada.
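Going back to the policy.json idea from the first option, here is a minimal sketch of a read-only overlay for the secondary Keystone endpoints. The exact set of rules to lock down is deployment-specific and this file is only an illustration; in oslo.policy, "!" is a rule that matches nobody, so these operations would be refused at the secondary endpoints and performed only against the central Keystone.

```python
import json

# Hypothetical overrides for the policy file of the *secondary* Keystone
# endpoints: deny a handful of identity write operations there, so creates
# and updates only happen against the central Keystone.
READ_ONLY_OVERRIDES = {
    "identity:create_user": "!",
    "identity:update_user": "!",
    "identity:delete_user": "!",
    "identity:create_project": "!",
    "identity:update_project": "!",
    "identity:delete_project": "!",
}

with open("keystone-read-only-overrides.json", "w") as handle:
    json.dump(READ_ONLY_OVERRIDES, handle, indent=2)
print("Wrote", len(READ_ONLY_OVERRIDES), "read-only policy overrides")
```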
But where this really breaks down is when we get into some of the other use cases, like the telecom ones: we're not talking about three clouds, we're talking about 10, 40, 100, and sharing a Galera database across that many regions is just not going to happen. The asynchronous approach might work in that situation, but in general it's probably not the best solution for a large number of clouds. Also, obviously, as we've discussed, this doesn't really cover things like quotas and networking and images, so you still have to do more work on top of it. And doing this can also make it more difficult to upgrade clouds, because you have to maintain a similar version of Keystone across all these systems, and of course you have the additional operational complexity of running all this.

In this next example, in a way it looks simpler, and it kind of looks like we're going backwards, but what we're saying here is that in this use case we're just going to have completely separate clouds. They're all individual: there's no shared Keystone, there's really no shared anything. And while this looks simplistic, it's actually probably one of the more interesting or complicated versions, because we have to think about having all these different clouds and then probably putting some layer of abstraction on top of them, some sort of cloud management system. There's a lot of interesting work that could go into this, and we might have time to talk a bit more about that particular use case.

I also wanted to talk about OpenStack federation, which is another technology that can be used for identity, for authentication and authorization. It was mentioned by Adrien at the beginning: this is a common desire, usually for non-profit or academic organizations, who have a cloud and want to be able to use somebody else's cloud, or share their resources across different clouds. What it does is essentially allow you to establish some kind of trust with another cloud. So you could say: I have users in my cloud, and another cloud is going to trust those users to use resources in its cloud. The canonical example is: say we had a Canadian OpenStack cloud and a European OpenStack cloud, and they wanted to share resources and allow users in one cloud to access resources in the other, and so forth. Federation is how you would typically do that. In this example, users of the Canadian OpenStack cloud would be able to access resources in the European cloud, using what's called Keystone-to-Keystone federation. What I like about federation is that it seems to me like a very good way to do some sort of centralized identity, right? In this example we're talking about multiple organizations, but even within one organization you could use federation as a centralized identity system. The pros are that you get shared authentication without the need for a shared Keystone database; you don't have to synchronize everything. The cons are that you have to understand it: there are some complex configurations, there's a lot of work you're going to have to do to get a good understanding of how it works and how to map users to groups and things like that, and there may be some additional work to do in terms of cleaning up resources.
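Whether the clouds are completely separate or tied together by federation, the client side of that "layer of abstraction on top" can start out very simply. A minimal sketch, assuming two named clouds ("canada" and "europe") are defined in a local clouds.yaml for openstacksdk; the cloud names and the choice of resources to inventory are illustrative only.

```python
import openstack


def inventory(cloud_names):
    """Loop over independently-operated clouds, one connection per cloud."""
    for name in cloud_names:
        # Each cloud has its own Keystone; credentials come from clouds.yaml.
        conn = openstack.connect(cloud=name)
        servers = list(conn.compute.servers())
        images = list(conn.image.images())
        print(f"{name}: {len(servers)} servers, {len(images)} images")


if __name__ == "__main__":
    # "canada" and "europe" are assumed clouds.yaml entries, not real endpoints.
    inventory(["canada", "europe"])
```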
So with that I'm going to hand it over to my colleague.

Hello, I will talk about a typical OpenStack multi-region deployment. We have built one environment in OPNFV for a multi-site cloud. In this multi-site cloud we use a shared Keystone to serve three OpenStack clouds. The shared Keystone is installed in region one, and we also have the other regions. The central Neutron with Tricircle is running on a machine provisioned in region one, and this central Neutron acts as a central region in the multi-region cloud. In this environment we also provide cross-version compatibility: we use Newton-version OpenStack clouds, but for the central Neutron with Tricircle we use the Pike version, the latest code.

This environment is to support VNFs, the telecom applications, in achieving cloud-level high availability. Two applications are planned to be deployed in this multi-region cloud. One application is called vIMS, which provides VoIP service; the other is a video conferencing application. This demo will be shown at the OPNFV Beijing Summit in June, and right now we are doing the application onboarding. To realize cloud-level high availability, the vIMS will be deployed into three regions: the non-HA components of the vIMS will be deployed in region one, and the vIMS high-availability components will be deployed in region two and region three. If one region crashes, or one region is in planned or unplanned downtime, then the vIMS in another region can still provide the service to the end user. This supports application-level high availability across clouds: even if one cloud is not so highly available, or can only provide, say, three nines, at the application level the application can still deliver five-nines service.

To realize this cloud-level high availability, one requirement is to provide networking between the HA components, because the data in the HA components is shared across the different regions. So east-west networking is needed for the application, and at the same time the application provides its service in each region respectively, which means that in each region OpenStack needs to provide floating IP service to the application; in other words, the north-south traffic is handled in each OpenStack cloud separately. To build this networking topology, Tricircle helps get it done.

Tricircle has two parts. One is the local plugin, installed in the local Neutron: for example, in region one, region two, and region three we install the local plugin, and in the central Neutron server we install the central plugin. So when you create the networking topology, you create the topology in the central Neutron server in step one. Then you can boot virtual machines through Nova in the different OpenStack regions, using the networks provisioned in the central Neutron. Nova talks to the local Neutron, the local Neutron plugin intercepts the request and talks to the central Neutron, and the central Neutron with the Tricircle central plugin coordinates the multiple OpenStack Neutron services to establish the network between the different OpenStack clouds.

There are four basic networking elements in Tricircle. One is the local network: a local network can only exist in one OpenStack cloud. A local router likewise only works in one OpenStack cloud. A local network can be a VLAN, VXLAN, or flat network.
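As a sketch of the mechanism described next, here is roughly what creating those networks against the central Neutron could look like with openstacksdk. The clouds.yaml entry, network names, and region names are placeholders, and the interpretation of the availability zone hints (one region name for a local network, several or none for a cross-OpenStack network) is Tricircle's, as explained below.

```python
import openstack

# "central" is an assumed clouds.yaml entry whose network endpoint is the
# central Neutron fronted by the Tricircle central plugin.
central = openstack.connect(cloud="central")

# A single region name in the availability zone hints: a local network,
# confined to that one OpenStack cloud.
net1 = central.network.create_network(
    name="net1",
    availability_zone_hints=["RegionOne"],
)

# Several region names (or none at all): a cross-OpenStack L2 network that
# Tricircle stretches across the listed clouds, e.g. for use as a bridge network.
bridge = central.network.create_network(
    name="bridge-net",
    availability_zone_hints=["RegionOne", "RegionTwo", "RegionThree"],
)
print(net1.id, bridge.id)
```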
So when you create a network or router in the central Neutron server, if you specify a single region name in the availability zone hint, then you create a local network or local router. Beyond the local network and local router, Tricircle provides the magic networking element: the cross-OpenStack L2 network. A cross-OpenStack L2 network means the network can be stretched across multiple OpenStack clouds, and it can also be a flat, VLAN, or VXLAN network. Finally, there is the non-local router. A non-local router is one logical router that is distributed across multiple OpenStack clouds, and the distributed parts are interconnected by one cross-OpenStack L2 network, which we call the bridge network. So this logical router consists of multiple routers plus a bridge network. When you create a network or router, if you specify more than one region name or availability zone name, or no parameter at all, then a cross-OpenStack L2 network or a non-local router is created.

So let's look back at which networking elements are used in the vIMS network. The bridge network is a cross-OpenStack L2 network. R4, with its instances in regions 1, 2, and 3, forms one non-local router; this non-local router is for the east-west traffic. Network 1, network 2, and network 3 are local networks; these are the networks the instances attach to. Routers R1, R2, and R3 are local routers, and even external networks 1, 2, and 3 are local networks. So with this topology, automated by Tricircle, east-west traffic between OpenStack clouds and north-south traffic within each OpenStack cloud can both be supported. Besides the topology we just showed on the last slide, a cross-OpenStack L2 network can also be attached to instances directly, and a non-local router can be attached to an external network, so you can organize one tenant's network to centralize the traffic path. We have one onboarding session tomorrow afternoon; if you are interested in this project, please join us.

Another project dealing with multi-region issues is Kingbird. Kingbird mainly provides multi-region resource synchronization, for things like SSH keys, flavors, or volume types, and it also provides a function to manage multi-region quotas for one tenant. This project was part of OPNFV Multisite, but now it's moving out of the Multisite project. Okay, so next will be fog and edge. Thank you.

Yes, so thanks. I'm not going to dive into too many details, especially because I hope we'll have a couple of minutes for questions at the end. The takeaway message on this slide is that within the Massively Distributed Clouds Working Group we are not addressing the federation question; rather, we are investigating how OpenStack can natively orchestrate fog, edge, and massively distributed cloud infrastructures. If you are interested in many more details about the vision of the working group, please join us tomorrow afternoon at the BoF session that was announced this morning by Jonathan; I will give you a lot of details about what we are doing in this working group. The exercise we tried to do today during this session, identifying all the pieces and components that can be suited to operating massively distributed clouds, is the action we are focusing on right now. What we are doing at the moment is implementing performance analysis tools to get results.
For example, when you deploy regions, what performance can you expect? If you choose cells, what will the performance be? If you choose federation, what will the performance be? That's another session we are going to present on Wednesday afternoon, and obviously I encourage you to join us; I will be pleased to give you more details. So I think it's time to conclude. Please.

Okay, so we just have a couple of seconds to conclude here. We've discussed quite a few different solutions. There are a lot of rich, complex authentication options. Some of the components of OpenStack have really great segregation methods, and that's something I'd personally like to see in some of the other OpenStack projects: the ability to segregate, to do cell-like things in other projects. Regions, which are sort of my historical method of using multiple clouds, probably don't work in deployments where you have 10, 20, 100, or even 3,000 sites, though there's some work being done around that. I do think that centralized identity using federation might be a more powerful tool than we currently realize. As Adrien was saying, distributed clouds are a very important use case. We also know that we can't always have a full HA control plane in every deployment; we just can't do it like that. If you have 10,000 locations that you need to put hypervisors in, and you want to do it with OpenStack, well, how do you do that? We're going to need some sort of distributed methodology. And then we talked briefly, we don't really have enough time, about multiple completely separate clouds. What do you manage those with? In the ETSI NFV world, they've defined this management and orchestration layer called MANO. Is that what's going to do all of this work, or something like it? And as an OpenStack community, is that something we're interested in, or would we like to see a different solution? And then, as you can see, there's the, oops, is that not going forward? But yeah, I think that's it for our presentation. If we have a minute for a couple of questions, it'd be great to hear something.

Thank you for the presentation, it's really good. What I saw, and in my experience too, is that the biggest challenge is the network blueprint design for a multi-site deployment and tying the sites together. What we saw our friends here design is three networks: one is the external net for north-south traffic, one on the bottom for east-west, and one in the middle mainly for intra-region tenant workloads, right? In that sense, most of the service providers, at least the telcos, will not like the idea of extending layer-2 broadcast domains across multiple geographical regions, okay? This is a hurdle. If that's the case, did you ever consider using only layer 3, as in the case of Calico, to implement communication between sites, even for east-west traffic? Thank you.

A layer-2 network is not necessarily required for inter-cloud networking. Even for vIMS, in fact, it's all layer-3 networking. The layer-2 network only works for the bridge network. It's only for the bridge network, yeah. And in the future, the bridge network could be built with other technologies, like a VPN; I think it's pluggable how you interconnect the different routers. Okay, well. Actually, VXLAN is only one of the options for bridging routers in different OpenStack clouds. Okay, and I think we're out of time.
Thank you very much. If you saw something wrong in the presentation, come talk to us. Let us know if you have questions, and thank you very much. Thank you. Thanks.