Welcome to the Systems Engineering Track at DevConf. The next talk is Mix and Match: Resource Federation Across Clouds, by Jeremy and Parle, and I'll let them take over.

Thank you. We are from BU, and this is a collaborative project between the entities you see up there. Today Jeremy and I will present the work done by many others over the past few years on this project.

Clouds have had a dramatic effect on how computing is done these days. Consumers are happy because they have access to computing resources on demand, and it can be cheaper than running and maintaining your own data center. The producers are happy because they can reach economies of scale and automate most of the management process. If this trend keeps gaining popularity, maybe all computing and storage will move to the cloud one day.

Currently, the user signs up with one cloud provider and keeps all their projects there. As a user, you have very little knowledge of the operational process of the cloud, creating a lack of transparency, which hinders your performance because you cannot optimize your workload for the current setup. Moreover, you are limited to the resources provided by that cloud, even if you are interested in better or different resources offered by other cloud providers, since moving your data can be extremely expensive.

Our response to these problems is to implement a mechanism which can stitch together services across clouds, possibly even across administrative boundaries. By stitching together, we mean that a service in one cloud will be able to consume resources from a service in another cloud. This is what we call federation.

At least in the public cloud use case, federation can be used to architect a marketplace of clouds. Stitching together multiple clouds allows the user to choose the best resource from a group of providers, which in turn might drive competition, specialization, and innovation among those providers. We call this model of marketplace the Open Cloud Exchange, which we are pioneering at the Mass Open Cloud.

Beyond a marketplace, federation offers other value to clouds in general. As an alternative to having one large cloud, many clouds could be federated together, and that can give you some security advantages, because the penetration of one cloud does not mean the penetration of all the others. Similarly, a huge cloud with thousands of nodes may be less scalable than much smaller clouds federated together. We know this would help address current scalability issues in OpenStack, because in a federation of clouds, communication only happens when resources are shared. As another example, federation makes collaboration on projects across different clouds possible: physicists from BU and Harvard could work together on a project using quotas from both BU's and Harvard's clouds. Resource federation may also be applicable to edge computing.
You could have several cloudlets at the edge but still be able to administer and orchestrate them as one single larger cloud.

Now that we've talked about the idea we're proposing and why it's important, the next step for our team was to implement it. OpenStack was a natural choice of platform for several reasons. It is open source and has a large community invested in its growth. All the services are modular, with well-defined functionality and APIs. Because of these well-defined interactions and boundaries, it is not a far-fetched idea to have services owned by different entities, and more than one of each service. Moreover, every resource within OpenStack is identified by a UUID, which gives us a probabilistically guaranteed way to uniquely identify resources across clouds. This ended up being useful in the implementation of our solution, and we'll talk more about it later.

Looking specifically at the architecture of OpenStack, we first see that OpenStack is composed of several services, as you can see here: identity, compute, storage, and so on, and they're very modular. There's only one of each kind of service, and each service exposes an API that users and other services can use to communicate with it. Not all services talk to each other: storage only talks to identity, but compute talks to all the others in this picture.

Since an OpenStack cloud provides only one of each service, a user who wants to access another cloud has to manage and interact with the two clouds independently. That is already made difficult by the client tools, because by default they are only designed to talk to one cloud. And it is still not easily possible to share resources between these clouds. The burden of managing resources across separate clouds only grows worse as the user accumulates more projects on different clouds.

So what are the steps towards fixing this problem and offering a more seamless experience? The first step is identity federation. Here's a reminder of what two totally separate OpenStack clouds look like. As a first step towards sharing resources between them, we want communication between their identity services. The OpenStack identity service already has support for federation with a feature called Keystone-to-Keystone. If this feature is present, the identity services of each cloud exchange information so that a user's identity is mapped among that group of clouds. This feature has turned out to be extremely valuable for us, but it wasn't a complete solution. Keystone-to-Keystone gives us a consistent mapping of a user's identity across clouds, but resources in OpenStack are actually owned by something called projects. So we had to invent a way to map projects together across clouds.

We built a mechanism to achieve this on top of the existing identity federation abilities in OpenStack. Sister projects are created in each cloud, by tooling, and those sister projects share a mapped identity. This way the project's identity is carried across clouds, and the owner of resources can be consistently identified. Having that capability lays the groundwork for resource federation, because it allows a project in one cloud to clearly own resources in another.
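To make the sister-project idea concrete, here is a minimal sketch of the kind of tooling that could create the matching project in a second cloud using the standard Keystone client libraries. The project name, the domain, and the trick of recording the mapping in the description are our own illustrative assumptions, not the actual tooling used in the project.

```python
# Minimal sketch: create a "sister" project in a remote cloud that mirrors
# a project in the home cloud. Names and the mapping convention are
# hypothetical; real tooling may record the mapping differently.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from keystoneclient.v3 import client


def create_sister_project(auth_url, username, password, source_project_id):
    """Create a project in the remote cloud that mirrors a project at home."""
    auth = v3.Password(auth_url=auth_url,
                       username=username,
                       password=password,
                       project_name="admin",
                       user_domain_id="default",
                       project_domain_id="default")
    keystone = client.Client(session=session.Session(auth=auth))
    return keystone.projects.create(
        name="physics-shared",            # hypothetical sister project name
        domain="default",
        description="sister of %s in the home cloud" % source_project_id)
```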
Now that we have a solution which meets our needs on top of the existing OpenStack identity federation capability, we can move on to our solution for resource federation.

First, remember again that OpenStack is composed of a bunch of separate services, but there's only one of each kind of service in a cloud. Because users and other services communicate with a service through its API, we can put a proxy in front of each of the services which then forwards requests to the correct cloud. In this picture you can see the proxy; that is our solution. It's a drop-in solution, meaning we didn't have to modify the code of any of the OpenStack services. We didn't have to modify any of the client tools either; we've preserved compatibility with the existing ones.

So how did we implement the proxy? It turns out that all of the OpenStack services have very similar API designs. From the URL structure we can tell what kind of resource is being requested and the ID of that resource, and we can tell that consistently across all these different services. So there was very little service-specific logic that we had to bake into the proxy, and the proxy is actually a very slim piece of code. The only parts which are even specific to OpenStack are related to authentication with Keystone, the OpenStack identity service, and the handling of API pagination and aggregation, which isn't totally consistent across APIs.

So how does the proxy actually do its job now that we have it implemented? First, notice that the user and all of the services now address the proxy instead of each of the services directly. We can enforce this flow by taking advantage of the OpenStack service catalog. Normally the service catalog contains an entry for the API URL of each service in one cloud, but we can substitute each of those URLs with the URL of the proxy.

Now as an example, let's say the user makes a request to the storage service to view or update an existing volume. The proxy is in front of the storage service, so how will it know where to forward the request? First, for performance and for simplicity, the proxy can just look in the local cloud for that resource. But if the resource isn't there, it then has to look in each of the remote clouds, at least the remote clouds where the user already has a project. Remember that every resource in OpenStack is uniquely identifiable by UUID, so there will only be one matching result among all of these clouds. Looking in every cloud isn't particularly scalable, though, because there could be many clouds and each API request takes time. So we've implemented a solution in the proxy where it caches the location of every resource in its database during the creation request. This avoids the unnecessary API calls.

As another example, let's say the user makes a request to the storage service to list all of their volumes. The proxy can use the identity federation to make a request to each cloud on behalf of the user, aggregate the results from each cloud into one list, and return it to the user.

Now let's say the user tells the compute service in one cloud to boot a VM from a volume that's located in a different cloud. The proxy is still in front of the storage service, so when the compute service requests the volume details, the proxy can find the volume by UUID very easily. From the perspective of the compute service, nothing is different: it boots the VM just as if the volume were local. This does assume some amount of data plane connectivity and trust between the clouds in order to expose the storage target for booting the VM.
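The forwarding behavior described above, checking a cache, then the local cloud, then each remote cloud where the user has a project, amounts to roughly the following. This is a hand-written sketch under our own assumptions, not the actual proxy code; the function names, the in-memory cache, and the idea of probing clouds with plain GET requests are only illustrative.

```python
import re
import requests

# OpenStack resource IDs are UUIDs, so they are easy to spot in a URL path.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

# Hypothetical cache of resource UUID -> cloud API endpoint, populated
# whenever the proxy sees a creation request succeed.
location_cache = {}


def find_resource_cloud(path, local_endpoint, remote_endpoints, token):
    """Decide which cloud a request for an existing resource belongs to."""
    match = UUID_RE.search(path)
    if not match:
        return local_endpoint              # no UUID in the URL: keep it local
    uuid = match.group(0)
    if uuid in location_cache:             # cached at creation time
        return location_cache[uuid]
    # Fall back to probing: local cloud first, then each remote cloud.
    for endpoint in [local_endpoint] + remote_endpoints:
        resp = requests.get(endpoint + path,
                            headers={"X-Auth-Token": token})
        if resp.status_code == 200:
            location_cache[uuid] = endpoint
            return endpoint
    return None                            # not found in any federated cloud
```

A list request works the other way around: instead of stopping at the first match, the proxy issues the same request to every cloud and merges the result bodies into one response.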
So this offers exactly the same user experience as for a single cloud. Because of that, we won't actually demo it here today at DevConf, but feel free to talk to us and we can show you what's going on.

Now as a final example, let's say the user wants to create a new volume. How will the proxy know where to forward that creation request? By default the proxy can just choose the local cloud. That might not always be the right choice, so we're working on some more creative solutions to address this, and we'll come back to that later.

So now we've demonstrated a solution for resource federation across clouds, but we haven't specifically addressed network federation, and that turns out to work a little differently. That's because the federation of storage, like images and volumes, involves the direct sharing of those resources, but with network federation we actually have to extend the network across clouds.

To figure out how to do this, we first looked at networking between virtual machines in one of our own OpenStack clouds, just within a single cloud. Our cloud uses VXLAN with Open vSwitch for networking, which is the default choice for OpenStack. In green, you can see the VXLAN mesh created between the compute nodes in our cloud; the mesh is defined by entries in the database of the OpenStack networking service. From there we wanted to find out the least amount of work we had to do for a proof of concept of a network extended across clouds. It turns out that all we had to do to have a network span multiple clouds was to expand the definition of that mesh, so the green mesh now goes all the way across. It was also necessary to ensure non-overlapping VXLAN IDs to prevent conflicts between the two clouds.

The solution is functional, but it is limited. It's limited in that you have to go and manage these database entries, which can be unwieldy. It also clearly requires a high level of trust between clouds, because all the compute nodes have to be networked together. Ideally, we're looking towards having some kind of device at the edge of each cloud which can perform VXLAN termination and forwarding, which would minimize the amount of trust needed between any number of clouds.

After we set up this mesh, the OpenStack networking service can automatically match up subnets and VXLAN IDs across clouds to create the effect of one private network between VMs in different clouds. In our solution, the user just talks to the proxy, and the proxy orchestrates all the necessary VXLAN matching, because that's usually a privileged operation in OpenStack. Currently the user makes an explicit request to the proxy to request this extended network, but we're exploring possible improvements to the user experience.
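To give a feel for what expanding the mesh means on an individual node, here is a small sketch that adds an Open vSwitch VXLAN tunnel port pointing at a node (or edge device) in the other cloud. In a stock deployment the networking service's OVS agent programs these ports itself from the database entries mentioned above, so this only illustrates the end result; the bridge name br-tun is Neutron's default tunnel bridge, and the peer names and addresses are made up. Non-overlapping VXLAN IDs can typically be arranged by giving each cloud a disjoint vni_ranges setting in its ML2 configuration.

```python
import subprocess


def add_vxlan_peer(bridge, port_name, remote_ip):
    """Add a VXLAN tunnel port on the local OVS tunnel bridge that points at
    a compute node or edge device in the remote cloud."""
    subprocess.run(
        ["ovs-vsctl", "--may-exist", "add-port", bridge, port_name,
         "--", "set", "interface", port_name,
         "type=vxlan",
         "options:remote_ip=%s" % remote_ip,
         "options:key=flow"],   # VNI chosen per flow, as Neutron's agent does
        check=True)


# Hypothetical example: extend the mesh from a node in cloud A toward cloud B.
add_vxlan_peer("br-tun", "vxlan-cloudb-0", "10.2.0.11")
```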
Right now, what we have pretty much works. Users can interact with the federated clouds using the existing OpenStack tools without any changes, and we didn't have to make any changes to the core of OpenStack. All the main features are working today; by that we mean you can proxy block storage, image, compute, and networking. But there's still one thing that we haven't addressed.

What happens when you want to create new resources? How does the proxy know where they should go? Right now the proxy can easily determine the location of existing resources, but it has no way of knowing where to create new ones. You can bypass the proxy and create resources on another cloud by explicitly going into OpenStack and saying, okay, I want this resource on cloud B, because I know the name and details of that cloud. But there are still some problems with this. In the end, we want the proxy to know about the creation of these resources, because it may want to cache their UUIDs and save them in its database. And of course users have preferences about where resources should be created, so there should be a mechanism for users to specify where they want to create them. Moreover, generalizations can be made based on certain properties of a cloud, so that you choose a cloud for the properties it offers. Right now the OpenStack client tools also don't have full support for informing the proxy about where to create resources.

In order for the proxy to be aware of where to create resources, it may be useful to define or gather creation preferences beforehand. As a first step, we are building a UI where the user can add preferences for where certain resources should be created. As a simple example, the user might always prefer the cheapest cloud for storage, or they may want to keep their quota balanced across a group of clouds. Using the UI, the user will be able to let the proxy know about these preferences. Ultimately, we are hoping that the proxy could learn where best to create new resources. This is a very open topic and there are a lot of research opportunities here; maybe machine learning could later be used so the proxy learns where the user would like resources to be created.
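As a rough illustration of what such placement preferences could look like once the proxy consults them, here is a small sketch. The preference names, the policies, and the per-cloud attributes are all hypothetical; today's behavior simply defaults to the local cloud.

```python
# Hypothetical per-project placement preferences the proxy could consult
# before forwarding a creation request; all names here are illustrative.
preferences = {
    "volumes": {"policy": "cheapest"},        # always the cheapest storage
    "servers": {"policy": "balance_quota"},   # spread quota usage evenly
}

clouds = {
    "cloud-a": {"storage_price": 0.02, "quota_used": 0.80},
    "cloud-b": {"storage_price": 0.05, "quota_used": 0.30},
}


def pick_cloud(resource_type, local_cloud="cloud-a"):
    """Choose a target cloud for a creation request, defaulting to local."""
    pref = preferences.get(resource_type)
    if pref is None:
        return local_cloud
    if pref["policy"] == "cheapest":
        return min(clouds, key=lambda c: clouds[c]["storage_price"])
    if pref["policy"] == "balance_quota":
        return min(clouds, key=lambda c: clouds[c]["quota_used"])
    return local_cloud
```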
All of this is still in its testing phase, and we are planning to move it to a more production-like environment at the Massachusetts Open Cloud, with real workloads and real users. Once it's in a production-like environment, we would like to validate the performance of certain functions, for example attaching an image from a local cloud versus a remote cloud, and what overhead that actually entails for this system. As I already said, there are a lot of research areas in this project, and it's a very exciting place to be. Thank you, and now we're open for questions.

Have you thought about federating public clouds like Azure and Alibaba?

The problem with that is that our solution assumes the API of each cloud is fairly similar. So it could be possible, but not easily. Do you want to say something? This is really about how you create one larger cloud out of local entities, and it relies on OpenStack, where the mechanism is already there. We're not focused on the federation people do purely at the client level; this is at the provider level, and I don't think one larger cloud is necessarily more efficient.

How do you resolve authorization on each cloud? If someone owns one cloud and someone else owns the other, how do you set up your sister projects so that they both know about a single user, instead of assigning authority to your proxy or to one person? You somehow have to make them reach a consensus to authorize a single person on both clouds.

Identity federation really just solves the identity of the user. We're working on a mechanism that allows the user to sign up for services offered by other service providers. During this process, the user authorizes the creation of a new project, and that project is tied to their identity, or rather to the project which they are trying to extend into that other cloud. So that's how it works.

Actually, I have a question. You spoke about network federation and extending VXLANs, but there is a limit to how many nodes you can have on a VXLAN, especially when you're talking about mixing BU's cloud with Harvard's cloud with MIT's cloud. Each would have its own VXLAN IDs, with its own convention of numbering them from this number to this number. When you're talking about federating such large clouds, how does that scale?

If I understand the question right, the solution would be that in our case we use Open vSwitch, which is software-defined networking, so instead of having every node talk to every other node, we can have just the two Open vSwitches talk to each other.

Okay, so how would Open vSwitch know that? Would it rename all the nodes? Because you said you can't have the same ID for two different nodes.

The VXLAN ID belongs to a network: one network has one ID. It's not that one connection has one ID; the whole network has one ID. Go ahead. So again, the mesh was the trivial way to get things working with VXLANs, but irrespective of what solution you have, what Jeremy talked about was having an agent that terminates the networks on one side and that the proxy talks to explicitly, to say: anything that comes in on this network, pop it over onto this other network. If you have these agents sitting at the edges of these clouds, this is an internal, provider-to-provider arrangement, and it gives you a very scalable approach. Maybe going back to your earlier point, or what Perot was saying: if you have a cloud that you're trying to scale up to massive scale, you actually do have to deal with some of these solutions anyway. The only time you're actually stitching a network across clouds is when you've got a project in multiple clouds, so it gives you a natural way to build something more scalable.

Can you tell us a little bit more about the Mass Open Cloud?

Sure, the question was, what is the Mass Open Cloud? This is a project we've been doing, and I guess Peter and I are the PIs of it, for the last four years. All of today's public clouds are really proprietary, single-provider clouds stood up by one entity. Mike will be presenting our operational experiences on it tomorrow. Our philosophy from the beginning was: could we create a cloud where multiple entities offer different parts of the cloud, but what is offered is actually one larger cloud? There are other aspects of this, like hardware and being able to reprovision it between different services, but the mix-and-match, or resource federation, work is a key part of it.
We benefited from the fact that this region had actually built a large data center, the MGHPCC data center. It's 15 megawatts, with two acres of rooms for computers. It was built by five entities, MIT, Harvard, BU, Northeastern, and UMass, and that gave us an incredible opportunity to create a shared cloud for the entire region. It's all open source, and it's all open for researchers. It's the only cloud out there where researchers or companies can get involved in operating part of the cloud, in pulling the metadata and telemetry out of the cloud, in optimizing their products, or, for researchers, in understanding what the heck is going on inside a cloud. It's been incredibly successful in terms of research. I think there's something like $20 million in different grants by different people on different aspects of the cloud, and Red Hat has been working very closely with us on it, as well as Intel, Two Sigma, and a bunch of other companies that are core partners of the initiative, along with the institutions we talked about.

In fact, what's been really cool about the project is that in open source today, again, this is about evolving: there is no CD for OpenStack, and there is no CD for many of these products. Red Hat gained enormous operational experience from operating the products with us in the Mass Open Cloud, because they've worked with us so closely. As a result of that, we were really excited yesterday to announce Open Data Hub, a new data lake platform Red Hat is doing, which is starting off on the MOC and being exposed to users before being baked into products. We think that's actually the future of how open source will develop: by offering services for real and then figuring out how to productize them.

Right, that was a very long answer. Any more questions? Okay. Thank you.

Here at DevConf, today we are having a party. It'll be either at the BU Beach or in the Ziskind Lounge, depending on the weather. If you wish to attend, you'll need to pick up your tickets at the registration desk. Thank you.