So, good afternoon, everyone. Welcome to this new session. We're going to discuss OpenStack WAN-wide deployments, that is, deploying OpenStack over wide-area network links. This is work we did during this last cycle within the Fog/Edge Massively Distributed Clouds working group, together with some folks.

So, hello, everyone. My name is Adrien Lebre. I am the chair of this Fog/Edge Massively Distributed Clouds working group, and I'm also leading an action called the Discovery Initiative, which deals with the same issue of investigating how OpenStack can operate fog and edge infrastructures.

Hello, everyone. My name is Ronan-Alexandre Cherrueau. I am a research engineer working at Inria, and I'm mainly involved in the Massively Distributed working group and in the Performance working group.

Good afternoon. My name is Pierre Riteau. I'm from the University of Chicago, where I'm the lead DevOps on Chameleon. Inside OpenStack, I contribute to the Scientific working group, and I'm a core reviewer for the Blazar project.

So, the agenda of the talk. We are going to have three parts; maybe the two most interesting are the last two. The first one will just be a reminder of what we presented on Monday: why we need OpenStack WAN-wide. Then we'll focus on the real content of this talk, which is the EnOS framework and how we can perform performance analysis of OpenStack WAN-wide.

So, why do we need OpenStack WAN-wide deployments? I'm sure most of you heard the keynote on Monday by Jonathan. Basically, we are currently moving toward a new paradigm called fog and edge computing, and this paradigm can be seen as a massively distributed cloud. You can have different kinds of massively distributed clouds. You can have the academic ones, where you federate different cloud infrastructures in order to provide more resources to academics. You can have the clouds targeting the vCPE use case, typically clouds for telcos. And then you have this new trend around fog and edge infrastructures, which is related to the interconnection of IoT with data centers. You can obviously combine all these different models, but basically that is the model that drives our work.

The scenario we chose to investigate for this presentation is the first one. It's quite a simple one: you keep the whole control plane, all the control services, in a central data center, and then you deploy your compute nodes remotely. This is quite basic; there are more advanced strategies, which we presented on Monday. If you didn't attend that session, I encourage you to have a look at the video.

So, once again, the goal is to take one scenario and investigate it from the performance perspective. The questions we would like to address are: what is the impact of latency and throughput on OpenStack? Can we characterize the messages that are exchanged between the different sites? How can we perform such experiments across different OpenStack versions, and how can we deal with the complexity of OpenStack? Once again, just to illustrate that complexity, keep in mind that OpenStack is a large piece of software composed of several services, and the question is how you can deploy those services across different geographical regions.
So, what we proposed and did during this last cycle is to develop a sandbox for conducting performance analyses of OpenStack. I'm really talking about OpenStack, not about DevStack. The idea is really to be able to evaluate the performance of a real OpenStack deployed on real machines. The framework we developed is called EnOS; it's an experimental environment for OpenStack. The motivation, as I said, is to conduct performance analyses in a scientific and reproducible manner. Maybe I forgot to mention it, but we are all researchers, or at least we work with researchers. We want to be able to perform these experiments at small or at large scale, under different network topologies. That means we want to be able to emulate any kind of infrastructure: wide-area network links, wireless links, and other possible kinds of links. We want to do that across different releases, and we want to do that with any kind of benchmark. We built our system by leveraging the Kolla container-based deployment solution, OSProfiler, Rally and Shaker, and the workflow of this framework follows three steps that will be introduced by Ronan. Thank you.

So, EnOS is a tool to test and measure different configurations of OpenStack. The first thing you have to do with EnOS is to describe the OpenStack services you want to measure and the topology of these services, so that you can then call EnOS and it will deploy that OpenStack over your testbed. You can see here a really simple, really basic description of resources, where you are saying: on my testbed, I want one control node for all control services and one network node for all network services, both running on cluster A, and on cluster B I want to pick 50 compute nodes. Next, when you run an EnOS deploy with this description, EnOS deploys an OpenStack that truly follows it.

OK, this description is a really basic one, but in the Massively Distributed working group we want to try configurations that are fancier. In general, we want to isolate some services and do some replication, and with EnOS it's really easy: you just give the names of the services you want to isolate and a number of resources, and when you perform an EnOS deploy, for example here, you will end up with five more nodes that run nova-conductor. And because we love being hard on OpenStack, we also added something we call network constraints: the idea is that you can define some logical groups and define constraints on the network communication between these different groups. You can define delay, rate and packet loss.

How does all this work under the hood? When you have this description, EnOS relies on the notion of providers. The provider gets nodes on the testbed, so here we get two nodes on cluster A and 50 nodes on cluster B, and then EnOS will SSH to these nodes to install the Docker daemon. It can then call Kolla to install all the OpenStack services on these nodes. Following that, EnOS sets up the bare necessities, that is to say an image, a router, some networks. And it ends by applying the network constraints using netem. We currently provide three different providers for EnOS: one based on Grid'5000, another based on Chameleon, and finally one based on Vagrant. So, at the end of an EnOS deploy, you end up with a fully running OpenStack that follows your resource description.
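To make this more concrete, here is a rough sketch of what such a resource description with network constraints could look like. This is illustrative only: the exact key names and file layout of the real EnOS configuration may differ, and the cluster names and values below are made up for the example.

```python
# Illustrative sketch of an EnOS-style resource description (the key names are
# assumptions, not the exact EnOS schema). It describes one control node and
# one network node on cluster A, 50 compute nodes on cluster B, plus WAN-like
# constraints (delay, rate, loss) between the two groups.
import yaml  # pip install pyyaml

description = {
    "resources": {
        "cluster-a": {"control": 1, "network": 1},
        "cluster-b": {"compute": 50},
    },
    "network_constraints": [
        {
            "src": "cluster-a",
            "dst": "cluster-b",
            "delay": "50ms",   # one-way delay added with netem
            "rate": "1gbit",   # bandwidth cap on the emulated WAN link
            "loss": "0%",      # packet loss, raised in later experiments
        }
    ],
}

# Dump the description to a YAML file, the kind of format such tools read.
with open("reservation.yaml", "w") as f:
    yaml.safe_dump(description, f, default_flow_style=False)
print(yaml.safe_dump(description, default_flow_style=False))
```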
And now you want to run some tests on it. Inside EnOS we integrate two kinds of benchmarks: Rally benchmarks for the control plane and Shaker benchmarks for the data plane. And while the benchmarks are running, we are also monitoring everything: we collect CPU, RAM and network consumption per service, per node and per cluster. So, at the end of an EnOS bench, you can do an EnOS backup and you will get all the Rally and Shaker reports, but also the OSProfiler output that gives you a trace of the execution of OpenStack, and all the metrics we collected, so that you can get them back into Grafana. Please go to Read the Docs to find the EnOS project.

So, thanks for this overview of EnOS. As Ronan mentioned, everything is online, so you can go, you can test it, and you can obviously give remarks and feedback. As I said, the purpose of this talk is to illustrate how you can perform complex performance evaluations of OpenStack using EnOS. So, once again, keep in mind this picture. The scenario we investigate in detail is the following one, the vCPE one: all the control plane is deployed in one central data center and then, remotely, you deploy the compute nodes. Obviously, the RabbitMQ bus is global to the whole infrastructure. What are the pros? It's simple; I don't know if you can make it any simpler. Obviously, you have some cons: security management for the RPC messages, scalability issues, a single point of failure. But this is not the objective of this talk; the objective of this talk is really to address performance analysis. Regarding the scalability angle, we had a presentation during the last summit in Barcelona with folks from Mirantis. In this talk, we look at the impact of network latency and throughput on the functional behavior of OpenStack. So, once again, everything is deployed in the first DC, Neutron, Keystone, Nova, Glance, and remotely you only have compute nodes.

So, we used two testbeds, the Grid'5000 one and the Chameleon one. I will introduce the Grid'5000 one. Grid'5000 is a testbed for distributed computing, which is quite famous in the academic world. It is used by computer science researchers in HPC, in cloud, in big data, and also in networking. Basically, the idea is that you get raw resources; you could call it a real hardware-as-a-service infrastructure. You really get bare-metal machines, and you can do what you want with them, so you can perform any kind of experiment in a controlled and reproducible way.

And I'm going to present the Chameleon testbed. Similarly to Grid'5000, this is a testbed for distributed computing, cloud computing and so on, lots of fields in computer science. We are funded by the NSF, so we are in the U.S., and we built it with OpenStack software, as well as some software from Grid'5000. It's a testbed that can be reconfigured, so you can reserve nodes with the Blazar OpenStack project. And once you have those nodes, you can deploy an image on them and get bare-metal access, using the Ironic project. It's a large-scale testbed: we have more than 500 nodes, including some storage nodes. We also have 3.6 petabytes of global storage that is configured as an object store. We have lots of heterogeneous hardware in addition to the standard hardware, things like InfiniBand, some SSDs, GPUs, FPGAs, low-energy nodes like ARM and Atom, so we are really trying to cover a lot of use cases for interesting research. All this hardware is distributed over two sites.
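As an example of the kind of control-plane benchmark mentioned here, below is a minimal sketch of a Rally task for a Nova boot-and-delete scenario, written as a small Python script that emits the task file. The flavor and image names are placeholders, and the runner settings are arbitrary; the actual scenarios used in this work are the ones published in the repository mentioned later.

```python
# Minimal sketch of a Rally task for a Nova boot-and-delete scenario.
# "NovaServers.boot_and_delete_server" is a standard Rally scenario name; the
# flavor/image names and runner settings below are placeholders.
import json

task = {
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                "flavor": {"name": "m1.tiny"},   # placeholder flavor
                "image": {"name": "cirros"},     # placeholder image
            },
            "runner": {
                "type": "constant",  # run the scenario a fixed number of times
                "times": 10,
                "concurrency": 2,
            },
        }
    ]
}

with open("boot_and_delete.json", "w") as f:
    json.dump(task, f, indent=2)
# The task could then be launched with: rally task start boot_and_delete.json
```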
One of them is the Texas Advanced Computing Center in Austin, Texas, and the other one is the University of Chicago in Illinois. So far, we've been helping about 1,400 users in over 200 projects. If you want to hear more about this, we have a talk literally right after this one, 10 minutes later, in room 208.

So those two testbeds, Grid'5000 and Chameleon, are the ones we used to run the experiments for this presentation, and those experiments are fully automated: they are defined as software, and EnOS runs them automatically. We've got around 250 benchmarks which need about 100 hours to run, so that's not just a five-minute thing, that's quite extensive. And the runs on the two testbeds lead to the same conclusions and similar performance. You can access the scenarios that we've run from the GitHub URL on this slide, and you can also get the results. So now Ronan is going to explain the results from the first experiment.

Yeah, thank you. So I'm going to show you how we used EnOS to conduct the latency-impact experiment. As you can see here, there is the description of your resources. As Adrien said, we go with one node for the control services and 10 compute nodes, and then we set up latencies between these two groups, meaning that every communication that goes from the first group to the second group pays the cost of the latency. And I'm pretty sure you all know that latency has a strong impact on throughput. You can see here on the diagram the measurements we made by pinging some compute nodes from the first node with different latencies, and what you can see is that the throughput really goes down. So the question we are asking is: what is the impact on the behavior of OpenStack, at both the control plane and the data plane?

For the control plane first, we defined some Rally scenarios. Actually, these are all Nova scenarios that boot a VM and then do several things. And the result is that the latency has an impact on the completion time, and not a small one: if you compare LAN speed with a 2,000-millisecond round-trip time, you can see that the time to deploy a VM is twice the one at LAN speed. One thing I should mention here is that we did all the experiments in a hot-cache manner, meaning that images are already pulled from Glance and cached on the compute nodes. And here you can see that, as Pierre said, we ran the experiments both on Grid'5000 and on Chameleon and we get the same tendencies; actually, if you look at the completion times, you will see that they are quite similar.

The problem with Rally, when you do tests at the control-plane level, is that the completion time you get at the end only tells you how long it takes to complete your task; you cannot find where the bottleneck is concretely. So we also integrated an OpenStack project called OSProfiler. OSProfiler is a profiling library that provides execution times for all REST, RPC, Python, and DB calls. But if you look at the output of OSProfiler, it provides an HTML view, and this one is not so easy to work with: for a scenario such as a Nova "boot a server and add a security group" from Rally, you end up with 10.5k calls. So when you then want to find where your bottleneck is, it's really hard, because you have to dig through all these calls.
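The prototype described next tackles exactly this problem with query operators over the trace. Purely as an illustration of the idea (this is not the actual tool, and real OSProfiler traces have a richer schema than the made-up structure below), filtering a nested trace for the slowest calls might look roughly like this:

```python
# Toy illustration of querying an OSProfiler-like trace: walk the nested
# children, keep only calls longer than a threshold, and sort them.
# The trace structure below is simplified and hypothetical.

def walk(span):
    """Yield a span and all of its nested children."""
    yield span
    for child in span.get("children", []):
        yield from walk(child)

def slow_calls(trace, threshold_ms, kinds=("rpc", "wsgi", "db")):
    """Return calls of the given kinds that took longer than threshold_ms."""
    calls = [s for s in walk(trace)
             if s.get("kind") in kinds and s.get("duration_ms", 0) >= threshold_ms]
    return sorted(calls, key=lambda s: s["duration_ms"], reverse=True)

# A tiny made-up trace standing in for a 10.5k-call OSProfiler dump.
trace = {
    "name": "nova.boot_server", "kind": "wsgi", "duration_ms": 4200,
    "children": [
        {"name": "conductor.build_instances", "kind": "rpc", "duration_ms": 3900,
         "children": [
             {"name": "scheduler.select_destinations", "kind": "rpc",
              "duration_ms": 1200, "children": []},
             {"name": "db.instance_update", "kind": "db",
              "duration_ms": 80, "children": []},
         ]},
    ],
}

for call in slow_calls(trace, threshold_ms=1000):
    print(f"{call['name']:35s} {call['kind']:5s} {call['duration_ms']} ms")
```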
So we developed a prototype called OSPUtils. It's simply a set of operators that let you query your OSProfiler trace, apply some filters, and then reduce the trace size and easily find where your bottleneck is. At the same time, we use it to produce a sequence diagram of the execution, to really show all the interactions between the different OpenStack services and get the complete workflow. On the next slide, you can see the HTML view provided by OSProfiler: you can see here some RPC calls that are the ones responsible for most of the time during the execution of the Rally scenario, but these calls are buried among 10k other calls. And on the left, there is the sequence diagram we automatically generated with our library OSPUtils, where we used some filters to focus directly on the calls responsible for the bottleneck. Here is another view generated from our tools: the blue links are REST calls, the black ones are Python calls and the green ones are RPC calls. So the idea behind OSPUtils is that if you want to understand how things behave, how things are implemented in OpenStack, you can simply use this tool and you will automatically get a sequence diagram.

Now let's go back to the experimentation. We also looked at the latency impact on the data plane. We went with some Shaker benchmarks; actually we used two. The first one is called dense L3 and the second one is called full L3. These benchmarks run two VMs on two private networks and then generate some traffic between these two VMs. In the first case, the dense one, your two VMs are located on the same compute node, and in the full one, your VMs are on different compute nodes. But what you see is that every time a VM exchanges a packet with the other one, the packet has to go to Neutron at the control-services level and then come back to the compute node. It means that if you add some latency, you pay the cost of that latency every time.

OK, and so moving on to the second experiment, this is an extension of what Ronan just talked about. We saw that there was a high impact on the traffic between the VMs, and we can help with that by using DVR. DVR provides distributed virtual routing: instead of having all the layer-3 forwarding and NAT done by a centralized Neutron on the controller node, these mechanisms are moved to the compute nodes and distributed across all the resources. This means the traffic is no longer impacted by the delay between our first group and the second group. In EnOS, it's very easy: just add one key and value to the Kolla configuration and you get DVR.

So let's look at the results. On the left is the previous graph that we just showed, where there is a linear tendency: when you increase the latency, you get an increase in the communication time between the VMs. Now with DVR, on the right, you can see that the latency between the VMs is almost flat. There is some noise, but if you look at the actual numbers, they range between half a millisecond and four milliseconds. So there is still some noise, but these are very low values. And this is a critical change for wide area: without DVR it's not really feasible to communicate between those VMs, but with DVR it becomes just like a normal site where everything is in the same location. And now let's look at the third experiment, which takes wireless networks into account.
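In the wireless experiment that follows, packet loss is emulated in addition to delay. As mentioned earlier, this kind of constraint is applied under the hood with netem; below is a rough sketch of what such commands could look like. This is illustrative only, not the exact commands EnOS generates, and the interface name and values are placeholders.

```python
# Illustrative sketch of applying WAN-like constraints with tc/netem, the
# mechanism EnOS relies on under the hood. The interface name and values are
# placeholders; running this for real requires root privileges.
import subprocess

def apply_constraints(iface, delay_ms, loss_pct, rate="1gbit"):
    """Attach a netem qdisc adding delay, loss and a rate limit on iface."""
    cmd = [
        "tc", "qdisc", "replace", "dev", iface, "root", "netem",
        "delay", f"{delay_ms}ms",
        "loss", f"{loss_pct}%",
        "rate", rate,
    ]
    subprocess.run(cmd, check=True)

def clear_constraints(iface):
    """Remove any qdisc previously attached to iface."""
    subprocess.run(["tc", "qdisc", "del", "dev", iface, "root"], check=False)

if __name__ == "__main__":
    # Sweep the loss rates used in the experiment while keeping a fixed delay.
    for loss in [0, 1, 5, 10, 25]:
        apply_constraints("eth0", delay_ms=50, loss_pct=loss)
        # ... run the benchmark here, then clean up ...
        clear_constraints("eth0")
```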
So instead of having wired links between the main site and all the vCPEs, we're using wireless links. And with wireless, in addition to the latency, you have to take into account the loss of packets. Again, this is something that EnOS can experiment with. So we configured EnOS to go through a range of loss rates from 0% to 25%, in addition to the delay, so you get two dimensions of experiments. And similarly to the delay, this has an impact on the throughput you can expect. On the graph on the bottom right, you can see five different lines, and each one of those is a different loss rate. Even just increasing the loss from 0% to 1%, you go down from almost 10 gigabits per second to around 6 gigabits per second, with just a 1% loss rate. And when you go to 10% or 25%, you get basically no traffic at all. So: very high impact.

So what does this mean for our benchmarks? In this context we're running a Rally boot-and-delete scenario, and what we can see is that the more we increase the loss rate, the more time it takes for the scenario to complete. Going from a rate of 0% to a rate of 25%, we have about a ten-fold increase in the time it takes to run this scenario. And some of those runs actually time out: there is a timeout in Rally of 300 seconds, and about 25% of the experiments reached that timeout without finishing the operations. Possible solutions for that: we can tune some parameters at the operating-system level, like the TCP buffers, but also at the OpenStack level, like increasing the number of retries or making the timeouts longer for oslo messaging, and that could help in high-loss, high-delay scenarios. So here, EnOS is a very interesting tool, because if you know what loss and what delay you can expect in your deployment, then you can emulate that in your lab and tune the system until you have the right settings, and then just deploy that in your production environment. And Adrien is now going to take over.

Thanks. So, to be honest, when we planned this presentation, our goal was to present conclusions. But as every time you start to make experiments, things turn out to be harder, and right now we are still investigating the results we gathered in more detail. I will give you a concrete example regarding the RabbitMQ messages. As you probably already know, there are different kinds of messages. We have the REST API calls that Ronan showed on the sequence diagram, in blue, and we have all these RPC calls in green. And what we discovered is that a lot of time is actually spent in these RPC calls. You have two kinds of RPC calls: you have the RPC call, where you want to reach a particular machine directly, and you have the RPC cast. And I will focus on the RPC cast. Basically, with the RPC cast, conceptually speaking, the idea is that you send your message in an asynchronous manner, and then the broker is in charge of multicasting this information to the different nodes that subscribe to this topic. But if you dive into the details, you will see that it's only half asynchronous, in the sense that you actually need to wait for the ack, the acknowledgement from the broker, before going on with your code. So what does that mean? It means that depending on where you place the broker, you will face the latency issue or not. Let's consider that the request comes from the client, that is, the remote node, and it wants to make this kind of RPC cast.
So it expects to do that in a few milliseconds, but actually, because you have to wait for the acknowledgement, you will be impacted by the latency. That's what you can see on the top left: my client pushes a message to the broker and waits for the ack, so here it pays the penalty of the latency, and then the broker sends the message to the server. Now, if your broker is located on the client's side, the RPC cast returns almost immediately. So that's really important, because if you dive into the details and you measure the time it takes to perform such RPC casts, the latency has a big impact. Our goal now is to characterize all the messages exchanged between the different core services of OpenStack, in order to be able to solve what we call the placement challenge. We have several core services in OpenStack. Obviously the broker is one service that faces such a latency issue, but you should also think about Glance, about Cinder: where should you locate these different components?

So what is the take-home message? As I mentioned, we are still consolidating the results we gathered. The message we want to give today is that if you are interested in performing scientific and rigorous performance evaluations, performance analyses of OpenStack, we developed EnOS. The goal of EnOS is really to evaluate a real deployment of OpenStack, once again not DevStack but a real deployment of OpenStack, and to see how it behaves from the performance perspective. The other thing we identified is that conducting all these experiments takes a lot of time and a lot of engineering effort, and automation is critical: you need to be able to perform your experiments in a software-defined manner. That's why we developed EnOS with these different description languages, in order to automate the three steps of the workflow: the deployment, the execution of the benchmarks, and the collection of the different metrics. As we also highlighted, this has been done following open-science principles, so everything is available: the scripts are on GitHub, and you can redo everything. You can just take the scripts, and if the provider is already available, for example if you run an OpenStack infrastructure, you can take the scripts directly and rerun them on your OpenStack infrastructure. If not, you can implement your own provider; in the worst case, that is about 500 lines of code.

A few points regarding the current conclusions about the WAN-wide deployment. On the control plane, at least for the Rally scenarios we performed, OpenStack still behaves correctly; we definitely need to perform more complex scenarios, involving Glance, Cinder and so on. On the data plane, what we discovered is that you should be aware of key components such as Neutron and know how to configure them. Typically, if you don't know about DVR and you do not configure DVR, it seems to make no sense to use OpenStack WAN-wide. And because we are not cloud operators, because we are researchers, we are pretty sure we missed something in the experimental protocol. So please come and tell us what we missed and how we can improve the experimental protocol. Actually, this would be really good for us, because it would be straightforward to add all these benchmarks directly into EnOS, since this is the goal of EnOS.
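Coming back to the RPC-cast behaviour described a moment ago, here is a toy back-of-the-envelope model of the placement argument. This is not oslo.messaging code, just an illustration under the stated assumption: the publishing side blocks until the broker acknowledges the message, so the caller-perceived cast time is roughly one round trip to the broker, while the broker-to-consumer leg happens asynchronously afterwards.

```python
# Toy model of the latency perceived by an RPC cast, assuming the publisher
# blocks on the broker's acknowledgement (one RTT to the broker) and the
# broker-to-consumer delivery happens asynchronously afterwards.

def cast_client_latency_ms(rtt_client_broker_ms):
    """Time the caller waits: publish plus wait for the broker's ack."""
    return rtt_client_broker_ms

def cast_end_to_end_ms(rtt_client_broker_ms, one_way_broker_server_ms):
    """Time until the consumer actually receives the message."""
    return rtt_client_broker_ms + one_way_broker_server_ms

wan_one_way = 50  # ms of one-way delay between the remote site and the central DC

# Broker in the central data center, caller on the remote compute node:
print("broker in DC:      client waits",
      cast_client_latency_ms(2 * wan_one_way), "ms")

# Broker co-located with the caller on the remote site (local RTT ~1 ms):
print("broker co-located: client waits",
      cast_client_latency_ms(1), "ms, end-to-end",
      cast_end_to_end_ms(1, wan_one_way), "ms")
```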
So maybe Rally is not the right benchmark, and maybe Shaker is not the right benchmark either. We are already in touch with some Red Hat folks to build a benchmark dedicated to the AMQP bus. But that's the idea: you can come with your own benchmark, we can plug it in, and you can run it.

So, what next? As Pierre said, we have another session, the Chameleon one, just after this talk. Then we have the working-group face-to-face meeting, so if you are interested in all these questions about fog, edge and massively distributed cloud computing, I encourage you to come. The focus we plan for the next cycle: the first item is the AMQP alternatives, so we want to investigate what could be the right bus solution to cope with the fog and edge requirements. And we also want to address the placement challenges I just discussed. So, thanks for your attention, and if you have questions...

So it looked like you were testing network performance of virtual machine deployments across OpenStack. Do you plan to, or do you expect, any difference with containerized deployments, like deploying containers onto OpenStack?

I think it's a good question, in the sense that I don't know the answer. But I mean, if you want to do that, you can do that. The real question is whether there is any benchmark dedicated to that. If there is one, you just have to plug it in and you can run your experiments. So I don't know. We can guess that maybe the folks who made Shaker will probably provide some extension to evaluate container solutions as well. To be honest, I don't know, because those are experiments we are performing from the research point of view. For example, if you consider a lot of containers, let me give you a concrete example: when you start a VM, it takes, let's say, 20 seconds. If you start the same VM under contention, it can take minutes to start. What we discovered is that if you start a container, you expect it to start in a few seconds, but similarly, if there is a lot of contention on the node, the boot time for the container can last maybe one minute. So I don't know about the network, because the question is how the network performance is shared between different containers. So that can be a good question, but... Thank you. You're welcome.

Any other questions? So I think we are done. Thank you, guys, and enjoy the remaining part of the summit. Thank you.