All right, good morning, everyone. My name is Sriram, and this is my colleague Chandan; our colleague Sharath unfortunately could not make it. We are here to talk about how you can build a healthy network for your cloud using Ceilometer, Nova, and Neutron. We will have time for Q&A at the end of the presentation; please use the mics so that the recording can capture the questions and our answers. In OpenStack, Nova has the responsibility of choosing an optimal host when creating instances. It does this by walking through all the hosts and selecting one based on various criteria. For example, if you have a compute requirement, Nova will check whether a particular host has sufficient CPU or RAM; if it doesn't, Nova will exclude that host when placing the instance. Similarly for storage: if a host doesn't have sufficient storage, Nova will ignore it when placing the workload. But when it comes to networking, things change a little. If a host doesn't have network connectivity, Nova doesn't look at that aspect and still considers the host for creating an instance. What this leads to is an instance that gets created on a host and ends up without network connectivity right at startup. In this talk, we are going to share a solution to this problem that makes instance creation and workload placement more network aware. We'll start with an overview of Nova's scheduler and then take you through how Neutron and Ceilometer work together with Nova to solve this problem. In OpenStack, Nova takes care of placing the VMs. It uses a scheduler and filters to figure out which hosts match a particular requirement. This can be customized by the end user by writing custom filters and listing them in the nova.conf file.
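As a rough illustration of what that configuration looks like, here is a hypothetical nova.conf excerpt (option names follow Nova releases of that era; the custom module path is made up for the example):

```ini
# Hypothetical nova.conf excerpt. The scheduler loads the filter classes
# listed in scheduler_available_filters and applies the ones named in
# scheduler_default_filters, in order, to every placement request.
[DEFAULT]
scheduler_default_filters = RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter
scheduler_available_filters = nova.scheduler.filters.all_filters
# A custom filter is appended the same way (module path is illustrative):
# scheduler_available_filters = myfilters.network_filter.NetworkAwareFilter
```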
Here's an example of the default filters that ship with OpenStack; this list is configured in nova.conf. I've taken an example of one such bundled filter, which looks for an exact disk-space match. The key function to look out for here is host_passes. When you write a custom filter, you have to implement that method; it returns a value back to Nova indicating whether the host is a good candidate for placing a workload or not. You can see in this example that it checks whether the disk space requested for the workload matches the disk space available on that particular host. There are various other kinds of filters: less-than, greater-than, availability checks, and so on. So let's now look at the challenges in a network that really impact workload placement. This is a sample diagram of a network topology that is very commonly deployed: compute nodes are connected to top-of-rack (ToR) switches, and these ToR switches are in turn connected to a core switch at the aggregation layer. You can see that there are multiple levels of network devices, so it's easy to see that there are multiple failure points. Network congestion, port status being up or down, bandwidth utilization: all of these factors can impact the quality of connectivity available to an instance. Nova can leverage the physical characteristics of a hypervisor, for example RAM and CPU as we discussed earlier, but today there is no way to leverage the characteristics of the physical network. This is where we are trying to bring in a solution: Neutron is leveraged in addition to Nova and Ceilometer, and the critical attributes of the physical network are taken into account to place a workload in an optimal fashion.
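A rough sketch of the filter pattern described above: in a real deployment the class would subclass nova.scheduler.filters.BaseHostFilter and be registered in nova.conf, but here host state and the request spec are modeled as plain objects so the host_passes logic stands on its own.

```python
# Minimal, self-contained sketch of a Nova-style scheduler filter.
# HostState and the spec dict are stand-ins for Nova's real objects.

class HostState:
    def __init__(self, free_disk_gb):
        self.free_disk_gb = free_disk_gb

class ExactDiskFilter:
    """Accept a host only if its free disk exactly matches the request."""

    def host_passes(self, host_state, spec):
        # spec is assumed to carry the flavor's requested root disk size
        requested_gb = spec["root_gb"]
        return host_state.free_disk_gb == requested_gb

f = ExactDiskFilter()
print(f.host_passes(HostState(free_disk_gb=40), {"root_gb": 40}))  # True
print(f.host_passes(HostState(free_disk_gb=80), {"root_gb": 40}))  # False
```

Other filter styles (less-than, greater-than) differ only in the comparison inside host_passes.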
So I'm going to hand it over to my colleague Chandan to walk you through the details of the solution. Hello everyone, I'm Chandan, and I'll take you through the solution that we have implemented. Okay, so the first part of the problem, to make Nova, or OpenStack as a whole, aware of network-related challenges, is to make sure that OpenStack knows about the physical network connectivity. To solve that problem, we have introduced an extension to Neutron; with the help of this extension we can store and retrieve physical network connectivity links in Neutron. As you can see on the slide, there are two ways of retrieving the information: one retrieves the whole set of topology and connectivity information in one shot, and the other queries one endpoint at a time and returns its links. Once this physical connectivity information is stored with Neutron, the next part of the problem comes into the picture: now that we know about the physical network, how do we actually capture the utilization, bandwidth, and other characteristics of the physical network and store them in OpenStack? This is where Ceilometer comes into the picture. Ceilometer is the default telemetry service in OpenStack; it provides metering, and you can store metering information about various physical and virtual resources in it. As with any other OpenStack service, it provides a REST-based API which can be used to store and retrieve metering information, and that suits our use case perfectly. The way Ceilometer stores information about any resource is that it associates a meter with the resource, then collects samples of the resource's utilization and stores them as a time series.
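The two retrieval modes of the topology extension can be sketched as a tiny in-memory store (class and method names here are illustrative, not the actual Neutron extension API):

```python
# Hypothetical model of the topology store behind the Neutron extension:
# it records physical links and supports the two query styles mentioned
# above (full topology in one shot vs. links for a single endpoint).

class PhysicalTopology:
    def __init__(self):
        self._links = []  # list of (endpoint_a, endpoint_b) tuples

    def add_link(self, endpoint_a, endpoint_b):
        self._links.append((endpoint_a, endpoint_b))

    def all_links(self):
        """Retrieve the whole connectivity set in one shot."""
        return list(self._links)

    def links_for(self, endpoint):
        """Query one endpoint at a time and get only its links."""
        return [link for link in self._links if endpoint in link]

topo = PhysicalTopology()
topo.add_link("compute-1:eth0", "tor-1:port-3")
topo.add_link("tor-1:uplink-1", "core-1:port-12")
print(topo.links_for("compute-1:eth0"))
```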
So going forward, we come back to the same image, which shows a typical OpenStack installation and how it connects to the various levels of switches: the ToR switches and the core switch. Let us now look at the various attributes of the physical network that will impact the connectivity of a VM placed on any of the compute nodes at the bottom. At the first level, all the compute nodes are connected to the ToR switches. So if I place a VM on any compute node, its connectivity actually depends on how much bandwidth is available on the access port, and on whether the access port link is even there, whether it is up or down. Similarly, the second set of network attributes that come into the picture is the uplink port from the ToR to the core switch: again, we have to look at the connectivity, congestion, and available bandwidth of the uplink port. Other characteristics inherent to a physical network, like network congestion and the various paths available to reach the core switch, will also impact this decision. These are the KPIs that we would like to collect about the network and provide to Nova so that it can make a good decision about where to place the VM. Okay, so now that we have talked about some of the characteristics of the network that we want to store in Ceilometer, we would like to present a simple use case where we capture one of the meters and store it in Ceilometer. If you look at this use case, you'll see that we have defined a very generic meter, switch port state, and the resource metadata is the field we use to capture the exact characteristics of the instance; in this case, the switch port details are passed as the resource metadata. Ceilometer is good at capturing samples for various resources and giving you an aggregate output for a resource over time.
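The shape of such a sample can be sketched as follows; in practice it would be posted to Ceilometer's REST API, and the meter name and metadata keys below are assumptions for the sketch:

```python
from datetime import datetime, timezone

def make_port_state_sample(switch_id, port_id, is_up):
    """Build a Ceilometer-style sample: a generic meter name, with
    resource_metadata pinning down the exact switch port it describes."""
    return {
        "counter_name": "switch.port.state",
        "counter_type": "gauge",               # point-in-time reading
        "counter_unit": "state",
        "counter_volume": 1 if is_up else 0,   # 0 means the port is down
        "resource_id": f"{switch_id}:{port_id}",
        "resource_metadata": {"switch": switch_id, "port": port_id},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

sample = make_port_state_sample("tor-1", "uplink-1", is_up=False)
print(sample["counter_volume"])  # 0
```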
But for our use case we need two kinds of information. The first part is the raw meters, which we use to filter out hosts. Take the example of a switch port that is down: we can immediately say that a compute node connected to this switch port should not be considered when scheduling a VM. This can be derived from the raw meters that we have collected. The other part is where we rank the hosts based on their network health. Suppose there are five hosts and the first one has the best network bandwidth available to it; we would obviously like to place the VM on that first host. That is where the second kind of information comes into the picture: we calculate a network health score, which we are calling NHS. And this is where we bring in another component, the network health monitor agent. The job of this agent is to go through all the individual meters that we have collected and kept in Ceilometer and create a composite health score that can easily be consumed by Nova for ranking compute nodes. So to recap the solution, we have three parts. The first part is making OpenStack aware of the physical network connectivity. The second part is collecting the various attributes of this physical network and storing them somewhere; that is where Ceilometer came into the picture. And the third part is the network health monitor agent, which aggregates all these individual meters into a composite health score that is consumed for ranking the various compute nodes. Bringing it all together, we have something called a network-aware filter. This is a filter that we are adding to the Nova scheduler, and it has two parts; you can see them in the diagram. One is the filter part, which we talked about, which filters out hosts based on raw meters: a port being up or down, zero bandwidth available, or overcommitted bandwidth on a port.
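One way the agent's aggregation might look is sketched below; the meter names and the weighting are assumptions made for the example, not the actual implementation:

```python
# Illustrative sketch of how a network health monitor agent could fold
# raw meters into a single Network Health Score (NHS) for one node.

def network_health_score(meters):
    """meters: dict of raw readings for one compute node's network path."""
    if meters["switch.port.state"] == 0:
        return 0.0  # down port: the node should be filtered out entirely
    # Score the remaining capacity on the access and uplink ports; the
    # shared uplink is weighted more heavily than the dedicated access
    # port (weights are arbitrary choices for this sketch).
    access_free = 1.0 - meters["access.port.utilization"]
    uplink_free = 1.0 - meters["uplink.port.utilization"]
    return round(0.4 * access_free + 0.6 * uplink_free, 3)

print(network_health_score({
    "switch.port.state": 1,
    "access.port.utilization": 0.25,
    "uplink.port.utilization": 0.50,
}))  # 0.6
```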
These are the kinds of attributes the filter part looks at. The second part is the weight part, which takes care of ranking the compute nodes based on their network health; this is where the network health score computed by the network health monitor agent comes in. On this slide, we show the various components, how they interact with each other, and how the flow of actions happens. In the first step, when the user adds physical network details to Neutron through the physical network APIs, it triggers a periodic action: the network health monitor extracts all the physical network information from Neutron and starts periodically collecting details about the network attributes for the physical links it has just received from Neutron. The second step comes into the picture when a user tries to schedule a VM. At this point, Nova queries the network health monitor, and the network health monitor looks at Ceilometer and comes back with a composite score, which is used to rank the various compute nodes. So this was a simple case of placing a VM on a compute node based on the network health score. But we can extend this idea a little further and make a feature of network flavors available to the user. This gives the end user a knob he can tune to say, "this is the kind of network resources that I need for placing my VMs", and that can be used as a predicate when the scheduler goes through the various compute nodes, so it can check and give you a compute node which fits all your networking needs. Another use case is live migration: if you have a VM which requires a certain kind of network connectivity, live migration will also trigger this kind of network-based selection of compute nodes.
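The two-stage behavior of the network-aware filter, drop hosts on raw meters, then rank the survivors by NHS, can be sketched like this (all data and field names are illustrative):

```python
# Sketch of the filter + weight pipeline described above.

def schedule(hosts):
    """hosts: dict of host_name -> {'port_up': bool, 'nhs': float}."""
    # Filter part: reject hosts whose uplink port is down or that have
    # effectively no network capacity left (NHS of zero).
    candidates = {h: m for h, m in hosts.items()
                  if m["port_up"] and m["nhs"] > 0}
    if not candidates:
        return None  # maps to Nova's "No valid host was found"
    # Weight part: rank the remaining hosts by NHS, best first.
    return max(candidates, key=lambda h: candidates[h]["nhs"])

hosts = {
    "node-1": {"port_up": True,  "nhs": 0.4},
    "node-2": {"port_up": True,  "nhs": 0.9},
    "node-3": {"port_up": False, "nhs": 0.9},  # down port: filtered out
}
print(schedule(hosts))  # node-2
```

A network flavor would simply add more predicates to the filter step, e.g. a minimum NHS or a minimum guaranteed bandwidth requested by the user.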
So to go further, we have a simple demo covering two cases. For the demo we have taken a very simple setup: a single node acting as a compute node. We try to place a VM on it, and depending on whether the compute node's uplink connectivity is up or down, the VM is either placed on or rejected from that compute node. This slide shows the normal course of action, and the next slide shows what happens when the port is down: the scheduler rejects this compute node. I'll play a small video which captures this. The video starts with the normal workflow; note that the scheduler filter is already in place. This is the happy-case scenario where the uplink port is available and the VM placement goes through. I'll just speed it up a bit. You can see that the VM got placed and is in an active state now. Next, to simulate a network failure, we inject some data into Ceilometer to simulate that the uplink port is now down. If you look at the sample we create, it's a Boolean value, and it says the port is down: zero means down. Once we do that, because the filter is in place, when you try to create the VM the scheduler goes through all the compute nodes, and because we have only one of them here, it reaches a state where it says "No valid host was found", because your networking needs are not matched by any compute node. So this is basically what we wanted to present: making Nova and OpenStack aware of the networking requirements of a workload. That's all we had. If you have any questions, we're ready to answer them. Okay. All right, thank you folks. Thanks for coming.