Hi, hello. Let's get started. I'm Luca San Rega, QE engineer at Midokura in Barcelona. I'm here today to present MidoNet scalability testing with Neutron and open source tools, together with my friend and former colleague, Dani.

Yeah, my name's Daniel. I'm currently working as a software engineer at Red Hat in the QE department. Basically I'm going to introduce, just in case you don't know after all the QE sessions we've been having during the summit, what Tempest is and so forth. But let's go through it.

Okay, so the agenda for today. We briefly introduce MidoNet: what is MidoNet, and the problem statement of scalability testing in SDN. Then we will show the MidoNet acceptance cycle at Midokura. And then we'll be speaking about Tempest a little bit: how it works in the OpenStack CI, how it interacts with Zuul and every component, and how to scale it. And finally we'll speak roughly about a real use case where we found a bug using those tools.

So what is MidoNet? MidoNet is a network virtualization product, and it's open source. So what is network virtualization? Network virtualization is about decoupling the network infrastructure from the physical layer, so that the network functions that were previously done with physical appliances are now done in software. You can think of layer 2 switching, layer 3 routing, load balancing, firewalls.

So why do we need network virtualization at all? Network virtualization is here to fill the gap between host virtualization and networking. You can think of it as networking as a service. So what are the benefits? Network virtualization gives you flexibility and lets you scale: you don't have the limitations of the underlying physical network. You can now create routers, switches, everything in software, without limitation. In the case of MidoNet, it's also distributed, so you don't have single points of failure. And it's cloud-friendly, as it's completely programmable and offers a pluggable API to plug either VMs or Docker containers into the virtual network.

In this slide we can see the underlay physical network, formed by a number of BGP gateways and the compute nodes running the VMs or Docker containers. What happens is that when a packet egresses a compute node or a gateway node, MidoNet puts this packet into the overlay network and performs what we call a simulation. That means the packet traverses the virtual topology that has been created, so that MidoNet decides where it has to be sent: if it's a south-north packet, it will be sent to a BGP gateway and then to the exterior; if it's east-west traffic, it will be sent to another compute node.

Here we have the reference architecture for MidoNet and OpenStack. We have a number of gateway nodes that connect our OpenStack deployment to the exterior, and a number of compute nodes that are the same computes as in any OpenStack deployment, with the difference that instead of the Neutron OVS agent we have the MidoNet agent, which tells the OVS kernel module how to treat the packets on the network. The controller, again, is the same node as in an OpenStack deployment; there we run the Neutron server and the MidoNet cluster, which is the API endpoint for MidoNet.
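To make the simulation decision concrete, here is a toy sketch in Python of the east-west versus south-north choice just described. This is only an illustration of the idea, not MidoNet code; the data structures and names are made up for the example.

```python
import ipaddress

def route_egress_packet(dst_ip, overlay_networks, gateways):
    """Toy version of the egress 'simulation' decision: east-west traffic
    is tunneled to the compute node hosting the destination, south-north
    traffic is handed to a BGP gateway and then to the exterior."""
    dst = ipaddress.ip_address(dst_ip)
    for cidr, compute_host in overlay_networks.items():
        if dst in ipaddress.ip_network(cidr):
            return ("tunnel-to-compute", compute_host)   # east-west
    return ("forward-to-gateway", gateways[0])           # south-north

# Two hypothetical overlay subnets pinned to compute nodes, one BGP gateway:
nets = {"10.0.1.0/24": "compute-1", "10.0.2.0/24": "compute-2"}
print(route_egress_packet("10.0.2.14", nets, ["gateway-1"]))  # east-west
print(route_egress_packet("8.8.8.8", nets, ["gateway-1"]))    # south-north
```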
And the way MidoNet is integrated with Neutron is through our network plug-in, the MidoNet network plug-in, which is responsible for translating Neutron calls into MidoNet API calls.

So what's the problem statement of SDN scale testing? There are a number of intrinsic challenges. The first one is that SDN is distributed, which means we need probe points in many different places, and that makes it very difficult. Also, although the overlay is decoupled from the underlying infrastructure, any failure on the underlying network will affect the overlay too. And, of course, we now have fast-changing environments with unpredictable traffic patterns: since network functions are no longer mapped onto physical devices but live in software, there are all kinds of new patterns that make it hard to pinpoint specific points of failure or to troubleshoot them.

On the other hand, we have a number of implementation challenges. We cannot have a dedicated testbed for scale testing an SDN, because of the cost of the hardware, the deployment, and the maintenance. So we need some kind of tool that allows us to emulate scaling the underlay.

Simplifying, the dimensions of scale testing can be reduced to three. Traditionally, network testing has only been done in one dimension at a time, and that leads to many problems when going into production, as the relationships between these three axes are not always as expected. The three dimensions of SDN scale testing are these. First, scale the overlay: we can increase the API stress, and we can increase the size of the virtual network topology, finding the maximum number of routers, load balancers, and networks that a virtual topology can support. In addition to this dimension, we can start adding workloads: how do these virtual topologies behave when you increase the workload on the data plane and start adding new flows? And the last dimension is scaling the underlay, the physical network. It's simply unfeasible to scale a physical underlay, so we need a tool that allows us to scale an emulated underlay. On the Y and Z axes we will use Rally, which allows us to scale the control-plane operations and add workloads at the same time; and for scaling the underlay we will use our own tool, MidoNet Sandbox, which is a wrapper around Docker.

This is the MidoNet acceptance cycle we follow for an open source release. Once the release candidate is generated, we have three parallel deployments. The first is the scalability lab, which is the one I'm presenting today: we use the Sandbox, based on Docker, and then Rally for benchmarking. The second is a certification cloud, a bare-metal cloud, to do performance benchmarking on real hardware. And the last is an automated deployment that uses Vagrant, where we perform the functional testing with Tempest. If everything's fine, we make the release.

Okay, so I just want to introduce, in case you don't know, what Tempest is. As you know, OpenStack is a really big project, and everybody knows what Nova is, what Neutron is, what Cinder is, what Glance is. But maybe you are not aware of Tempest, unless you are testing a cloud or making commits. Tempest is the integration testing framework for OpenStack, built on top of Python's unittest framework.
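To give a feel for what a Tempest test looks like, here is a minimal sketch of a black-box API test against Neutron. It is written against Tempest's network base class and client helpers, but it is an approximation for illustration, not a test taken from the Tempest tree.

```python
# Minimal sketch of a Tempest-style black-box API test (illustrative, not
# from the Tempest tree). It only talks to the public Neutron API, which is
# why the same test can run against a single DevStack or a production cloud.
from tempest.api.network import base
from tempest.lib.common.utils import data_utils


class NetworksSmokeTest(base.BaseNetworkTest):

    def test_create_show_delete_network(self):
        name = data_utils.rand_name('smoke-net')
        network = self.networks_client.create_network(name=name)['network']
        # Clean up the resource even if the assertion below fails.
        self.addCleanup(self.networks_client.delete_network, network['id'])
        shown = self.networks_client.show_network(network['id'])['network']
        self.assertEqual(name, shown['name'])
```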
And the idea is that it does black-box testing, so it can be used against anything from a single DevStack up to production clouds. If you have ever made any kind of commit to OpenStack, then even if you don't know it, you have already used Tempest, because it is run by Zuul for every OpenStack commit. In fact, it's even run several times, because a commit passes through many gates.

The services covered by Tempest include Ironic, Nova, Sahara, Trove, Keystone, Glance, and so on. Tempest used to be monolithic: everything was integrated within the Tempest repo. The Tempest folks are great and know a lot about OpenStack, but they can't really cover everything about every OpenStack project. So lately the community has been splitting the workload, and we are working on the Tempest plug-in interface, which is basically a Python package containing the Tempest tests for each other project.

One big example of this is Neutron. Neutron has a ton of sub-projects, and I think it's one of the most complicated projects within OpenStack: you have Neutron, Neutron LBaaS, VPN-as-a-service, Firewall-as-a-service, and even within Neutron things have been split up. So the solution within Tempest is to split those into their own Tempest plug-ins; then, using testr, which is the test runner, everything is discovered in a seamless way from your testing node.

How can we do that? Well, there's a cookiecutter which allows you to create such a plug-in. I don't know if you can see this clearly, but the idea is quite simple. Basically you need to add an entry point, which is the way Python discovers new packages, and that gives you the skeleton for a Tempest plug-in. You have a plugin.py which handles everything, and then a tests folder with the API tests and the scenario tests.

Those scenario tests, just in case you are not aware of them, as this is a beginner talk: those are the tests that cover several projects at once. Say, for instance, you want to spawn a VM, then you want your VM to have a port, then you want that port to do something with another port, you want to test QoS, and so on. Those tests are a little bit more difficult, as they touch a ton of projects at the same time, so unless they are for a really specific service, they won't be migrated to a Tempest plug-in.

In order to help spread this load, we have published what we call a Tempest stable interface, which answers this: okay, now I've got my plug-in, how can I use Tempest within my library without having to import the whole of Tempest? So there is a library, originally released as tempest-lib, which has lately been reintegrated as tempest.lib, and it gives you most of the functionality of the original framework as a library you can use. Basically you can call the test runner, it uses subunit, and you have access to the unified REST client.
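Concretely, the plugin.py that the cookiecutter generates implements a small interface so that Tempest can discover the external tests. A sketch along the lines of the upstream plug-in interface, with a hypothetical package name:

```python
# my_service_tempest_plugin/plugin.py: sketch of a Tempest plug-in following
# the upstream plug-in interface; 'my_service' is a placeholder name.
import os

from tempest.test_discover import plugins


class MyServiceTempestPlugin(plugins.TempestPlugin):

    def load_tests(self):
        # Tell Tempest where this plug-in's tests live, so testr can
        # discover them seamlessly alongside Tempest's own tests.
        base_path = os.path.split(os.path.dirname(
            os.path.abspath(__file__)))[0]
        test_dir = "my_service_tempest_plugin/tests"
        return os.path.join(base_path, test_dir), base_path

    def register_opts(self, conf):
        # Register any plug-in specific tempest.conf options here.
        pass

    def get_opt_lists(self):
        return []

# The entry point that lets Python (and therefore Tempest) discover the
# plug-in is declared in setup.cfg:
#
#   [entry_points]
#   tempest.test_plugins =
#       my_service = my_service_tempest_plugin.plugin:MyServiceTempestPlugin
```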
So let me stress this a little bit more, because I think it's interesting for every project contributor: please, if you are developing Tempest plug-ins for your project, start using this. We have seen people trying to reinvent the wheel, or importing the whole project, or trying to add everything back into Tempest. This library was released as a way to ease that pain, and I think it's a great effort.

Here again, as I said before, this comes from the developer guide. If you look at the Zuul and Jenkins workflow once you submit your commit, it passes the Tempest tests even twice. The first time is at review: the commit goes through Zuul, which, in case you don't know it either, is a gating system for automated testing, and it runs a ton of Jenkins jobs, PEP8 validation, then Tempest, Tempest with Neutron, and so on. We have a ton of gates for this; I didn't count them, but you could say there are more than 20. Once it passes those, and you have your core approvals plus the +1 from Jenkins, before it gets merged it has to go through another set of Jenkins jobs. So you can imagine how important this framework is for OpenStack itself.

Okay, so now moving back to the MidoNet scalability lab. This is the emulated underlay deployment we use for performing the scalability tests. Here you can see a number of elements, all of which are Docker containers. The scalability lab uses MidoNet Sandbox, which is a wrapper around Docker: you are able to define flavors, which describe these components, then you build the images and run them, and you get a complete deployment of what we are interested in. We are interested in testing MidoNet with Neutron at scale, so we don't want to use Nova; that's why we have those network namespaces there. So here we see the minimal deployment, and what we can achieve with MidoNet Sandbox is scaling the compute nodes with Docker Compose; right now we are scaling to 30 computes per server.

What makes this interesting is that we can use Rally to run the benchmarks against this deployment while monitoring with Grafana, which is the dashboard; InfluxDB, the time-series database; and two collectors, Telegraf and jmxtrans, which collect time-series data and put it into InfluxDB.

So right now we are getting the benefits of these two main tools. What are those benefits? MidoNet Sandbox can easily deploy and reproduce these deployments and scale an emulated underlay on demand. It's really fast, as you can rebuild the whole environment in just seconds, and it's easy to scale. It allows you to version the different components: starting from a base version, you can create a number of versions very easily, and you can use it in CI, in Jenkins, so deployments can be automated and you can run them nightly, for every patch and for every different version. Rally, in turn, is the perfect framework for scalability benchmarks: there are already upstream Neutron tasks, it's open source, and it's easily customizable. So what we are doing is developing our own plugins, so that we can run our specific benchmarks against these environments.

So what are we achieving currently with our plugins and benchmarks? We are able to compare the overhead introduced by Neutron.
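As an example, a custom Rally scenario plugin in the style of the upstream Neutron tasks could look roughly like this. It is a sketch, not our actual plugin: the class, scenario name, and arguments are hypothetical, and it assumes Rally's plugin API of that era, where the NeutronScenario helpers wrap each call in a timed atomic action.

```python
# Sketch of a custom Rally scenario plugin (hypothetical names). It reuses
# helpers from Rally's upstream NeutronScenario, which time each call as an
# atomic action that then shows up in the task report.
from rally.plugins.openstack.scenarios.neutron import utils
from rally.task import scenario


class MidonetScale(utils.NeutronScenario):

    @scenario.configure(context={"cleanup": ["neutron"]},
                        name="MidonetScale.create_router_and_ports")
    def create_router_and_ports(self, network_create_args=None,
                                port_create_args=None, ports_per_iter=2):
        # Rally calls this body once per iteration; every helper call below
        # is measured separately and plotted against the iteration number.
        self._create_router({})
        network = self._create_network(network_create_args or {})
        for _ in range(ports_per_iter):
            self._create_port(network, port_create_args or {})
```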
So, as I said, Neutron calls are translated into MidoNet API calls, which means we can compare attacking the MidoNet API directly against going through Neutron: what's the overhead introduced by these two layers of the API? We are developing our own scenario tasks based on the upstream Neutron scenario tasks. And instead of running VMs (as we said, we are not interested in Nova because of the complexity it adds to our deployment), we use network namespaces as an emulation of VMs. With that, we can do plain control-plane tests, finding the upper limits for the Neutron objects, so we are testing on two dimensions; and then we add workloads to the tests, north-south and east-west traffic. Right now we are benchmarking latency north-south and east-west. Additionally, we are also running some Rally tasks on the upstream networking plugin.

And now I will show a bug we found running a simple use case of our network plugin. This is the Rally create-port task, which is a slight modification of the upstream Neutron create-and-delete task. It creates a router and a network; then, in each iteration, it attempts to create two ports and an IP, and it benchmarks the time of each port creation. Rally allows you to define the rate of this port creation: in this case we used one request per second, with 3,500 iterations. Another important thing Rally lets you define is an SLA: in this case we said it couldn't take more than five seconds per iteration, plus a failure rate, so the task will be aborted if there is any failure while running.

So now I will show you the Rally task report for this. Can you see it? Can you read it? Bigger? Even bigger? Okay now? Okay, so that's the Rally task report for the port creation. What we see here is that there was an SLA failure: for some reason, the execution was aborted. You can see here the trend of the creation time in the benchmark. Going into the details, we can look at the benchmark for the creation time: on the x-axis we have the iteration number, and on the y-axis the time of the second port-create operation. In this case, the second port creation didn't fail, so this is what the trend looks like when creating ports at scale in MidoNet. On the other side, we see that the first port-create operation failed at iteration one-thousand-something and triggered the SLA abort.

So what are the failure reports returned by Rally? The first error says the connection to Neutron failed, but then we see this "could not acquire a lock for a storage operation". We were wondering where this was coming from, so we decided to go to our Grafana dashboard. One second: maybe we should also explain what an SLA is, just in case people don't know. As I said, if the maximum time per iteration goes above a certain threshold, the task is aborted, and the same goes for the failure rate.
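For reference, before following the Grafana trail: Rally tasks are plain JSON or YAML files, and the runner and SLA just described would have roughly the following shape. This is a sketch shown as the equivalent Python structure, reusing the hypothetical scenario name from the earlier plugin sketch.

```python
# Rough shape of the create-port task: one request per second, 3,500
# iterations, abort if an iteration takes more than five seconds or if
# anything fails. Rally reads this as JSON/YAML; the scenario name is the
# hypothetical one from the earlier plugin sketch.
task = {
    "MidonetScale.create_router_and_ports": [{
        "args": {"ports_per_iter": 2},
        "runner": {
            "type": "rps",   # constant request rate
            "rps": 1,        # one request per second
            "times": 3500,   # total iterations
        },
        "sla": {
            "max_seconds_per_iteration": 5,
            "failure_rate": {"max": 0},   # abort on any failure
        },
    }]
}
```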
So we went to the Grafana dashboard, which monitors the ZooKeeper node count. I haven't explained that, in addition to the compute nodes, the controller, and the gateways, we use a couple of network state databases: ZooKeeper and Cassandra. ZooKeeper stores the topology, that is, the objects that are created in the virtual network, and Cassandra stores the network flow state.

So when we ran this create-port task and saw the failure, we went to the Grafana dashboard to see what was happening. You can see this increase represents the creation of port objects in ZooKeeper: at a rate of one per second, ports were being created as objects in the virtual topology until the failure occurred. At that point, nothing was added to ZooKeeper until the task was aborted by the Rally SLA and the cleanup of resources started. Another trend we can see on the dashboard is an offset in the remaining ZooKeeper node count after the cleanup, which means that access to ZooKeeper was somehow blocked for a while and the create and delete operations on ports could not be performed at the end. On the other side, the ZooKeeper data size follows the same trend as the node count, and the garbage-collection time gives us an idea of the memory consumption on the ZooKeeper nodes.

So what was the bug causing this failure? Unlike MySQL, ZooKeeper does not support transactions, so every operation needs to acquire a lock to perform the Neutron operations that are translated into MidoNet API calls. What happens is that these lock acquisitions time out, and so the SLA fails. And what was the solution? The solution consists of serializing access to the ZooKeeper lock inside the cluster, so that requests don't time out while acquiring the lock.

I don't know if we are fine on time. Yeah, we are. Conclusions and future work. We've explained that scaling one dimension is not enough in distributed systems, because of the unknown relationships between the different dimensions, and that we need emulated testbeds to test scalability in these agile environments. This allows us to catch, as we've seen, scalability bugs, and to compare trends between software versions and different patches. But of course, you cannot use an emulated testbed for performance benchmarking, precisely because it is emulated; we still need the real testbeds for that. As future work, we plan to scale the MidoNet Sandbox to a thousand computes with MidoNet, which will be done using Docker Swarm, and on the other side to add new Rally plugins so that we have better benchmarks. Here are some links. And thank you so much for attending, and see you in Barcelona in October. Also, if you have any questions or comments, or want to say something, now is the time.

I have a question. When you use Rally to perform a test, what if the HTTP response belongs to an async process? You get an HTTP 200 and you think this HTTP request is good, but the real work may be asynchronous: it goes to the agent, maybe over RPC, and the agent needs to do something that may fail. So the HTTP request succeeded, but the real work may have failed. How do you handle this scenario?

So it's true that the failure report from Rally doesn't tell you the source of the problem, just the symptom. That's what the dashboards and monitoring graphs are useful for. And of course you need to dig further: we had to go to the Neutron server logs, and
then to the MidoNet API logs, which gave more insight and better error logging, and with that information we could proceed. So I guess it depends on the type of error you get; you need to go to the logs in the end. Logs and monitoring: these are the two main tools you have for debugging.

Okay, thank you. That was an interesting setup, but I was wondering if it's possible to reuse any of your test framework for evaluating other SDN controllers, and perhaps doing comparison tests?

Sure, the MidoNet Sandbox is open source, so you can just go to the second link, the MidoNet Sandbox, and you have these flavors in a YAML file. You can add or remove components, Docker containers, and you can build the Dockerfile for any other component. So here, instead of MidoNet, you could create your own images, your own components, for whatever other controller. In fact, we also plan to perform some comparisons against OVN, so we will spawn an equivalent setup, but with OVN instead of MidoNet.

Yeah, I'm actually interested in OVN and some of the OpenDaylight...

We haven't done the configuration of these flavors with OVN yet, but we're after it. Right now you can deploy this just by getting the source code of the MidoNet Sandbox and running it; feel free to play with it. It should be a pretty technology-agnostic solution: all you have to do is fetch the code and configure it so that you use your solution instead of the MidoNet agent.

Thanks. Actually, I want to add one point. Speaking of OVN, there's an ongoing effort that follows a very similar methodology, using Docker containers to emulate. There's work from both eBay and IBM on that, and I think it's already committed into the OVN testing repository. But again, my question is: your MidoNet agent running on the compute is part of your control plane, right? Do you have some thoughts on how you're going to faithfully emulate the scalability of your whole control plane, including the MidoNet agent? You talked about 30 containers per host; is there some consideration from a performance point of view?

It's memory constraints. We've been watching how much we can put into a server with 128 GB of RAM and 24 cores. But yeah, it should be pretty easy, using multi-node Docker Swarm, to scale out to different servers. That's also future work.

So your MidoNet agent is running in a JVM, is it? Is the MidoNet agent a JVM-based application?

The other question is about the locking in ZooKeeper that you mentioned. What was the parallelism that you were testing?

The what?

The parallelism you were testing that led to the locking timeout.

It's access to the lock for writing, for creating a new object, because ZooKeeper stores the virtual topology, so every new port, every new router needs to be created in ZooKeeper. But as it doesn't support transactions the way MySQL does, for each of these operations you need to acquire a lock, perform the creation, then release the lock, and so on. When the load increases, as in the port-creation test where you create many ports, the queue for acquiring the lock grows, and at some point the timeout fires, so you get this error.

Yeah, I assume this type of error happens when you have more parallel requests for ports, right?

Exactly.

So what was the parallelism that you were testing?
Right now it's one request per second, but one create-port request implies many operations: it's not just creating the port, it's also assigning an IP and so on. So this create-port operation is not just one operation but several.

And those operations need to happen while the lock is held by this one single transaction?

Exactly.

Okay. So that's a problem.

So there are different approaches, but right now the approach is to have a queue in the cluster, so that the timeout doesn't occur when accessing ZooKeeper; instead, the API controls the access to ZooKeeper.

So with this serialized lock-acquisition mechanism you just talked about, how many ports per second can MidoNet sustain?

I haven't tried, because that was in the 5.0 version with Liberty, and we have since been on 9.1. I can't remember the number, but it increased like hundreds of times. Probably we would hit another type of error, but there was no longer any problem with ZooKeeper access. That's a good question; I don't know how to answer it right now.

Okay. Thank you.

You're welcome. Okay, so then thank you very much for attending.

Okay, I'll go ahead. One of your bullets actually said that you were doing data-plane testing. What are you using for traffic generation and monitoring, and is it distributed to all your containers?

Right now we are only testing ping latency, so that's our only real traffic, but we plan to generate real traffic with iperf and netperf.

Okay. Anything else? Then thank you for attending, and I hope you have enjoyed it. Thank you very much.