Are we official? Are we ready to roll? Thanks for coming and joining us today in my home city of Boston to hear us talk about the continuous integration infrastructure we built for all the Microsoft components upstream in OpenStack. As I said, my name is Peter Pouliot, and this is my colleague and dear friend Octavian, aka Tavi. We'd like to start with a little history of how we got into CI, the path we took, the choices we made, and some of the problems we've seen along the way.

It started with a really small group of individuals who originally built the continuous integration infrastructure. We made a decision early on, because it was pretty much the model everybody was using, to run an undercloud of KVM to host our DevStack instances. Because of the nature of Hyper-V at the time, we didn't have the luxury of nesting, so from day one we've always had to have physical compute nodes, and because we supported shared-nothing live migration from day one, we had to require two of them. So we spent a significant amount of time early on automating the layers of configuration necessary to prep the target environments, specifically a lot of the Windows side, and we tried to use as much of the upstream automation as possible for configuring the OpenStack environment; early on that was Packstack.

To give you the big picture, we run one of the largest continuous integration infrastructures in all of OpenStack, and a lot of that is because of the physically demanding nature of our hardware requirements: we need a lot of hardware to satisfy the incoming requests. I started working at Microsoft in 2012, and halfway through the year we started amassing equipment to begin building the continuous integration infrastructure. There were originally three racks located across the river here in Cambridge, Massachusetts, and over the course of the next few years we continued to grow as we were building this, acquiring more hardware. The hardware we acquired came out of other engineering teams that were recycling equipment. As we grew, we went from three racks at the beginning of 2012 to an additional set of racks on the other side of Cambridge, about five blocks from where our first racks were. So, for those of you who are telecom geeks, we had to build a MAN, a metropolitan area network, across the city of Cambridge using dark fiber; I never thought I would ever use that terminology. Unfortunately, given the equipment we had at that time, we were running over trunked one-gigabit links across the city, and in some cases our test loads would have Hyper-V nodes five blocks away and KVM nodes in the building over here, communicating through a straw, as I like to think of it. Shortly after we expanded into that space, we were able to gain more equipment from other local Microsoft engineering teams and add a significant amount of capacity, specifically 120 1U nodes, and we lit that up shortly before one of the summits, I forget which, at the end of 2013. Okay, so fast forward a little bit.
As we kept adding capacity, we kept running out of networking. So we were fortunate enough to get some budget from Microsoft to build an entirely new network infrastructure for the continuous integration lab. The first time we rebuilt it, it was all Cisco gear, Cisco Nexus, and that gave us full 10-gigabit connectivity across the city, a much more stable platform, and enough port density going into each rack. Now we could segment the data plane, the test side of the fence, from the production services we used to manage and maintain the core infrastructure. That was a huge deal, because it simplified a lot of troubleshooting: instead of staring at a stream of packets going by and not being sure which was which, we now had a dedicated, isolated data plane that let the testing side of the fence operate in its own encapsulated area.

Fast forward a little more. After the network, we basically spent two years taking in large amounts of hardware, thanks to a local IT guy who liked me and liked to send pallets of equipment my way. Over the course of those two years we processed, racking, unracking, cabling, uncabling, pulling all the parts out, just shy of 2,000 servers. We got to the point where we had somewhere around 17 racks of equipment and probably close to 50 different server models in operation, starting from servers with four gigs of memory and going up to servers with 128. It was a very large spread of hardware types.

The problem we had from the get-go was never having enough tools or enough hardware to do the job. The hardware we had was substantially underqualified, let's say, for what we were doing. But the beauty of that is it forced us into survival mode, resiliency mode, do-whatever-it-takes-to-move-forward mode. I love to say that when we first stood up the continuous integration infrastructure, for a period of time we actually processed more CI votes than upstream Jenkins, and we did it on Hyper-V nodes that had four gigs of RAM. I always joke with these guys that it feels like the movie Rocky V: you're old, you're broken down, your body parts don't function, but when we process a job, you're gonna feel it.

Moving forward a little more: toward the end of last year we received another influx of hardware, this time some actually decent hardware coming out of the engineering teams in Redmond, high-density compute, high-density storage, and real infrastructure. So for three weeks the Cloudbase team and I sat in a data center here in Cambridge, Massachusetts, screaming at each other and fighting over how we were going to do this, assembling the next generation of our continuous integration infrastructure. What that means today is that we have roughly 76 to 78 nodes per rack of highly dense Quanta compute, each with around 96 gigs of RAM. The core of our undercloud is all SSD based, with 10 gigabits to the host and a 40-gigabit backplane. As you can imagine, this helped our CI processing substantially.
In fact, it dropped our test runs to roughly an hour or less: most runs now finish somewhere between 35 and 45 minutes, whereas before, a full Tempest run took roughly an hour and a half. I want to go through this to give you an idea of the level of hardware issues and everything else we had to deal with just to get off the ground. We spent a significant amount of time doing what I like to call the rack-and-stack workout, and when I say we, I really mean me, because these guys are located in Romania and I'm the only one sitting locally in the data center, so I got big and strong.

As we said before, when we started doing this, Hyper-V brought some significant challenges. Most people in the community did their test runs on a single DevStack instance running as a nested VM on somebody's cloud. Because we didn't have that luxury, and because of what was available to us at the time, we needed to be able to automate pretty much any Linux distribution, any Windows flavor, multiple hardware patterns, and all the supporting applications and processes. And because of the high demand for physical compute, the challenges just kept piling up. What we ran in the first generation was essentially a CentOS undercloud, as we said earlier, using KVM with Ubuntu as the DevStack VM for the overcloud, which we would then plug our Hyper-V compute nodes into.

Now, if we paint the landscape of what DevOps tooling looked like for Windows back in 2012 when we started this, Windows support was minimal at best in Chef and slightly better in Puppet. As a result of that, and the fact that Puppet supported some fifteen-odd platforms, I made the decision to start automating in Puppet, because it had the most Legos already available for me to build with. So we set out and created a basic framework pulling together all the pieces that were already there, to get to the point where we could put OpenStack on Hyper-V using the binaries already being produced by the community.

Once we got to that point, we needed to automate the rest of it. We started building a PXE infrastructure that allowed me to switch between different Linux distributions that had similar patterns. Because some days, unfortunately, when we first began, the Microsoft lab infrastructure wasn't always kind to me: the proxies we had to go through back then would block you from reaching some installation sources one day, and the next day they would work. So you had to be able to switch from CentOS to Scientific Linux to Fedora to Debian to Ubuntu just to see if you could get something to run.

To continue on the automation side: with such a large number of servers and such a high need to control the environment, we had to build a network infrastructure that we could control and that could provide static leases, if we wanted, for any network interface on any device in our network. So we built an ISC DHCP cluster that would bulk-load YAML describing every network interface, every subnet, and every piece of DNS information, and dynamically assemble, effectively, our IPAM infrastructure and deploy it.
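To make that concrete, here is a minimal sketch of the kind of bulk-loading step described above, assuming a hypothetical YAML inventory layout (host interface names mapping to a MAC and a fixed address); the real tooling is Puppet-based and handles subnets and DNS as well.

```python
#!/usr/bin/env python3
"""Render ISC DHCP host reservations from a YAML inventory (illustrative sketch)."""
import yaml  # PyYAML

# Hypothetical inventory format: one entry per network interface.
INVENTORY = """
hosts:
  hyperv-r12-n01-eth0: {mac: "00:25:90:aa:bb:01", ip: "10.20.12.101"}
  hyperv-r12-n02-eth0: {mac: "00:25:90:aa:bb:02", ip: "10.20.12.102"}
"""

STANZA = """host {name} {{
  hardware ethernet {mac};
  fixed-address {ip};
}}
"""

def render_dhcp_hosts(inventory_yaml: str) -> str:
    """Build dhcpd host stanzas for every interface in the inventory."""
    data = yaml.safe_load(inventory_yaml)
    return "\n".join(
        STANZA.format(name=name, mac=entry["mac"], ip=entry["ip"])
        for name, entry in sorted(data["hosts"].items())
    )

if __name__ == "__main__":
    # The generated block would be included from dhcpd.conf and the service reloaded.
    print(render_dhcp_hosts(INVENTORY))
```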
The good thing about this DHCP setup is that if one of the nodes goes down, we can redeploy it in less than seven minutes and the cluster keeps going with no service interruption. I can move it between operating systems and it doesn't care. Before we cleaned up some whitespace problems, we probably had close to 20,000 lines of hiera data that we fed into this system to create the network; today we're at just under 13,000 lines of hiera data in YAML that we pass into it.

Additionally, with so many network devices, and with the tooling we used to pre-build pieces before moving them in, we needed to be able to dynamically switch between PXE infrastructures across the data center. So we automated patterns in Jenkins jobs to swap VLANs and swap the DHCP relays in the switching infrastructure, dealing with a lot of expect scripting, just to be able to manage large amounts of hardware with very few people.
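As an illustration of that kind of expect-style switch automation, here is a minimal sketch using pexpect to change the access VLAN on a switch port over SSH; the hostname, credentials, prompts, and exact CLI syntax are hypothetical and vary by switch platform.

```python
#!/usr/bin/env python3
"""Swap the access VLAN on a switch port (illustrative pexpect sketch)."""
import pexpect

SWITCH = "rack12-tor.example.lab"   # hypothetical top-of-rack switch
USER, PASSWORD = "admin", "secret"  # would come from a credential store

def set_access_vlan(interface: str, vlan: int) -> None:
    """SSH to the switch and move one port to the given VLAN."""
    child = pexpect.spawn(f"ssh {USER}@{SWITCH}", timeout=30)
    child.expect("[Pp]assword:")
    child.sendline(PASSWORD)
    child.expect("#")
    for cmd in ("configure terminal",
                f"interface {interface}",
                f"switchport access vlan {vlan}",
                "end",
                "exit"):
        child.sendline(cmd)
        child.expect(["#", pexpect.EOF])

if __name__ == "__main__":
    # A parameterized Jenkins job would loop this over every port in a rack.
    set_access_vlan("Ethernet1/10", 2012)
```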
In the last year, looking at the legacy of all the automation we've written, we've generated a lot of Puppet code. So over the last year I've been going through what's there, what hasn't been re-created by somebody else since then, and what's still useful, because I want to put it back into the community, into the upstream Puppet Forge and so on. Today, if anybody's interested, you can look at my public GitHub repo; all that code is there.

From a CI contribution perspective, we have 11 CIs: Nova, Neutron, compute-hyperv, networking-hyperv, Cinder (which includes iSCSI, SMB for Windows, and SMB for Linux), Manila, os-win, and OVS. And as I said, we maintain a production CI facility with 700 servers, around 200-plus switches, and multiple storage devices. From a CI voting perspective, this slide shows the percentage of our participation in voting throughout the release cycles. As you can see, we handle a significant amount of the voting for the community, and that's a lot of test processing that you probably don't realize is actually running on Windows, so, yeah, it's good stuff. And here's a graph one of our colleagues created that shows some of our activity across the CIs.

Now I'm going to bring Tavi in to talk a little more about what we did in the new CI, because we changed a lot of the automation at that point. Tavi, if you wouldn't mind.

So, initially we focused on two CIs: we started with Nova and Neutron. This is the full list of CIs we actually run today. A couple of them, as you can see, are still under testing, so they are not fully reporting upstream, but all of them are actively running in our environment. Getting back to what Peter was mentioning, we had a major hardware upgrade in the CI in 2016, and together with that upgrade we decided to also upgrade the software infrastructure we rely on for testing. We moved to deploying all the bare metal components using MAAS, since we had also contributed MAAS support for Microsoft workloads and Windows Server, and for orchestration we use Juju.

The main reason we switched to Juju is that we can take advantage of the facilities it provides for detecting whether a component we require is already deployed, so it won't deploy it again. For instance, in any CI run we use one DevStack node and two Windows Server nodes, which need to be joined to an Active Directory. Whenever we do a run, if the Active Directory is already deployed, Juju will just register those nodes to it. If that Active Directory fails, or something happens to the machine running it and Juju cannot connect, it will automatically deploy a new Active Directory instance. That really helps us avoid a lot of CI failures that would otherwise happen for infrastructure reasons.

A few words about the solutions we deployed. Since we had better hardware, we also moved toward an HA approach. Our MAAS is deployed on two nodes with HAProxy between them, so if either node has issues, we fail over through HAProxy and a floating IP to the second one. Because we didn't have a very large amount of hardware, we didn't deploy it on three nodes, which would be the recommended way; we stuck with two. That means that if we fail over, bringing back the node that had issues is a manual step, so we have to intervene. But the CI keeps running even if one of those nodes fails, and we consider that a good enough trade-off at the moment given the hardware limitations. Juju itself is deployed in HA mode: we have three virtual machines for its controllers, so Juju is always fully HA.

As the number of CIs grew, our team grew as well, and we use different users for different CIs. To make sure users can't interfere with other groups' CI environments and servers, we created a jump box with a dedicated user for each CI. This slide is a short list of the models we have in Juju. Juju lets you define users and models and link particular models to a user. So, for instance, if the user ovs runs two of the CIs, the Neutron OVS CI and the OVS CI, those models are never accessible to anyone else: even as an admin you can list and see that a model exists, but you cannot query it and get information about what happens inside it. Only the user that owns the model can see what's happening inside it. This slide lists all the models, and it also shows, at that particular moment, a snapshot of how many machines and CPU cores were allocated to each CI. As you can see, most of them are in the infrastructure model, because the infrastructure model holds all the Hyper-V compute nodes, which are shared by most of our CIs.
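Since Juju's user and model separation is doing the multi-tenancy work here, a minimal sketch of how such a per-CI user and model might be created with the Juju 2.x CLI follows; the user and model names are hypothetical, and the real setup involves more than this (credentials, jump box accounts, and so on).

```python
#!/usr/bin/env python3
"""Create a per-CI Juju user and model and grant access (illustrative sketch)."""
import subprocess

def juju(*args: str) -> None:
    """Run a juju CLI command and fail loudly if it errors."""
    cmd = ("juju",) + args
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def create_ci_tenant(user: str, model: str) -> None:
    # One Juju user and one model per CI; only that user gets admin on the model.
    juju("add-user", user)
    juju("add-model", model)
    juju("grant", user, "admin", model)

if __name__ == "__main__":
    # Hypothetical names; the real environment has one user/model pair per CI.
    create_ci_tenant("ovs", "ovs-ci")
```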
Now let's get into the details of the components. The first and most important one is the Zuul model. In our case, since we have a large number of CIs, we wanted to make sure we wouldn't have a problem if one of the CIs went bad and started reporting a lot of errors upstream, or something happened and an account got suspended; we couldn't afford having all the CIs suspended because they were tied to the same account. So we created a Gerrit account for each and every CI.

This means that in Zuul we have to run a Zuul instance for each and every account. Zuul also has two other components: the Gearman service, which is the central unit of processing, and the mergers, which for every incoming request collect the upstream patches and prepare the git checkout needed to test the patch in question. We now run a layout where, on a single bare metal node, we deploy the Gearman on the machine itself and two mergers as containers, just to make sure that if one of them fails for whatever reason we can still run the CI, and we deploy one Zuul server instance for each account. We had to change the code we use: there is a Zuul charm available, but unfortunately the upstream Zuul charm does not allow this separation, where you run only a particular Zuul component in a container or on a node. So we extended it, and we are preparing that code to be contributed upstream as well; it's available on our GitHub at the link shown.

In the infrastructure model, we deploy the Jenkins component that runs all the jobs we need; it connects through the Gearman plugin to the Zuul Gearman. We have the log server, because of course we need to collect the results of all the tests and publish them back to the community. And in order to speed up the tests, as mentioned before, we reduced the total time from over an hour and a half to roughly between 35 and 45 minutes; part of that is in-house caches, both for the Ubuntu packages and for PyPI. For PyPI we use devpi in a two-node deployment, so we also get failover if one of them has an issue. We've noticed that devpi sometimes has the so-called feature of hanging, so from time to time one of the nodes blocks; that's why we always run two devpi nodes. Also in the infrastructure model we have all the Windows Server 2016 nodes, which form a pool from which we select pairs of two for every test run we do for Nova, Neutron, and the other components. These are shared and recycled: we do not redeploy those nodes for each run, we just clean them up and reuse them.

The undercloud model defines the OpenStack undercloud on which, even today on the new CI, we spin up the DevStack virtual machines that become the controllers for all the tests. So we have a DevStack virtual machine that uses flat networking to communicate with the bare metal Hyper-V nodes deployed and made available by the previous model. Most of the CIs, as I mentioned, use this undercloud/overcloud format, because we still need to be as dense as possible in our environment. There are a few CIs which do not allow this, but whenever possible we try to use virtual machines for the controllers. We do have a few limitations here, because we use the same flat network as the data plane for all the tests: we have to make sure that, for instance, the VLAN ranges used by each test are separate. So each run queries a database table through a script and reserves a particular VLAN range.
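A minimal sketch of that reservation step might look like the following, assuming a hypothetical single-table schema (vlan_ranges with a start, an end, and the run that currently owns it); the real script and database differ.

```python
#!/usr/bin/env python3
"""Reserve a free VLAN range for a CI run (illustrative sketch, hypothetical schema)."""
import sqlite3

def reserve_vlan_range(db_path: str, run_id: str):
    """Claim the first free VLAN range for this run and return (start, end)."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # autocommit; we manage the transaction
    try:
        cur = conn.cursor()
        cur.execute("BEGIN IMMEDIATE")  # write lock so concurrent runs cannot grab the same row
        # Table vlan_ranges(start INT, end INT, run_id TEXT); NULL run_id means free.
        cur.execute("SELECT start, end FROM vlan_ranges WHERE run_id IS NULL LIMIT 1")
        row = cur.fetchone()
        if row is None:
            cur.execute("ROLLBACK")
            raise RuntimeError("no free VLAN range; too many concurrent runs")
        start, end = row
        cur.execute("UPDATE vlan_ranges SET run_id = ? WHERE start = ?", (run_id, start))
        cur.execute("COMMIT")
        return start, end
    finally:
        conn.close()

# The DevStack run would then configure its tenant VLAN range from the reserved
# span and release the row again in its cleanup step.
```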
Otherwise you end up with multiple runs, and multiple DevStack instances, creating traffic that overlaps. And although each compute node theoretically has a direct connection to its own controller, that traffic will interfere, and you will see some of the tests fail because of it. So it's strongly recommended to always separate the VLAN ranges if you have multiple tests running in the same environment.

In the case of Cinder, we have another node, because we also test Cinder on Windows, so we need a third Windows node, which is also allocated from the same pool I mentioned before. The whole build process happens in parallel: after we allocate all the nodes and start the DevStack VM, we build in parallel on both DevStack and all the Windows nodes. Currently we're still looking for ways to optimize the DevStack build, because the Windows nodes, having fewer services and processes to build, finish in roughly 10 minutes, while the DevStack build takes around 20 to 25. One of the things we do is prepopulate the image from which we start DevStack with all the git repositories, and we refresh that image constantly, roughly once every two or three days, pulling the latest changes for all the required projects into those repositories.

Now we're getting to the other side. As I mentioned, most of the projects can take advantage of the undercloud/overcloud setup, but there are some where we unfortunately can't afford that luxury. All the projects that run OVS, and use OVS as an underlying layer, do not take advantage of it. The main reason is that it's very difficult, if not impossible, to ensure there are no leftovers after an OVS run. On Windows, for instance, if anything goes wrong and the OVS build is not correct, you can get a blue screen, or leftover components in kernel space, and you never know whether after cleanup the node is actually clean and ready to be reused, or whether things remain behind, like ports or other artifacts you couldn't clean up, which will affect the subsequent tests. So in these cases we always deploy from zero: we deploy the full operating system and all the packages used for testing on top of it.

To optimize the process here as well, we use a node pool. It's a modified version; I know the community has nodepool too, but in our case what it does is ensure that we have nodes registered in Juju with only the OS and the Juju agent deployed, and those nodes are always ready to be used by Juju whenever a new request comes in. That way we save the roughly ten minutes it would otherwise take to deploy the operating system when we need to run a test.

At the moment, the OVS CI is a CI that does not directly cover an OpenStack project; it covers the pure Open vSwitch contributions. And unfortunately, in this case we don't have the luxury of Gerrit and the rest of that tooling. They only use mailing lists, and besides the mailing lists they have only the repository, so we can just query the repository to find out when there are new commits.
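A minimal sketch of that kind of repository polling, assuming a local clone and a hypothetical trigger_job helper standing in for the Jenkins hand-off, might look like this; the real scripts also report results back to the mailing list.

```python
#!/usr/bin/env python3
"""Poll an upstream git repository and trigger a CI job per new commit (sketch)."""
import subprocess
import time

REPO = "/srv/mirrors/ovs"   # hypothetical local clone of the upstream repository
BRANCH = "origin/master"

def git(*args: str) -> str:
    out = subprocess.run(("git", "-C", REPO) + args,
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()

def new_commits(last_seen: str) -> list:
    """Fetch and list commits after last_seen, oldest first."""
    git("fetch", "origin")
    listing = git("rev-list", "--reverse", f"{last_seen}..{BRANCH}")
    return listing.split()

def trigger_job(commit: str) -> None:
    # Placeholder: the real system would enqueue a Jenkins job for this commit.
    print("triggering CI for", commit)

if __name__ == "__main__":
    last = git("rev-parse", BRANCH)
    while True:
        time.sleep(300)  # poll every five minutes
        for commit in new_commits(last):  # every commit in the interval gets a run
            trigger_job(commit)
            last = commit
```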
We are in discussions with the community about implementing GitHub hooks, so that our jobs get triggered from outside whenever a commit lands. For OVS we have two steps, because we need to make sure the whole system works. The first step is to run the unit tests on a single Windows node: we run the unit tests and build the OVS installer, and once the installer builds and all the unit tests pass, we move on to a full integration test, where we take the generated installer, deploy it on two Hyper-V nodes together with the Nova and Neutron components, deploy a DevStack VM, and use that to do a full integration Tempest run. Currently we report upstream through the mailing list only: we report the unit test runs with success or failure, and we are not reporting details at the moment. We are still evaluating the reliability of the CI, and we plan to enable reporting for the integration tests in the next couple of weeks.

Now, monitoring. The CI is nice and it runs okay, but we always need to know what's actually happening, and if there are errors, what the reason is. For monitoring we've chosen Zabbix. We monitor OS-level information, the OpenStack services on the undercloud, the status of the Hyper-V nodes, and the status of the networking. This is still work in progress; we had it implemented in the older CI, and we are now migrating all the components to the newly deployed environment on the Quanta gear. To identify which node is located where and which one has issues, we also use RackTables, and we integrate RackTables with Zabbix so that we can see from the RackTables interface whether a particular node has had issues.

I'll switch to a browser for a bit to show you some live information. Let me try to connect to the environment. Is it visible? How do I turn the display on, is it one of these function keys? Oh, right there. Duplicate. I guess it's a bit small; Control plus, plus, plus. Yeah, I would love to, but over here it doesn't work. Can you actually see what's written there? OK. To see the status of the infrastructure, we can use juju status, and -m selects the particular model. Before we start, will someone order Tavi an Uber, just in case we have to go and fix it? Sorry, he's running on production systems. Yep. Yeah, that's better.

So this is a view of the infrastructure model. It has most of the nodes. On top we can see the list of so-called applications, basically the types of charms that are deployed: Jenkins; the Jenkins slave, of which we have one per Hyper-V node; the logs component; the Ubuntu cache repository; the Active Directory; and the Hyper-V nodes. Here we have a full list of servers and components, and this is the list of actual machines on which those applications are deployed. At the end we have the list of relations. A relation establishes a link, for instance between Jenkins and the slaves: when a relation is set between a slave and the Jenkins master, that slave is registered in Jenkins. So the process actually has two stages.
Stage one is deploying the slave, and stage two is establishing the relation between the charm that represents the slave and the charm that represents the Jenkins master. We can also destroy that relation, and when we destroy it, all the slaves remain installed on the Hyper-V nodes, but they no longer show up in the Jenkins master. For some reason today I keep trying to type zuul instead of juju; I should go to bed earlier. And since I'm sharing the screen, I can no longer see the last line, which would be interesting.

This is the list of all the containers we run with all the Zuul processes we have. As I mentioned, Gearman is the first one; it's deployed directly on one machine, machine number one, and all the others are deployed as LXD containers on the same machine as the Gearman at the moment.

A nice view: this is the interface that MAAS provides, where you can see the status of all the nodes. And I can bet we even have a few failures over here; these are nodes stuck in releasing. Yes, as expected: since we know there are still a few bugs in the OVS CI, these are nodes that had some issues and could not be properly released after we ran some OVS integration tests. I was going to say, I know that rack. I know that rack.

One of the advantages of MAAS is that you can see a lot of information about every single node here. Whenever you add new hardware, you just plug it in, cable it, power it on, and it shows up in MAAS directly as a new node. You can run a so-called commissioning script that identifies all the hardware and gives you all the information about the node. I've already opened a couple of tabs; for instance, this is one node from the rack where we have all the control and data plane. We can see a small summary, any tags, and the power type, in this case IPMI, with the IP, user, and password used for connecting. Then the list of interfaces; as you can see, this is rack 12. This is one of the nodes dedicated to OVS testing, which is why it has such a large number of interfaces: one pair of 10-gig interfaces on a two-port card, which we use, plus another one, and the first four interfaces are one-gig interfaces that sit on the same card. We use that card and the 10-gig one to also test bonding, and how OVS behaves over bonding. From time to time, when we need to do performance tests, we can manually allocate one or two nodes from this rack: we stop the Jenkins slave on them and allocate them manually for performance testing for a certain period of time. So this interface gives us a lot of flexibility to bring nodes in and out of the running CI whenever we need to.

And just a quick look here: this is basically how our Jenkins looks with all the jobs. For instance, these are the Cinder jobs; they're running fairly stably at the moment. This one, as you can see, hadn't been running for a long time; it was still using Windows Server 2012 for the Cinder node. We switched to 2016, and since then the system is much more reliable. OK, getting back to the presentation, and of course now I have to find a way to do that. So, just a couple more things while we're still here.
One of the things we ended up doing, as a result of having so much hardware and getting used to the Jenkins model of processing CI jobs, is applying the same model to operational tasks across the data center. A perfect example: in the 19-plus racks of equipment we have, we use APC managed power PDUs, and when we got them we needed to both enable SSH on them and upgrade their firmware. So, through a series of parameterized Jenkins jobs, we were able to upgrade the whole data center at once in under two minutes, where the upgrade process for each device was essentially FTPing three separate files with a reboot in between, and we were able to do it. It's funny, because when I've been walking around and talking to other ops teams that just manage IT, I ask them, hey, so-and-so at big company, how long would it take your IT team to do that task? In most cases the answer is that it would take one guy a week or two to walk around the entire data center and do it manually. So one of the things I've personally learned from this process is that the tasks and mechanisms we use in continuous integration can make ops life a hell of a lot easier, but it requires some rethinking of how you script things, because from my perspective it's a lot easier to parameterize Jenkins jobs and pump a loop of curl commands through them than to encapsulate everything in Python or Bash or whatever.
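As a rough illustration of that kind of Jenkins-driven operational task, here is a minimal sketch of pushing firmware files to a batch of PDUs over FTP; the hostnames, credentials, and file names are hypothetical, and the reboot step is only a placeholder for whatever the real management card requires.

```python
#!/usr/bin/env python3
"""Push firmware files to a batch of managed PDUs over FTP (illustrative sketch)."""
from ftplib import FTP
import time

PDUS = ["pdu-r01.example.lab", "pdu-r02.example.lab"]   # hypothetical hostnames
USER, PASSWORD = "apc", "secret"                        # would come from Jenkins credentials
FIRMWARE_FILES = ["aos.bin", "app.bin", "bootmon.bin"]  # placeholder file names

def upload(host: str, filename: str) -> None:
    """Transfer one firmware file to the PDU's management card."""
    with FTP(host) as ftp, open(filename, "rb") as firmware:
        ftp.login(USER, PASSWORD)
        ftp.storbinary(f"STOR {filename}", firmware)

def reboot(host: str) -> None:
    # Placeholder: the real job reboots the management card between transfers.
    print("rebooting management card on", host)
    time.sleep(60)

if __name__ == "__main__":
    # A parameterized Jenkins job would fan this loop out across every rack at once.
    for host in PDUS:
        for firmware_file in FIRMWARE_FILES:
            upload(host, firmware_file)
            reboot(host)
```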
So, anyway, we'd like to open the floor for questions. Go ahead.

Essentially, you can think of it as two completely independent networks. We have one that's not connected to anything, and then the one we call management. For the one that's not connected to anything, the only way in and out is through our Neutron controller on the undercloud, so that's the one we use for isolation. Remember that a VLAN is an L2 isolation mechanism, right? So we use VLANs on the data plane network to create the 4,092 available network segments, then take batches of those segments and deploy identical environments across those ranges of VLANs, using the VLAN as the mechanism for isolation and the external floating IP coming out of the network controller to get back and forth between what's going on inside the data plane infrastructure and, let's say, the production or helper services on our management infrastructure. Does that answer your question?

So, once again, our CI infrastructure is external to Cloudbase's network and external to Microsoft's network; it is its own dedicated infrastructure that I built for this. You have to realize a lot of this has to do with how things happened historically. In all honesty, as I said earlier, when I started at Microsoft, the typical corporate lab infrastructure was heavily proxied, with no direct internet access, all of that; you can imagine what a nightmare it was. So we literally had to go guerrilla-style and say, we're going to ignore all the rules. Now, I'm not going to advocate that to everybody, because we were in a very unique circumstance, trying to change Microsoft. I was fortunate enough early in my career at Microsoft to work with a lot of IT people there, where I would show up to a room full of IT people as the only IT person coming out of the side of the organization I was in.

So early on I earned the respect of the gentleman who runs information technology along the East Coast, and we've helped each other out through the years. He was the one who allowed us to have this environment, which is not a standard environment within Microsoft. And in all honesty, because I had been part of the team that originally put Linux on Hyper-V, we had already gone through this process once before, getting Novell the type of environment necessary to run the testing we were doing at the time, which was non-standard because of things like needing SSH access and direct internet access. We had done it so many times with him before that he knew exactly what it was, and he was able to help us navigate whatever we needed to get done. Now, that's the thing: we've been very, very fortunate to have key individuals inside the Microsoft organization to help us get there, but in all honesty we wouldn't have been able to achieve the level of detail and automation that we did had we started this in Redmond, just because of the bureaucratic nature of information technology policies. It's tough, right? And I'm sure everybody who's running continuous integration in an enterprise environment has the same bureaucratic, political, management-type discussions that we've had for years now. I hope that answers your question.

I'm going to make this very quick, because I know we're getting toward the end of the time. First, what you're saying about a separate network: we found exactly the same thing. I was previously working at Intel and we had our own third-party CIs, and keeping that in its own separate thing gave us so much more freedom to build up that infrastructure and do things the way we wanted. The second thing is that you were talking about Open vSwitch and the problems getting the patches you want to test, because they don't use Gerrit. As an informational thing, there's a tool called Patchwork, which the Open vSwitch mailing list uses. Its 2.0 release candidate, which came out about a week ago, has support for a feature called checks, which lets you post CI results there, and it also has a REST API you can use to pull in patches and drag them into your CIs. So that's probably something you could have a look at.

At the moment, the way we're doing it, we post directly to the mailing list: we send emails to the mailing list with the results, and we use polling to check whether new patches have been committed. We also have a script that checks whether more than one commit landed in the interval, and if so we go through all of them in the order they were received. But thank you for this; yes, series support and the rest of it would definitely make our life easier. Any more questions, guys? Ladies, go ahead.

Well, you have to realize that when we started doing this, all there was was DevStack. It runs fairly fast, it's fairly reliable, and it keeps close to all the components we need.
For instance, when we test, we didn't find any other ready option we could just pick up and take advantage of: if we test Nova, we can tell DevStack that Nova should come from a different repository and pull it from there, and so on. We also take advantage of the git checkouts Zuul prepares, which rely on that same infrastructure. So it was the easiest way to go forward. In a lot of cases we take the path of least resistance. In all honesty, I am technologically agnostic; I had to be. We do whatever it takes to move our testing, and our ability to maintain all of this, forward. So we're always open to ideas, and we're willing to share. Go ahead. Sure. Thanks. We'll be outside.