All right. Good morning, everyone. Welcome to Docker and Ironic, a match made in heaven. My name is Scott, and with me today are Vlad and Bernard to help present. We're all folks from Nuage Networks. I'm sure everybody recognizes the Docker whale. I'm not sure that everybody recognizes Pixie Boots, the Ironic bare metal bear, but that is the other key logo for this presentation. So what we're going to talk about this morning: I'll give a very brief introduction to containers, since anybody who's been to any of the other Docker sessions will already be familiar, talk a little bit about deployment approaches for containers in production data centers, and then we'll talk about why Ironic helps with that and walk through an example deployment demonstration.

So, containers versus virtualization. With traditional virtualization you have a hypervisor or a host OS, then a hardware abstraction layer, then the OS within the VM itself, then all of its libraries, and then the app sitting on top. Containers are a much lighter-weight solution because you get rid of the hardware abstraction layer and the additional guest OS. You're running your applications, with some custom libraries and binaries, natively on top of the host OS. So rather than dealing with multiple layers, you're just dealing with processes. Containers are great for some things: a single OS to manage, lower overhead, better hardware utilization. With Docker, the really neat thing is the lifecycle management, and you get very quick launch times because you're not booting an entire OS, you're just launching a process. But all good things in life don't come for free. Containers have issues, too. It's Linux on Linux only, and there are a bunch of issues around security. Because you're sharing the same kernel, even with cgroups and all of the other goodness that people are constantly putting into Linux, there are still multi-tenancy issues around process coexistence, around networking, and around storage.

So one approach that you'll see, which I call the Russian doll approach, is: eh, just throw it in a VM. That's great, and it is the easiest way to get started. For some workloads the performance is perfectly acceptable, but the overhead is substantially higher, and you're losing a bunch of the bare metal flexibility that you get with Docker. So for optimum performance and scale, you just deploy containers on bare metal. You put your hosts on a network in your data center and you build lots of really big homogeneous clusters. That works great for people like Google, Facebook, and Twitter, where you've got one application. It doesn't work so great for people, for instance, in financial services, where you've got high-security applications and low-security applications. Hospitals are another good example: there are places where you need to secure patient information, and there are other applications where you don't really care. So you've got all of these security considerations where a rogue container on hypervisor one, or on Docker host one, can potentially get to other containers throughout the cluster. The easy way to deal with that is separate clusters: high-security cluster over here, low-security cluster over there, and you just build that out.
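To make the "just launching a process" point concrete, here is a minimal sketch, not from the talk, assuming Docker is installed locally and an nginx image is available (both are placeholders): it starts a container and shows it sitting in the host's process table with no guest OS underneath.

```python
import subprocess

def run(cmd):
    """Run a shell command and return its stripped stdout."""
    return subprocess.check_output(cmd, shell=True, text=True).strip()

# Launch a throwaway container in the background (image and name are illustrative).
run("docker run -d --rm --name pid-demo nginx")

# Ask Docker for the container's PID as seen from the host.
pid = run("docker inspect -f '{{.State.Pid}}' pid-demo")

# The same PID shows up in the host's process table: no guest OS, just a process.
print(run(f"ps -o pid,comm -p {pid}"))

run("docker stop pid-demo")
```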
Separate clusters like that lose a bunch of flexibility, though, because if you don't size your containers or your clusters appropriately, you end up needing to take hosts from one, move them to the other, and re-network things. So avoid it if you can. The other thing is that containers don't exist in a vacuum. Containers need to talk to VMs. Containers need to talk to bare metal. You need some way to make sure that your containers are only talking to the things they're supposed to talk to, and with the way Docker networking works, that's difficult to do because of all of the layers of NAT and such. So how do we approach this in a simple way, without requiring changes to Docker and without requiring changes to many of the things people are already deploying? I'll hand this over to Bernard, and he'll talk through one approach to that.

Thank you, Scott. So one of the solutions that we believe we can build uses Ironic. For a long time, OpenStack has been mainly focused on virtualization use cases, typically using Nova Compute together with KVM to start VMs. For a couple of release cycles now, Ironic has been in the picture to allow you to boot bare metal servers directly from Glance images. So that's basically Ironic with OpenStack. How does it actually work? Well, Ironic is connected to the power control interface of your bare metal servers. This is typically an out-of-band management interface such as iLO or IPMI, which allows Ironic to boot, stop, or reboot bare metals and later on to PXE boot an image. So just as you use Glance to store your virtual machine images with KVM, with Ironic you use Glance to store your bare metal images, and in the same way, you PXE boot those images directly onto the host. The host boots and gets started directly on the bare metal.

The other thing we built specifically at Nuage is to have Ironic integrate nicely with hardware gateways, in this case the Nuage gateway, which allows Ironic to reconfigure the port on that gateway on the fly, providing network connectivity and network provisioning from the bare metal through the gateway in order to be able to reach VMs in your data center, for example. OK, and typically one of the problems that Scott highlighted is that security issue: you want to separate clusters, right? So in this case, we have tenant 1 and tenant 2, and what you want is to have tenant 1 completely separated from tenant 2, both at the compute level and at the networking level. How do you do that? Well, with Ironic you boot a cluster of separate bare metals, but then you also create with Neutron two separate networks that are instantiated on the gateway. Each bare metal has a single link to a port on the gateway. When the bare metal boots, we instantiate the network policy directly on that gateway, which gives you separation both at the compute level and at the network level on the gateway. The other advantage of using Ironic and bare metal directly is that you can actually run at full line speed. For example, if you use a 10 gig NIC on your bare metal, you will be able to use those 10 gigs, because you don't need an OVS. Since it's pure bare metal, you output the frames directly on the wire. You don't need an L2 or L3 Neutron agent, for example, since our gateway is natively capable of doing the network functions such as routing and switching.
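Under the hood, the power control Bernard describes boils down to a handful of IPMI calls. Ironic's power driver issues the equivalent of these for you from each node's stored driver info; the sketch below, with a placeholder BMC address and credentials, is only meant to illustrate the power-on and PXE sequence that precedes image deployment.

```python
import subprocess

# Placeholder BMC coordinates; in a real deployment Ironic stores these per node
# (driver_info) and issues equivalent calls itself.
BMC = {"host": "192.0.2.10", "user": "admin", "password": "secret"}

def ipmi(*args):
    """Run a single ipmitool command against the node's BMC."""
    cmd = ["ipmitool", "-I", "lanplus",
           "-H", BMC["host"], "-U", BMC["user"], "-P", BMC["password"]] + list(args)
    return subprocess.check_output(cmd, text=True).strip()

print(ipmi("power", "status"))        # e.g. "Chassis Power is off"
ipmi("chassis", "bootdev", "pxe")     # make the next boot go to PXE
ipmi("power", "on")                   # power on; the node then PXE boots and the image is deployed
```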
So with this solution, you can run at full line speed, 10 gig or even 40 gig. The other thing you want to be able to do, and this is very important, is to bridge between your bare metal and the VMs in your cloud. So typically you're running OpenStack: you've got Nova Compute, a couple of KVM hypervisors, and plenty of VMs running in your cloud. What you want to be able to do now is to bridge those with the bare metals that you start. By using Nuage and Neutron with the plugin we have today, you are able to instantiate a network policy on the gateway and bridge bare metals and VMs into a single overlay. And finally, since Ironic is an OpenStack module and it's nicely integrated, you can use Heat to create a whole stack with one single API call: bare metal on one side, VMs on the other side, and the network provisioning in between. So you can very easily create a template, start that template with an API call to Heat, and that template will create everything from the bare metal to the VMs to the network, both on the gateway and in the overlay.

So I think it's time for a bit of a demo. What we want to do is highlight the use case that Scott presented, which is basically a Mesos cluster launch. I don't know if you're familiar with Mesos, so let me quickly go over it. What you want is a single-tenant Mesos cluster which is going to run containers. The way Mesos works is that you've got the Mesos master, which you can think of as the Nova scheduler for containers, and then you've got the Mesos slave, which is the workhorse that starts the jobs as containers directly on the bare metal. So how do we want to do this? We want the Mesos slaves on bare metal directly, and we want the Mesos master as a VM. And of course, you want network connectivity in between, using the Nuage gateway and our overlay plugin. Again, in this case it's a single tenant, and what we want to show later is adding a second tenant and having both nicely separated.

So here we go. We've got a whole OpenStack cluster. I'm going to log in to the controller as tenant one. Here we go. Perfect. So initially there is nothing, no instances. Exactly. And we pre-provisioned one network, which is the network we want to use to bridge the VMs and the bare metals. Let's go to the CLI and use Ironic to show all the nodes. In this case, we've got 10 bare metal nodes that we can use to deploy our image, which in this case will be a Mesos slave. Again, why? Because we want to run Docker on bare metal directly. Each of those bare metals is connected to one specific port on our Nuage gateway, and this is the key integration we have with Nuage: you can specify which bare metal is connected to which gateway port. So let's start the whole stack with Heat. I'm going to Orchestration, and I'll launch my stack, which defines the VM side, the bare metal for my Mesos slaves, and the network in between. Here we go. So, Mesos cluster one, and all of those are the parameters that we need to specify in order to launch that stack. OK, for the bare metal, we use the Mesos bare metal image; we pre-provisioned that image in Glance. For the VM, we use the Mesos master image; again, it's in Glance. And we launch it. Here we go. OK, it's creating it, with a beautiful representation of all the objects interacting together, and it's building it.
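For a rough idea of what that "one single API call" looks like from the command line, here is a hedged sketch using the standard OpenStack client; the template file name, stack name, and parameter names are purely illustrative, since the actual template from the demo isn't shown.

```python
import subprocess

# Illustrative template and parameters; the real template in the demo defines
# the Mesos master VM, the bare metal slaves, and the network between them.
cmd = [
    "openstack", "stack", "create",
    "--template", "mesos_cluster.yaml",
    "--parameter", "master_image=mesos-master",
    "--parameter", "slave_image=mesos-baremetal",
    "--parameter", "cluster_network=tenant1-bridge-net",
    "mesos-cluster-1",
]
subprocess.run(cmd, check=True)

# Watch the stack build; the bare metal nodes take a few minutes to PXE boot and deploy.
subprocess.run(["openstack", "stack", "list"], check=True)
```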
So it takes a bit of time to boot the bare metal, about five or six minutes. Why? Because, if you're a bit familiar with Ironic, you need to go through an initial IPMI power-on and PXE boot, get a deploy agent, then reboot and get the final image over PXE. That's for the technical details. So here it is. At this point, it's spawning. And if I go back to the CLI and show all my nodes, what I can see now is that I have two nodes in wait call-back. That basically means I pushed my image to the bare metal node and I'm waiting for the node to come back. OK, here we go. And finally, I skipped about five minutes of video here, but we've got both bare metals, as you can see, one Mesos master, and one Marathon to manage all your Docker containers and jobs.

OK. So again, the key value here is that between your bare metals and your VMs, we were able to stretch one single network dynamically, creating both the overlay and the gateway provisioning, so that those two specific Ironic nodes are on the network and able to reach those VMs. And the nice thing is that Ironic objects show up in Nova, so we have those four Nova compute objects. Here we go. OK, so let's connect to Mesos. At this point, what you can see is that this is the Mesos master. We've got two slaves, which are the two bare metals. The master is running as a VM, and the two bare metals booted up and were able to join as Mesos slaves, which means Mesos should now be able to accept jobs and launch them on those two bare metal slaves. Again, why? Because we bridged the network in between. What we've also got is 16 CPUs; this will be important later. Then we've got Marathon, which, as I said, you can see as orchestration for Docker, integrated with Mesos. Here we go.

The next thing I'm going to do is launch a job. Let's say you want to launch Hadoop, for example. You push that job to Marathon, Marathon schedules it on Mesos, and Mesos pushes it to the two slaves on the bare metals. Here we go. OK, so at this point I've scheduled my job on Marathon. That job is actually made of two sub-tasks, as you can see over here. And if I go to Mesos, I can see that those two tasks are staging. This means they're being pushed to the bare metals, to the slaves, and the slaves are downloading all the executables in order to be able to run them. And you can see it's nicely load balanced across those two nodes, 58 and 59. Here we go. And now it's running.

So let's scale it a bit. At this point, it's two sub-tasks. Let's go to, I think, 40. And what we integrated into this demo is an auto-scaling capability. So I want to scale it; let's say you want to grow your Hadoop cluster to 40. In this case, it's going to push those 40 sub-tasks to Mesos. Here we go. And you can see Mesos is scheduling them one after the other on those two bare metals. What you can also see, on the lower left side over here, is that little by little those 16 logical CPUs are being used up, one by one, up to the limit. So at this point we're at 12, 13.5. Here we go. 15.5 and 16. Perfect. So at this point Mesos is completely overloaded; it cannot start any more jobs. Marathon is stuck at 32 out of 40, each sub-task having been defined here as half a CPU. And so the thing we integrated is that auto-scaling capability: basically, if I refresh my instances, what you can see is that we automatically spawn a new bare metal node dynamically. So at this point, Ironic is creating that new bare metal.
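For reference, pushing and then scaling a Docker job on Marathon is a couple of REST calls against its v2 API. This is a rough sketch rather than the exact job from the demo; the Marathon URL, app id, and image name are placeholders.

```python
import requests

MARATHON = "http://marathon.example.com:8080"  # placeholder address

# Define a Docker job: each sub-task asks for half a CPU, matching the sizing in the demo.
app = {
    "id": "/hadoop-demo",                              # illustrative app id
    "cpus": 0.5,
    "mem": 512,
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "example/hadoop-worker"},  # illustrative image
    },
}
requests.post(f"{MARATHON}/v2/apps", json=app).raise_for_status()

# Later, scale the same job out to 40 sub-tasks; Mesos spreads them over the slaves
# until the cluster's CPUs are exhausted.
requests.put(f"{MARATHON}/v2/apps/hadoop-demo", json={"instances": 40}).raise_for_status()
```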
We've now got three bare metals, three slaves that can run Docker. You can see that the third slave just joined the cluster over here, and one by one the remaining eight jobs are being launched on that third bare metal. We've now got 24 CPUs. That's it; it's scaling up. So again, by using our gateway and Nuage, we're able to dynamically add those ports into that overlay network and bridge it with those Mesos and Marathon VMs. Here we go. And what I want to show you, which is key for this demo, is the VSD. This is our policy abstraction, which actually runs the whole cloud, both the gateway and the overlay to the VMs. If I connect, I can see the network that I use to bridge the bare metals with the virtual machines. Those two objects at the top in green are my virtual machines, and then I've got three objects lower down, which are my gateway ports. You can see, for example, this one is port 114, and it was created by Ironic and Neutron together. When you boot your Ironic node, this port is dynamically put into the same network as the VMs. So it's one single tool to bridge your network across both bare metal and virtual machines. And I think Scott is going to go over the second part of the demo.

Thanks, Bernard. So Bernard showed how to build and scale one cluster with Ironic. But the interesting part is the flexibility of being able to build and scale out multiple workloads and multiple clusters with Ironic, with separated tenants. So we've got tenant one, which we already built out. I'm only showing two of the three hosts there, but you get the idea. And then we have the second cluster, which we're going to build now. What you have is traffic passing from the VMs through the vSwitch on the hypervisor, and then it's encapsulated in standard VXLAN and can integrate either with a Nuage gateway, which has Layer 3 capabilities, or with a number of other gateways we support in Layer 2 mode. So there's quite a bit of flexibility there.

So let's switch back to the demo, and we'll log in as tenant two. This is going to look a little bit familiar if you were paying attention. So, no instances. We can go and take a look at the stack. We can launch the stack, and in this case we're actually launching the same stack. Just a little bit of Benny Hill music to wake people up and entertain them. We're launching the cluster in the same way. For those of you who were entertained by the music and not paying close attention, it was exactly the same Heat template that we used to launch the second cluster. So we get a Mesos master node, we get a couple of slave nodes, and those launch quickly. You can see that we've got three nodes that are active and two nodes that are in process. Again, you can see everything from Nova, which is great, and if we were showing Neutron, you could see all of the ports allocated to the different networks. We'll see that in a second. So here we go. The stack is up, and we can go through exactly the same process of completely independently autoscaling the second cluster. The first cluster is happily running. As long as you've got capacity someplace in your data center attached to a gateway, you can use those bare metals to attach to any of your tenants. So there's no need to build out independent clusters and dedicate them to specific tasks. So here we go. We've activated two slaves, and we're going to use them to launch another job. This is a different job.
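The talk doesn't detail how the auto-scaling shown in both clusters is wired up, but the general shape of such a loop is straightforward: watch Mesos for CPU exhaustion, then ask Nova to boot another Ironic-backed slave onto the bridged network. The sketch below is purely illustrative; the Mesos endpoint, flavor, image, and network names are assumptions, and the real integration may be implemented quite differently.

```python
import time
import subprocess
import requests

MESOS_MASTER = "http://mesos-master.example.com:5050"  # placeholder address

def cpus_free():
    """Read total vs. used CPUs from the Mesos master's metrics endpoint."""
    m = requests.get(f"{MESOS_MASTER}/metrics/snapshot").json()
    return m["master/cpus_total"] - m["master/cpus_used"]

while True:
    if cpus_free() < 1.0:  # cluster effectively full; add one more bare metal slave
        subprocess.run([
            "openstack", "server", "create",
            "--flavor", "baremetal",             # flavor mapped to Ironic nodes (assumed name)
            "--image", "mesos-baremetal",        # the Mesos slave image in Glance (assumed name)
            "--network", "tenant1-bridge-net",   # the bridged overlay network (assumed name)
            f"mesos-slave-{int(time.time())}",
        ], check=True)
    time.sleep(60)
```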
This is the Play framework, and we're going to scale it up to 40 instances in exactly the same way. So, a little bit of a reprise. You can see the active tasks; the resources get used up until the point where they're exhausted. Now, if you were doing this for real, you would set the threshold so that you wouldn't completely block launching new jobs on your cluster before you started scaling it out, but in our case we decided it was fine to do that. So here we go. We've got our next bare metal. That's happily launching, and it will shortly be connected into the network, with Docker booted on it. Within five minutes you've got additional capacity on your cluster, completely independent, completely separated, and fully flexible. So it really does make launching high-performance Docker clusters easy and painless. And here we go, we're starting to launch the rest of the job, and we're good. So, ramping up, we've now used our 20 CPUs, and we've still got a little bit of space. Now what I'm going to show is exactly the same thing. I've actually launched a fourth bare metal node here, but you can see that you've got the two networks, tenant 1 and tenant 2, with full visibility of both. And you can see that even though we've got ports on adjacent switch ports on the same gateway, and it could be different gateways, it all works together.

So, a couple of other points. In summary, we've connected Ironic, Heat, and Nova to build a flexible, configurable bare metal plus VM-based solution using Neutron. We happen to have the only Layer 3 capable VXLAN gateway that integrates directly with VMs. We're working with other people to share what we've got so that they can also add Layer 3 and integrate with us. So that's coming; you'll have choices of other gateway vendors. And we're also working with the gateway team in Neutron to make this a standard API. But the key point here is the consistent networking and policy that Nuage supplies. We also provide seamless inter-DC capability. So if you have a cluster in one data center and you've run out of capacity, you can bring VXLAN traffic through tunnels over to your other data center and have that seamlessly built into your cluster. You're going to lose a little bit in latency, so you probably don't want to do that for long, but it gives you capacity where you wouldn't necessarily have it otherwise.

A couple of other points. I could do two or three other presentations on all the cool stuff we do. But on networking policy across bare metal, VMs, and containers: we've recently demonstrated 100,000 containers across 200 hosts launching and converging, with traffic passing, in under 10 minutes. So you can have your bare metals, your VMs, and your containers all working nicely together, and it all behaves well. It's scalable and flexible and is supported under OpenStack. So thank you very much. Anybody have any questions? No? All right. Oh, a question. So why are you using a hardware gateway to manage the network? Sorry? Why are we using a hardware gateway to manage the network? Right. Because the other option, using provider VLANs, or provider networks on VLANs, and running VLANs through your network, means that you're limited by the number of VLANs you can run to any given cluster, and you don't have the flexibility. And most modern top-of-rack switches can terminate VXLAN.
And then you also don't need to bring traffic back to a central routing layer; you can route directly at the endpoints and send traffic straight to the other end. So, can the containers on different hosts also communicate with each other? If the containers are on a common network, then they can communicate with each other based on how you've set things up in Docker. So you have the outer enforcement layer, where cluster one is a security zone and cluster two is a security zone, and then you have all of the Docker security within that. But what you see is that people don't believe Docker security is adequate for many workloads. Because in your chart, I didn't notice that you were using any network manager to make sure that two different containers on two hosts can talk with each other; I did not see that you were using Flannel or Weave or whatever else for that. So if you want to deploy other container workloads, or if you want a container network abstraction, we would strongly encourage you to deploy the Nuage VRS with your Docker containers. It works great. It scales to 100,000 containers. It's really fast. Flannel and Weave may get there eventually, but what we were focusing on here was Docker on bare metal and the Ironic use case. So, like I said, we can't talk about everything in 40 minutes. But excellent question. Anybody else? All right. Thank you very much.