Hello and welcome to our talk. We are Dmitry Tantsur and Ilya Etingof. We work for Red Hat, on the OpenStack Ironic team. Our talk is called "Bare Metal in the Cloud: Isn't It Ironic?". In this talk, we're going to explain what bare metal provisioning is and why it is becoming increasingly important these days. We will introduce you to the Ironic project and its place in the cloud landscape, and show how it works based on a simple, trivial workflow example. At the end of this talk, we'll give you a glimpse of what the future holds for Ironic.

The idea behind bare metal provisioning, bare metal allocation, is simple: let the cloud user request and subsequently receive a bare metal machine, a physical machine, in the very same way as they do virtual machines. The reasons for this desire are many. Just to name a few: some applications rely on hard-to-virtualize hardware like GPUs or FPGAs. Some applications deal with sensitive data and therefore require perfect tenant isolation for security reasons. Some applications just need raw computing power. Among all these reasons, one of the emerging ones, I would say, is the desire to manage the cloud infrastructure itself. Because in the end, the cloud is software, and it runs on bare metal, and when you have a large cloud, a large set of computers, automating all things bare metal, all things deployment, is a huge win.

The Ironic project is a bare metal provisioning, or orchestration, service. It's a web service, it's API-driven, and it ships with CLI and graphical user interfaces. It's quite old; it's been around four or five years. It's used by many cloud operators around the world, which probably explains why it has a quite active and diverse upstream community and many contributors. Besides being a hardware management tool, Ironic enjoys deep and wide vendor support. Vendors join the upstream effort.
They contribute drivers for their hardware to make it compatible with Ironic. Ironic started its life as a fork of the OpenStack Nova bare metal driver. Since then it has grown into a standalone, I would say sophisticated, tool. It can be used as a standalone tool for hardware provisioning for whatever purpose, or it can be used within OpenStack. In any of these configurations, the Ironic system consists of the orchestration agent called the Ironic conductor, the deployment agent, which runs inside the bare metal machine being deployed, and the RESTful API for communicating with the world. If Ironic is used within an OpenStack cloud, Nova has to have the Ironic driver.

More often than not, servers designed for data center use are not just motherboards. They are accompanied by a small satellite computer called a BMC, a Baseboard Management Controller. This computer is always up, it's always on the network, and most importantly, it has a very intricate relationship with the main system: it can manage it in many different ways. Inside this BMC computer, there is an agent, software, running to control it. This software communicates with the world by means of a specially designed hardware management protocol. There were many protocols in the past; the most modern and emerging protocol is called Redfish, and it is rapidly displacing the previous king, IPMI, and many other vendor-specific protocols, because vendors are vendors: everyone has to come up with their own protocol for that thing. A BMC can do many things; the most important ones, I would say, are the ability to power cycle the system, that is, to control power remotely, and to change the boot configuration to boot the system from the network or from the local drive.
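To make the power-cycle example concrete, here is a minimal sketch of building a Redfish reset request, the standard way to power cycle a machine through its BMC. The BMC hostname and system ID are made up, and this only constructs the request; a real client (Ironic uses the sushy library) would send it over authenticated HTTPS.

```python
import json

def build_reset_request(bmc_host, system_id, reset_type="ForceRestart"):
    """Build the URL and JSON body of a Redfish ComputerSystem.Reset action.

    The host and system ID here are hypothetical placeholders.
    """
    url = (f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
           "/Actions/ComputerSystem.Reset")
    body = json.dumps({"ResetType": reset_type})
    return url, body

url, body = build_reset_request("bmc.example.com", "437XR1138R2")
```

`ResetType` is one of the values defined by the Redfish schema; `ForceRestart` is the hard power cycle mentioned above.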
The most sophisticated BMCs can do magical things: they can change BIOS configuration on the machine, they can deal with hardware RAID configuration, they can use virtual media boot, which is a way to boot a machine from a virtual CD with an image supplied by the operator, and many other things.

We are going to explain how Ironic works based on the bare metal machine life cycle. To make it easier to explain, we separated this life cycle into pieces: preparation, deployment, and tear-down. The preparation step starts with so-called inspection. Whenever Ironic becomes aware of a bare metal machine to manage, it can gather more information about it, meaning the hardware configuration and capabilities of this machine, which is useful for scheduling, for picking the machine out of many, and possibly for inventory purposes, for putting this machine into the context of the whole data center inventory. With Ironic, inspection can be done in two ways: one is called out-of-band and the other in-band. Out-of-band inspection is based on communicating with the BMC. So Ironic goes out and talks to the BMC and asks it what kind of machine this is, what its serial number is, what CPU it has, what kind of memory it has. In-band inspection is performed by the Ironic agent running inside a RAM disk, which is booted on the machine being inspected. So Ironic boots this RAM disk, runs the Ironic agent, and then the orchestration agent, the Ironic conductor, talks to this embedded Ironic agent inside the machine and gathers all the above information. Beyond that, the Ironic agent can run benchmarking, and it can listen on the network for, say, LLDP frames to learn which switch port this machine is plugged into. Once the inspection is over, the next step, which by the way can be run many times in different contexts, is known as cleaning. Cleaning can also be done in in-band and out-of-band ways.
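As an illustration of the out-of-band path, here is a sketch of turning a Redfish System document into a flat inventory record. The field names follow the Redfish schema, but the document is heavily abridged and the values are invented; a real inspector would fetch this JSON from the BMC.

```python
import json

# A made-up, abridged Redfish System resource of the kind a BMC would return.
SAMPLE_SYSTEM = json.loads("""
{
  "SerialNumber": "SN-0001",
  "ProcessorSummary": {"Count": 2},
  "MemorySummary": {"TotalSystemMemoryGiB": 128}
}
""")

def inventory_from_redfish(system):
    """Flatten the fields that scheduling and inventory care about."""
    return {
        "serial": system["SerialNumber"],
        "cpu_count": system["ProcessorSummary"]["Count"],
        "memory_gib": system["MemorySummary"]["TotalSystemMemoryGiB"],
    }

inventory = inventory_from_redfish(SAMPLE_SYSTEM)
```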
The idea of cleaning is to make sure that the machine being deployed is like new. It can be recycled from a previous installation, but our goal here is to make sure that every time we deploy, the machine is the same. During the cleaning phase, what Ironic can do is reset BIOS settings or apply new BIOS settings, reassemble the hardware RAID, and update the firmware. These things are probably more often done out-of-band. With in-band cleaning, we can wipe local drives from the inside, which is easy, and we can sometimes deal with hardware or software RAID, depending on the machine, on the hardware. By the end of this cleaning cycle, we have a like-new, factory-reset machine. And with this, I'm passing the mic to you.

Okay. Now we have a pool of machines, all cleaned and ready for deployment, all more or less uniform. What happens when you deploy? Of course, you start with picking a bare metal machine for your deployment. Historically, we have been relying on the OpenStack Compute service for this goal, but in the upcoming release, we're going to introduce our own API for picking a machine based on certain simple criteria: its resource class and its traits. Now, after you've picked the machine, the deployment starts. You connect it to the deployment network, which is a network where the deployment agent will run. You configure booting of the deployment RAM disk via PXE or via virtual media; I'll go into some details a bit later. The deployment agent starts and, orchestrated by the Ironic conductor, does actions like partitioning the target device, flashing the image, configuring local boot, maybe some custom actions. When it's done, we disconnect the deployment network and connect the networks that were requested for the instance; it may be one or more. Then comes booting of the final instance, again via net boot or local boot, and we reboot into the final instance. At this point, you can use it. Let's cover a few topics in depth. Networking.
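Before going into networking, the life cycle described so far can be pictured as a small state machine: a cleaned node is available, deployment takes it through deploying to active, and tear-down goes back through cleaning. This is a deliberately reduced sketch; Ironic's real provision state machine has many more states (wait call-back, clean wait, the error states, and so on).

```python
# Reduced sketch of the provision states described above. Real Ironic has
# many more states and events than this.
TRANSITIONS = {
    "available": {"deploy": "deploying"},
    "deploying": {"done": "active"},
    "active":    {"delete": "deleting"},
    "deleting":  {"clean": "cleaning"},
    "cleaning":  {"done": "available"},
}

def advance(state, event):
    """Return the next state, or raise if the event is not allowed here."""
    if event not in TRANSITIONS.get(state, {}):
        raise ValueError(f"event {event!r} not allowed in state {state!r}")
    return TRANSITIONS[state][event]
```

Guarding the transitions like this is what lets the conductor refuse nonsensical requests, such as deploying a node that is already active.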
As I mentioned, we have two kinds of networks we operate with. We have service networks, which are used for cleaning, for provisioning, for a process called rescue, for inspection: essentially anything that happens in-band. And we have the tenant networks that you requested for your machines. We have three network management models in Ironic. First, you can use your own infrastructure: whatever you have in your data center, we can use it. You have to pre-configure DHCP to boot the deployment RAM disk, and you have to request your instances with local boot. So every time they net boot, they get deployed; every time they boot from the local disk, you just get into the instance. We have good cooperation with the OpenStack networking service called Neutron. We can use it in two modes. With a shared network, essentially all networks are flat and shared between different machines; in this case, we only use Neutron for DHCP options and for PXE configuration. And the most interesting of all of these: we can integrate with Neutron for advanced switch management. In this case, we can ask Neutron to switch the ports the machine is connected to onto different VLANs, essentially allowing integration with virtual machines running on OpenStack Compute, allowing tenant isolation, allowing isolation between service and tenant networks, which is quite important to make sure, for example, that your tenants cannot access your control plane and cannot access each other.

On to boot configuration. I mentioned PXE, which is an old, old technology based on the TFTP protocol, where you provide DHCP options for your machine to boot from a network location. We support iPXE, which is an improvement on PXE using HTTP instead of the less reliable TFTP. More interestingly, as Ilya mentioned, we support virtual media, which allows Ironic to talk to the BMC to instruct it to boot your machine from an ISO image located on an HTTP, CIFS, or NFS share.
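For the PXE and iPXE cases, "providing DHCP options" means roughly the following sketch: the DHCP reply carries the boot server address and a boot file name (DHCP option 67). The values below are illustrative, not what Ironic emits verbatim.

```python
def boot_dhcp_options(server, use_ipxe=False):
    """Sketch of the DHCP options that network boot hinges on.

    With plain PXE the boot loader is fetched over TFTP; with iPXE the boot
    file can be an HTTP URL instead, which is faster and more reliable.
    """
    if use_ipxe:
        bootfile = f"http://{server}/boot.ipxe"  # fetched over HTTP
    else:
        bootfile = "pxelinux.0"                  # fetched over TFTP
    return {
        "tftp-server": server,      # DHCP option 66 / next-server
        "bootfile-name": bootfile,  # DHCP option 67
    }
```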
Ironic has quite decent support for UEFI, including support for secure boot in certain scenarios. We also support hybrid BIOS and UEFI images, with support for creating a partition table that works for UEFI and for BIOS, including some rare cases like using GPT with BIOS legacy boot. When we deploy the images, again, we have several options. They are all orchestrated using the Ironic agent running in-band on the machine, in a temporary in-memory RAM disk. The old way was for the agent to create an iSCSI target and for the conductor to connect to this target and flash the image. A newer and more reliable way, which we have supported for quite some time, is for the agent to pull the image via HTTP from your HTTP server or from the OpenStack Swift service. It can do in-memory conversion, but if the image is already raw, we can stream it directly to the block device, skipping any caching in memory. This is quite efficient. We are moving towards supporting BitTorrent, to simplify the simultaneous boot of many machines with large, heavy images over a maybe less reliable network. So if you're interested in this, come contribute; we need help.

Other features that I'm not going to go into in detail are firmware updates for certain vendors; serial console support for your Ironic instances; and rescue mode, with which you can boot a RAM disk right from your instance. For example, if you need to repair it because it does not boot any more, you can boot a RAM disk, SSH into it, mount the disks, do any repairs, and then return the instance into operation. We support port groups, which is essentially port bonding. And the thing we are working on, I think it's called deploy templates, is a very interesting way to run the same operational steps we use for cleaning, like building RAID.
But at deploy time, on demand: a user of OpenStack Compute, for example, will be able to say, build this RAID, get me an instance with this RAID template, and Ironic will automatically build the RAID for you and return an instance with the RAID already built. Then you will not need to keep a pool of machines with a specific RAID or BIOS or any other configuration. We are working on graphical consoles, essentially allowing you to VNC into your instance. We are also researching new use cases where we would like your input: hyperconverged infrastructures and container engines which use Ironic as a satellite service for scaling an installation up and down. I know people are working on Kubernetes integration for Ironic, for example; I forgot the term, but essentially as part of the Cluster API. We are looking at edge architectures and various approaches to make Ironic work better in highly distributed, geographically distributed environments, including attempts to federate Ironic across geographical locations.

And that's it. Come learn more from our documentation; it's quite extensive. Come talk to us: we have a mailing list, openstack-discuss, and come talk to us on our IRC channel. We are very friendly, we don't bite. Of course, we welcome contributions. We have a very active community, and we would like to review your code. We want to hear about your use cases. It's quite important for us to know what people would like to use Ironic for and how they would like to use it; that allows us to plan our features. Next. Any questions?

How does Ironic boot when we have a UEFI ISO? As far as I remember, there is an option, actually used on the installation ISOs of other Linux distributions, to create a universal layout; I forget the details, but the point is to not boot what is already installed. So, all BMCs have a feature, exposed in IPMI and everything else, to configure the boot sequence. It's possible in legacy mode; it's possible in UEFI.
It's sometimes different in each mode, but we abstract it all away for you. So we tell it to boot from the network, or boot from virtual media, this ISO, or boot from the local disk. We can control that. Yes, that's a big pain point for...

How do we make sure that the tenant didn't mess up the firmware? This is a difficult topic. Essentially, I know that public cloud providers who use or used to use Ironic, and yes, there are public cloud providers who use Ironic, cooperate with vendors to make sure that it's not possible. Because it's kind of a catch-22 here: we can update firmware from Ironic, we can reset the BIOS, but if something from inside has already messed it up, it can just ignore our request. So there's no silver bullet here. If your tenants are completely untrusted, you have to cooperate with vendors to make sure you can disable this messing with firmware from within. Will we trust the vendors on this? Yes, sometimes.

I have a question: how feasible is it these days to actually deploy Ironic without OpenStack? Is it possible to deploy Ironic without OpenStack? Yes, it is. It has been possible for quite some time, and we're working on improving that. For example, I'm working on making the dependency on RabbitMQ as a messaging medium optional, and I have a patch for that. There is a project called Bifrost, which allows deploying Ironic without OpenStack. You can also deploy Ironic with some OpenStack bits, picking Keystone and Neutron, with or without Glance, using HTTP locations. But yes, it's very possible, and we are very, very interested in people doing that and giving feedback. I can't hear you, I'm sorry. Yeah, of course, HTTPS is also supported, and you can provide custom certificates for that, and HTTP proxy settings as well.

Similarities and differences with the Foreman project? I haven't looked at Foreman in depth, to be honest. There is certainly quite some overlap in the provisioning part.
I think there is less overlap in the life cycle part, as far as I know, like BIOS and all the things around returning an instance. Our mission is a bit more towards serving bare metal to clients, in a bit of a cloud approach, even without OpenStack: API-driven, uniform, with these cleaning features. I think Foreman is a bit more tied to a traditional installation approach, but again, pardon my ignorance, I may be missing something here.

For choosing a bare metal node, are you planning to use the Placement project? So, the question is whether we plan to use the OpenStack Placement service for picking a bare metal node. Both. The answer is both. The work I'm doing allows two backends for this; we call it the allocation API. It can be database-based for the case when people don't want to use Placement. But my next plan is to use Placement as well, because it will allow Ironic's standalone API to be used without conflicts with the Nova Compute API on the same set, on the same pool, of bare metal nodes. So yes, we plan on that. Contributions welcome, by the way. Anything else?

You get a lot of jokes about the Ironic name? Do we get a lot of jokes about the Ironic name? Oh yeah, and we make a lot of jokes about Ironic. Isn't it ironic? Anything else? Yeah, the question is about some project; I can't hear the name. Digital Rebar? No, I haven't heard about it. Okay, if there are no more questions: thank you. Come talk to us.
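To make the allocation idea from that last technical answer concrete, here is a minimal sketch of picking a free node out of a pool by resource class and traits. The node format and the function name are illustrative only; they are not Ironic's actual allocation API.

```python
def allocate(pool, resource_class, traits=()):
    """Return and mark the first free node matching class and traits, or None.

    Illustrative only: real Ironic resolves this through its allocation API
    (database-backed) or through the OpenStack Placement service.
    """
    for node in pool:
        if (not node.get("allocated")
                and node["resource_class"] == resource_class
                and set(traits) <= set(node["traits"])):
            node["allocated"] = True
            return node
    return None

# A hypothetical pool; names, classes, and traits are made up.
pool = [
    {"name": "node-1", "resource_class": "baremetal-gpu", "traits": ["CUSTOM_GPU"]},
    {"name": "node-2", "resource_class": "baremetal", "traits": []},
]
```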