So I think in theory we've got a minute to go, but people can still squeeze in. All right, so I'm going to talk about running OpenStack on OpenStack, and why that's an interesting thing to talk about.

This is the model we tell people about OpenStack: these lovely big boxes, compute, networking, storage. What we don't tell them is what it looks like when you talk about deployments. That isn't a lovely pretty model; that's the reality. And it's not because we've made it more complex than it needs to be.

We've deployed a public cloud, and other people have deployed public clouds, and none of those people have said, "Hey, it was so easy, we're going to do it again and again and again and just forget about it." Installation is quite straightforward, although if you look inside DevStack it's not all that straightforward. But running it and upgrading it, that's where things really start to get complex.

The complexity comes partly from that diagram back there, but also because other things can go wrong. You can have bugs in the code, you can have cruft or entropy build up on a node, and you can have hardware failures.

Now, hardware is the simplest thing to talk about: make all the things highly available. There are a lot of sessions this week on high availability for different services, and this is an important reason why. The cruft-and-entropy thing doesn't turn up when you install your environment; it turns up when you upgrade from Grizzly to Havana, or from Havana to some release beginning with I. And bugs are really interesting to us, because we're trying to deploy from trunk, even snapshots of trunk.

The thing about a release is that it's a well-known quantity. Other people deployed it three weeks before you; you know what their experience was; you can take an educated guess about whether you're ready to upgrade or not. But if you're trying to run from trunk, and this session isn't about why you should or shouldn't do that, or if you just want the ability to do it, you have to expect that you're going to run into bugs daily. New ones. Fun ones.

So you need to be able to do CI/CD: continuously integrating the code and testing your deployment before you do it, or else you'll find out about those bugs when users hit them, not when your test environment does. And how do you make sure you test with exactly the same tools and processes that you use to deploy? If your test is different from your deploy, the results of your test no longer reflect the results of your deploy.

So the TripleO project, OpenStack on OpenStack, is about doing continuous integration and delivery of an OpenStack production cloud, or a test cloud, but primarily we're aiming at production. We want to drive the maintenance and installation costs down; they're not low enough at the moment. How far down can we drive them? With Heat plus Nova bare metal, we think we can drive them very, very low.

We're going to encapsulate the installation. And I should say a caveat up front: I'm going to talk about some science fiction in this talk. Not a lot. We haven't got this thing fully working yet, but we're a long way down the path. Tim Miller, who's been doing most of the work on the Heat rules for OpenStack for us, was saying earlier it's about 99% there.
Once we're at the point where we've got something working and scaling, then we can look at actually making it better and better. From an operations point of view, one of the key things is that we want the same API for deploying, scaling, and managing the cloud that we use to deploy, scale, and manage applications in the cloud, because then the skill sets and the knowledge that you have are reusable, and the monitoring you put in place to watch that API applies at both levels.

So this is the story that we've got. It starts with a developer making a change, because we're looking at continuous deployment, deploying off of trunk. The change goes through the Zuul and Gerrit system we've got here in public OpenStack, but you could also run a private instance of it, because the code is all public, all open. You build an image that contains the changed code. If you made a change in Oslo, that change might mean five or ten images; but if you made a change just in, say, the Quantum API server, you probably only need one new image, which would be a new Quantum API image.

You deploy that to bare metal and you test it: you make sure it works and runs its unit tests, that kind of minimum degree of capability. But then you deploy it into a production-shaped bootstrapped cloud. You start a cloud on one machine and you scale it out with this new image and with the old images, so you're testing cross-version compatibility, which is one of the weak points we currently have in our CI/CD story for all of OpenStack. When you're deploying to an existing cloud, you don't magically flip all your nodes from Grizzly to Havana. Even with the best deployment technologies around, it's going to be a 10-to-15-minute exercise to do that for a large cloud. So you're going to have API requests coming in on one version and being satisfied by backends running either the older version or the newer version, and you've got lots of scheduling issues to worry about there. We want to make sure it works, so we put load on the cloud and we do that upgrade under load.

Assuming everything comes out okay, we then take the result as permission to go and deploy. And here's the key thing: what we deploy is the same images we used to test. We don't do a separate build process; we don't install software into existing environments. We take those images and we deploy them. That's the overview; I'll go into all the details further on.
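As a rough sketch of that build-test-promote loop, everything here is illustrative: the element name, the flavor, and the smoke-test script are placeholders, not our actual pipeline.

    # Hypothetical sketch of the build-test-promote loop described above.
    set -e

    # 1. Build a fresh image containing the changed code ("quantum-api" is a
    #    placeholder element name).
    disk-image-create -o quantum-api ubuntu quantum-api

    # 2. Register it with Glance so Nova bare metal can deploy it.
    IMAGE_ID=$(glance image-create --name quantum-api --disk-format qcow2 \
               --container-format bare < quantum-api.qcow2 | awk '/ id /{print $4}')

    # 3. Boot it on a test bare-metal node and smoke-test it (placeholder script).
    nova boot --image "$IMAGE_ID" --flavor baremetal test-quantum-api
    ./run-smoke-tests.sh test-quantum-api

    # 4. Only after the mixed-version, under-load upgrade test passes does the
    #    SAME image ID get deployed to production; there is no separate build.
    echo "promote $IMAGE_ID"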
The problem space we think we're dealing with has these separate columns: provisioning of machines; software installation on the machines; configuration of the machines; and managing state, which covers different database schema versions, database migrations, Swift ring transitions. These are big state changes that can take a significant period of time to complete. All these slides are going to be published, so if you're taking photos, you'll be able to get a much better version from my Twitter stream later. And finally you've got orchestration, which is about making sure the right machines are on the right endpoints, and that all the machines know about each other and are connected and glued up properly.

We've got a list here of some of the different products in this space and our perception of them. So Razor does provisioning and a little bit of software; Puppet does software configuration and state management, but not really that much orchestration. I can guarantee that everyone who's got a vested interest in any of these products will have a problem with this slide, so I'm just going to say: this is our view on it. The blue boxes at the top are the components in OpenStack that are responsible for these things.

diskimage-builder, os-config-applier, and os-refresh-config are three things that we've been building specifically for this. They're on StackForge now. We're looking at what to do with diskimage-builder: we might propose it to be promoted to a full-blown project, or we might roll it into the Glance umbrella, because it deals with disk images. os-config-applier and os-refresh-config are very, very small tactical things, just so you don't need full-blown Puppet or full-blown Chef to do what we're talking about. You can replace them with Chef or Puppet, or Juju, if you feel that's the right way forward.

Nova bare metal is a key component for this, because we need to be able to deploy things to actual machines, and Nova bare metal can deploy a disk image to physical hardware. Heat is, as I said, our orchestration. Clint has a talk, I think on Wednesday or Thursday, about running OpenStack with it, which goes into the details of our templates and how they all tie together.

diskimage-builder is a tool that just knows how to take an existing cloud image, like an Ubuntu published cloud image or a Red Hat published cloud image, and transform it: install software on it and prepare it to run as an element in a Heat-deployed environment. So you could use diskimage-builder to put something like Hadoop on a cloud; we're using it to put the OpenStack components on an image. We build two big sorts of images. We build one image for bootstrapping, with everything glommed into one big machine, and we also build images that are very targeted: this is a Nova bare-metal compute image, this is a Glance image. That's how you get horizontal scaling without any cruft, because you've got a clean image every single time.
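To make that concrete, an invocation looks roughly like this; treat the element names as illustrative rather than a definitive list.

    # Build the all-in-one bootstrap image from a stock Ubuntu cloud image;
    # "boot-stack" stands for the element that layers the whole vertical
    # OpenStack stack onto the image.
    disk-image-create -a amd64 -o bootstrap ubuntu boot-stack

    # Build a targeted image that runs just one service (hypothetical element).
    disk-image-create -a amd64 -o nova-bm-compute ubuntu nova-baremetal-compute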
Who here is not familiar with Nova bare metal? Right, okay. Nova bare metal operates like a nova-compute process. As far as Nova is concerned, it is a nova-compute process, and it has resources it can manage. But it doesn't put itself in the compute nodes table; what it puts in the compute nodes table are synthetic records that refer to actual physical machines it can deploy to. Whenever it deploys an instance, it takes an entire machine, never a partial machine. The way it does the deploy is that it uses IPMI, or any other power manager, to reboot the physical hardware, and then the PXE boot process to copy a RAM disk of our own on. That RAM disk exports iSCSI, and we copy stuff on. For all the gory details, see the documentation. The key thing is that it's a Nova API, it's bare metal, and it's a machine image. It's not an installer: you're not running an installation process, you're taking an actual ready-to-go image and dropping it on.

Heat has got autoscaling and a bunch of other things, but the key thing for us is that it's the glue. It's the thing that's able to say: I know that I need a RabbitMQ server and a MySQL server and a Nova API server; I need passwords from this one to be handed into this one; and this one here needs to pass connection details back here. It can handle that flow of information, configuration metadata, between machines. It doesn't do anything within a machine. I mean, there are tools specific to it that can do stuff within a machine, but essentially all it does is provide metadata and deliver the metadata to the machine; then you take over and run with whatever you want to run with. Now, as I said, you can use Chef or Puppet, but we've written a very, very small toolchain that does just enough for what we're doing. We've got templates for this. They're not in StackForge yet; they're in this openstack-ops project, which is going to have refstack and a few other things turning up in it. Those templates describe the metadata Heat needs to know to be able to describe an entire OpenStack configuration. It's not finished; it's work in progress.

Now, the really nice thing is that when something changes in the environment, you need to take actions on your other nodes. If you turn on a new Nova API frontend somewhere, it's going to need credentials to talk to Rabbit, and you probably don't want all of your machines using the same credentials to talk to your queue, because if one machine gets compromised, everyone is compromised at that point. So when you turn on that machine, you want to ask the RabbitMQ machine to create a new user, set it up, and hand the credentials back, and Heat triggers are just a perfect way to do that. You export the presence of the machine into Heat; the RabbitMQ machine is listening for an event and says, oh, there's a new machine that uses me, I'll create users for it; and it exports that back, targeted at the machine that just got added. So you have this information flow back and forth.

Within the images that we build, when something changes, there are several different sorts of changes that can occur. It might be something very trivial, in which case we just restart the services that were affected. It might be something more substantial, something where we need to shut stuff down before the change takes place, a new KVM library version or something. Or it may be really substantial: we have to reboot the machine because there's a new kernel. So we've got a very, very simple model. We say: we're going to do something; shut down anything that's fragile; take the new metadata and write configuration files from it; start up services again; and then run migrations. The restart point is where we'll do a check and say, hey, there's a new kernel, we should reboot the machine. And when the deploy is finished, we notify Heat that it's been done.
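That lifecycle is what os-refresh-config drives for us. Here's a hedged sketch of what a hook might look like; the phase directory names are from memory and the script itself is invented for illustration.

    #!/bin/bash
    # Illustrative post-configure hook, not one from our actual tree.
    # os-refresh-config runs ordered hook scripts in phases roughly like:
    #   pre-configure.d/    quiesce anything fragile before the change
    #   configure.d/        write config files from the new Heat metadata
    #   post-configure.d/   start services again (reboot on a kernel change)
    #   migration.d/        run database migrations once services are back
    set -e
    service nova-api restart

    # Compare the running kernel with the newest installed one; if the image
    # brought a new kernel, reboot rather than just restarting services.
    latest=$(ls /lib/modules | sort -V | tail -n1)
    if [ "$(uname -r)" != "$latest" ]; then
        shutdown -r now "new kernel deployed"
    fi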
At that point Heat can say, hey, that machine is completed, I can move on to the next machine. How many machines you do at once is going to depend on the size of your cloud. If you've got ten compute nodes, you probably want to do one at a time, right? If you've got ten thousand, you might want to do more than one machine at a time.

We're using golden images for this. Now, golden images can mean a lot of different things to different people, so let me define really precisely what I mean. We have a disk image that represents a running version of software. It has no configuration on it, or none of the configuration settings that we'd ever change, and it doesn't have seed databases for MySQL or anything. It's just the software. When we deploy it, we create a separate partition that we're going to store the state in. Then when we redeploy and rebuild that node, we keep those other partitions intact. Now, this is one of the bits of science fiction: we can't do that yet, not until we've got the bare-metal Cinder API stuff in place, but it's very clear to us how it will work when that's done.

Conceptually, the golden images act as packages. They are a thing that describes an intact set of software that works, but they describe it at the machine level rather than at the individual-piece-of-software level. This is important because when you deploy, you don't want to find out about conflicts and dependency issues on the machine you just deployed to. Your unit of deployment needs to be the unit of thing you test with, and because to turn on a data center in the first place you have to deploy machines, our unit of deployment is clearly a machine.

The disk image builder I mentioned before is a very, very small toolchain, just a few shell scripts, but very useful. We split out the core, which knows how to do Ubuntu and Fedora, i386 and amd64 images, from the stuff that's specific to running OpenStack on OpenStack. So diskimage-builder should be useful for people who are deploying Hadoop or other large cluster software into an OpenStack environment, even if they don't use Heat and don't use the rest of the toolchain. It's a focused tool.
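As a sketch of the state separation I described a moment ago, and remember this is the science-fiction part, the device names and paths here are purely illustrative:

    # Golden image on one partition; mutable state kept on another that
    # survives redeploys. Illustrative device names and paths only.
    mkfs.ext4 /dev/sda2                     # done once, at first deploy
    mount /dev/sda2 /mnt/state
    mkdir -p /mnt/state/mysql /mnt/state/glance

    # Bind the persistent areas over the ephemeral root's expected paths.
    mount --bind /mnt/state/mysql  /var/lib/mysql
    mount --bind /mnt/state/glance /var/lib/glance

    # On redeploy, only /dev/sda1 (the golden image) is rewritten;
    # /dev/sda2 is left intact and re-bound exactly as above.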
Sorry, sore throat. So, the way deployments work is that the Heat stack is defined in the cluster; Heat drives the Nova API to deliver images to machines, and its triggers update the environment on each machine.

One thing I skipped over before is that once we have a machine running, if we've got a new image to deploy, we can do that without rebooting the machine, if the software on that machine decides to cooperate. It can observe the Heat metadata and see there's a new image ID; it can pull the image out of Glance and rsync it across the existing root file system. Because none of our state, none of our precious data, is on the root file system, we can use --delete: any cruft that was there, old package versions, gone; temporary user accounts for debugging, gone. We end up with something that is exactly equivalent to having started with a fresh image out of Glance. In developer test environments, use virtual machines; in production, use real hardware.

Now, these things combine to give us what we call the undercloud and the overcloud. Conceptually, you might say: let's have one big cloud; it's got bare metal, and some of the bare-metal nodes are running KVM or Xen compute, and within those nodes we'll run other things, and those services will also be part of the cloud. There are several pragmatic problems with that in the short term. Long term, I'd like to get to that as a capability. In the short term, Nova doesn't particularly like having really different hypervisors running in the same environment. I don't know the exact details; maybe that's false, maybe it actually works just fine, in which case I'll redo the slide tomorrow.

But another pragmatic problem is that when you have something virtualized and the data center gets turned off, say there's an earthquake, or a bomb threat, something you have to evacuate for, or even a catastrophic power failure that turns off your data center and you don't have a choice about it: how do you restart services if everything's virtualized? If, say, all your Glance API endpoints are virtualized, then to bring them up you have to turn on the machines they're running on, and those machines are running with Nova bare metal, so they also need to be bootable. But the metadata that machine would need, to know which VMs to restart and which VMs had actually been migrated off or were stale for some reason, is inaccessible, because the rest of the network isn't running yet.

So by having a very clear separation, where this is the bare-minimum facility to run a cloud and it's the bare-metal cloud, it's just doing bare-metal machines, some of those machines will host actual services but nothing is virtualized, you can be sure that your power-on story is fairly simple. Once we get in place the ability to recover without the rest of the network, to turn on a KVM or Xen hypervisor and have all the VMs that were running on it before come back with their networking configuration and everything else intact, then you can start saying, well, let's mix and match much more freely.

The other thing that's nice about having an undercloud and an overcloud is that your overcloud tenants are not special. They may be privileged, because you trust them to run on bare-metal hardware, but they're not in any way unique.
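Earlier I mentioned refreshing a running machine's root file system from a new Glance image without a reboot. Here's a hedged sketch of one way that can work; the commands and exclude list are my guesses at the mechanics, not our actual code, and it assumes a raw image for simplicity.

    # Sketch: in-place refresh of the root FS from a new golden image.
    glance image-download --file /tmp/new-root.img $NEW_IMAGE_ID

    mkdir -p /mnt/new-root
    mount -o loop,ro /tmp/new-root.img /mnt/new-root

    # State lives on its own partition, so --delete is safe: old packages and
    # debugging cruft on / disappear, persistent data is untouched.
    rsync -aHAX --delete \
          --exclude=/mnt/state --exclude=/proc --exclude=/sys --exclude=/dev \
          /mnt/new-root/ /

    umount /mnt/new-root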
Because of that, you can run several different clouds on top of one undercloud. You can run a test cloud; you can run your production KVM cloud; you can run dev/test environments for people doing high-performance work. It's an API, and APIs are great.

So in the undercloud we want a fully highly-available bare-metal OpenStack. We don't need Swift; we just need Glance, Nova, Keystone, Heat. It's a very small subset of everything. But it's got to be fully available, because how are we going to deploy upgrades to that cloud? We're going to use the API that the cloud hosts to turn off nodes within it and turn them back on again with new versions of a disk image, in the case of a kernel change, rather than the special case I mentioned before. We think we can do this quite reliably with just two machines, both of them running the full stack: Rabbit, MySQL, Nova API, Nova bare-metal compute, fully HA, up and down. Now, that's not how you want to scale out when you're hitting thousands and thousands of machines, but it means the overheads are low enough that someone running a ten-node compute cluster could conceivably put two small machines aside and do this without killing them. You could do it with one, but I don't know how you would ever upgrade that using this methodology, because it would be trying to reboot itself, and... my brain hurts.

And the overcloud, because it's running on bare metal, is where you run your actual high-value workloads; it can be a full KVM- or Xen-based overcloud. Here's an important thing: the Heat that orchestrates the overcloud is running in the undercloud. So you need to be able to talk cross-cloud, and we can expect there'll be some bugs we run into, where folk have assumed that the local cloud configuration they've got is authoritative for all the endpoints they talk to; in actual fact this will be one of the cases where it's not. And because KVM and Xen Nova can boot partition images rather than full disk images, you can use the same disk images for your overcloud and undercloud.

Installation. Installation is where it's fun. It's a special case of normal deployment, and this is where I think the value proposition starts to become really clear. We start with one of those all-in-one disk images I mentioned before; they've got the whole stack, but we start in degraded mode, so there's no high availability. We can build one of these images straight out of diskimage-builder; we don't need an existing OpenStack to get this going. (Sorry, sore throat.) You run that in a virtual machine on your laptop, and you bridge it to the network in your data center. You enroll all the machines, so you teach it the MAC addresses and the physical characteristics of all the machines in the data center. Then you just tell Heat: please scale this out; change me from being one node with everything in one place to two nodes with everything in one place. It will take one of the machines out there and bring it up, run the data migration scripts to copy over the MySQL databases, set up Rabbit in HA mode, and so on, and then you just turn off that VM you had. Everything is now running, in degraded HA mode, on one of the machines in the data center. Do that again: Heat will ask Nova for another bare-metal node, it will boot up, it will drop a disk image on it that is the same thing, the whole vertical stack, and it will re-associate everything back into HA. You're now fully operational. You've got a full bare-metal cloud that can deploy to any of the other machines; you've got Glance, Keystone, and the rest.
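To give a feel for that bootstrap sequence, a hedged sketch: the baremetal-node-create arguments are from memory of the Grizzly-era novaclient extension, and the stack name, template, and parameter are placeholders for our real ones.

    # Enroll a physical machine with the seed VM's Nova bare-metal service:
    # IPMI address/credentials, CPU count, RAM, disk, and provisioning MAC.
    nova baremetal-node-create --pm_address 10.0.0.42 \
         --pm_user admin --pm_password secret \
         undercloud-host 8 16384 500 78:e7:d1:aa:bb:cc

    # Then ask Heat to grow the stack from one node to two; the data
    # migration and HA re-association hang off that update.
    heat stack-update undercloud -f undercloud.json -P "NodeCount=2"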
You've got glance keystone and That I think is probably the simplest installation story. I've heard and this is why I'm talking about because I love it the HA no, so the The HA scale out stuff for my sequel is done the HA order configuration for rabbits not done yet The heat roll heat doesn't yet really know about rolling Deploys so it will if you were to try and upgrade that it would turn them both off at the same time that won't work very well and Nova There's a bunch of little niggly HA things in quantum and Nova be a metal That will give you, you know, you'd have to have manual fix-up scripts at the moment rather than having it just work we Not this is not a forward-looking statement, but Getting all of the stuff or all the polish needed to make it just work reliably is the Havana cycle focus for us So the the grizzly cycle we spent most of the time bringing the bits together Frigging out exactly how things were going to interact and Fixing be a metal to be more production reading and so on so the cycle is is largely polish So that is actually the last slide. I don't know if I talked too fast or whether we're just on time, but questions Okay, so the question is if you've got an overcomputed VM host and you do a deploy to it How do you make sure you don't lose state about the VMs that are running and in particular rollbacks? So I don't think rollbacks are any different to roll forwards really and in the sense that if we have to restart the Orchestration process it will lose whatever state it has that's not been persisted on disk at the time and There's a session on that so I mean to frame things We're interested in making this work by improving the various bits of open stack that need to improve So I would delegate that answer back to Nova and say we need to make sure that Nova Can be shut down on the KVM node leaving the KVM processes running started back up again and any Things that the Nova process itself would have had to do to orchestrate live migrations for instance Should be the next step of that should be picked up and as long as you have the software ready to go So you shut it down and you start it Shouldn't be visible and that seems like a Nova problem to solve that lots of people interested in solving I don't think triple O as a as a program needs to do anything specific there The tricky thing with KVM instance host is the kernel if you want to change the kernel out without rebooting That gets a little bit more tricky. So the code is Wrong button I have a slide with the main I may not have a slide with the main link That's wonderful Robert Okay, so GitHub comm slash triple O slash incubator is the entry point into the The project and that's where we put stuff We don't know where it really belongs yet and it's got to read me there that describes everything and has links out to all The other code now. 
Now, I have the other links in here: diskimage-builder, os-config-applier, and os-refresh-config, the latter two being the things that take the Heat triggers and react appropriately, are in StackForge. So if you just look in StackForge, they're right there, and they're under the normal review process and so on. The incubator itself has basically documentation and a couple of convenience scripts, like "take a disk image and boot it on my local machine without OpenStack, because I want to try booting something that is OpenStack". Those will probably die off eventually, or we'll move them in as helpers into the Nova libraries or whatever. Everything else we've been doing has been done directly in projects like Nova. So Devananda is working on Nova bare metal nearly full-time, because that needs to be really robust and solid for this whole thing to work. Is that sufficient?

Oh, what do we have working right now? So: diskimage-builder is basically done; we're very happy with the way we build these images. os-config-applier, which is just the template thing, is done. os-refresh-config is done as well. Then the enablers, the templates to do OpenStack, are broken into two parts. There's a set of os-config-applier templates, Mustache templates, that describe the /etc files you're going to have on disk. They are basically done for the bootstrap, everything-in-one-node case, but they're not done for the case where you want to separate things out into this targeted node and that targeted node. That will be fairly straightforward to do once we get to the point where it's the next thing to do. The description of an OpenStack cloud in Heat metadata: I think that's pretty much complete. It probably doesn't have Swift yet, and it certainly doesn't have non-core components that people like to drop in, but it's a very simple environment to work in, so it should be straightforward to add those if you're interested.

The thing where I talked about how, on disk, we separate out the metadata, the persistent state, and the config files you change from the disk image itself: that's not done, because we don't have Cinder bare metal in place yet. There's going to be an unconference session with John Griffith and myself, and whoever else is interested, on deciding the approach for that.

Quantum: one of the things we had to do in the last cycle was fix Quantum so that you could say, hey, this machine we're booting is actually going to have this MAC address, because it's hardware, that's the address on the card; you need to deal with that. That sort of stuff is largely done. At the moment, though, you can't do PXE booting with Quantum. There's been work on getting a generic framework into Quantum where you can say: here are some DHCP options to give this particular port. Once that lands, there's a very small patch we need to do to pass the Nova compute host that's going to be doing the image transferral to Quantum, listed as a PXE host for the node as part of the network config. So that will be a very small patch. I think that's pretty much the whole state of play, so within a few weeks we should be able to say: hey look, it's all working. It isn't polished, but it's so close.
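To give a feel for those os-config-applier Mustache templates, here's a minimal hypothetical fragment; the metadata key names are invented for illustration.

    # A Mustache template for an /etc file, rendered from Heat metadata.
    # Key names (mysql.host, rabbit.password, ...) are made up here.
    cat > templates/etc/nova/nova.conf <<'EOF'
    [DEFAULT]
    sql_connection = mysql://nova:{{nova.db_password}}@{{mysql.host}}/nova
    rabbit_host = {{rabbit.host}}
    rabbit_password = {{rabbit.password}}
    EOF

    # os-config-applier walks the template tree and rewrites the rendered
    # files whenever the Heat metadata changes.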
So one of the interesting things that's kind of a refinement of this is scheduling locality, right? I want to make sure I've got a highly available database, and I want both sides of it to be in different racks, different power zones. Which is not quite what you're asking, but it relates to it. So in terms of having pools of hardware and so on, I think OpenStack needs to solve rack awareness in the general sense for scheduling; and once you've got that, then the ability to say "I'm doing database deploys virtualized, or database deploys physical, and make sure they get the right hardware" falls out as a subset of that problem. Does that make sense?

Right. Oh, so: who schedules onto which things? When there are two clouds, you just point your thing at the cloud you want to deploy into; that's the endpoint you use, so it's very straightforward. If you're trying to do things that burst across two clouds, Heat doesn't do bursting yet, but I believe there's a session this week on making it do that, and at that point you'd just have two definitions that say you can burst from here to here. Maybe you say: start with virtual, and if you run out of virtual, go to physical, or the other way around. More interesting to me is making sure you land on the machine that's really targeted at databases: the best hardware config for a high-performance database looks very different from the best config for a virtual-machine host. So I see lots of scheduler work in our future. I mean, there are hints and there are policies I haven't dug into, and it may be just a matter of configuration-file writing, but someone's going to have to take the time to make that really straightforward for people to do.

How fast can you bring up a 50,000-node environment? That's a good question. I don't know the actual answer, because we're not in production with this. If we have to do image deploys... so, the process for bringing things back up, if we fix all the bugs related to this, I would expect to be something like: turn on 10% of the machines; wait for their power utilization to stabilize, just look at a load meter on your data center power bar; then turn on another 10%, then another 10%, then another 10%, and so on until you're done. And you shouldn't have to care about which machines come on in which order. Other than your network switches, and even then you shouldn't really have to care, because all of our software should be able to deal with the fact that in a cloud, hardware fails. A switch will die. You don't want to have to reboot all of the machines hosting VMs attached to that switch just to replace the switch. So we should be able to deal with any component going away and coming back, in a reasonable time frame, gracefully. If you have that, I think it should be something like three to ten times the power-cycle time plus the boot time for your OS, and that should be a constant; it shouldn't matter how big your data center is. It should be just: flick on ten percent, wait three minutes; flick on ten percent, wait three minutes; and then you need to wait a couple of minutes for things to retry and reconnect and go "oh, Rabbit's actually there and it's usable", and you should be done.

That said, I don't think OpenStack is really ready for that. There's a lot of code that I'm sure hasn't been written yet that's needed to enable it. Right now I suspect it would look something like: turn on your control plane first, your two machines, or your fifty.
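A toy sketch of that staged 10% power-on; the node list, the power command, and the settle time are all placeholders for whatever your site uses.

    #!/bin/bash
    # Power the fleet on in 10% batches, letting power draw settle in between.
    # nodes.txt, ipmi-power-on, and the 180s settle time are placeholders.
    mapfile -t NODES < nodes.txt
    batch=$(( (${#NODES[@]} + 9) / 10 ))

    for ((i = 0; i < ${#NODES[@]}; i += batch)); do
        for node in "${NODES[@]:i:batch}"; do
            ipmi-power-on "$node" &    # hypothetical wrapper around ipmitool
        done
        wait
        sleep 180                      # wait for power utilization to stabilize
    done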
For a 50,000-node cloud I'd expect maybe 50 machines to manage it, because you want to be able to do deploys very, very quickly, so you just have capacity there to spread the load out, spread the network traffic out. One of the future things we'd like to do is use BitTorrent to distribute the images when we're doing a live push, so all the machines can just get local traffic and it can flood; Twitter has published a paper about a thing they called Murder. We haven't got on to doing that yet, but, you know, I'm trying to talk as much as possible about what we have going today.

So I think: turn on the control plane, and you'd probably turn that on database first, then Rabbit; then you turn on the Keystone API and your HA proxies for it; then you turn on the services that depend on Keystone, such as the Nova API, the Nova backends, and Glance. Once those are all turned on, you start powering on your compute fleet, 10% at a time, and the compute fleet comes up with the key services it depends on already there, and away you go. But because there are two clouds, there are actually two rounds of that first stage. So it could be quite an extensive process. I don't think it should be, but I'd expect it would be today with any deployment technology, because it's an application problem: the ability to respond properly when things weren't there when you got turned on is a limitation in the applications themselves, Nova and friends, not in Puppet or Chef or anything else.

Sorry, I don't quite follow the question... So, the Nova bare-metal driver can deploy anything that can be written to a partition, a normal partition, and it places no constraints on the operating system being deployed. So if you have an image of Windows, or an image of Xen, or an image that's just regular Linux, any of those images can be deployed by Nova bare metal onto a machine. There shouldn't be any limitations on what you can do. Does that answer it?

Okay, so for the TripleO stuff itself: I would throw 16 GB of RAM at it, and a big enough disk to hold, you know, six months' history of Glance images, because you can keep the old images, and you can roll back, and you can say, hey, throw me up a test cluster with this:
I want to see what was going on six months ago, I think something bad was happening. You get some really nice history out of that.

The reason I suggest 16 GB of memory is that if you imagine you've got Rabbit, MySQL, Nova API, Nova bare-metal compute, Keystone, and the Keystone API, you quickly get to double-digit numbers of discrete images, and when you do a deploy, all of those images have to be read into RAM before they can be copied out through iSCSI. The size of the partition you write on the far end isn't necessarily connected to the size of the qcow2 or whatever image you have, but you're going to take a RAM hit on the bare-metal compute node when you do this, even with the optimizations we've got in place now. There are more we need to do; we're waiting on this Quantum PXE thing I mentioned before before we can do them all. But when they're done, you want enough RAM that the images fit in your working set, so that you're never waiting for disk during a deploy. It will have almost no compute work to do at all; you could throw an Atom processor in there and I think it'll work just fine. And as much disk as you want for Glance history. But again, because you don't want to be hitting disk at all, you shouldn't need to worry about disk performance; it should be just capacity.

Right, so that was the thing I touched on before, which is the power-on-after-a-failure-of-the-entire-data-center story. Pick a service that we might want to virtualize, something like, I don't know... Horizon. Okay, Horizon is something that nothing else depends on, so sure, virtualize it, great. Pick any other service and I think you'll find some things that depend on it. The conductor service for Nova, for example: if that's not there, you can't bring up other Nova nodes. If it's running on top of an overcloud node, that Nova node can't be brought up to bring up the conductor service that the Nova nodes need in order to run. The same goes for the database, and for anything else in the undercloud. I think this is solvable, but it's a bunch of engineering work to allow all of the services to come up without their dependencies and resume operating the way they were when they were shut down.

No. So, with bare metal we can write a boot block to each disk, so if the bare-metal provisioner is not there, the machine just boots up from the boot block. And the Heat metadata that was in place at the time of the power failure is still in place on disk; it's written to a persistent file, it's not in the RAM disk, it stays there across reboots. It might be stale, but once everyone's connected it can get updated. The only tricky bit is that the IP address of the machine needs to be preserved, and actually dhclient writes that out and will try to reuse the previous IP address. Now, I'm not certain we've got the configuration absolutely right to reuse the IP address when DHCP isn't available, but that is an existing feature of dhclient that we can use.

Yeah, so one of the big things is being able to recover from disasters, and I'd like to get to the point where everything is virtualized, turtles all the way down, because that gives you greater hardware density. But, and here's the thing, another approach to that is to just run multiple services that can coexist within one image, coexisting and not virtualized at all. Now, that's obviously not a general solution for any arbitrary bit of code, but for running just enough OpenStack to get OpenStack up,
it works fine, which is why we've taken that approach. It's simple. And, like I say, it's policy versus mechanism. Policy-wise, for the deployments I'm putting together, I need to be able to do power-on reliably, and the things I mentioned before are the reasons that won't work fully virtualized. Will the TripleO Heat stacks that are being built support it if you wanted to roll things out that way? Yeah, absolutely. I don't see any reason why people shouldn't be able to do that if they want to, if they're confident in their power-on, or that they'll never need to do a cold power-on. That's their choice.

Using what, Mesos? Okay. Yep. I don't know. HP is a big company and we've got lots and lots of things happening. I've been focused very much on this deployment story, and for this deployment story we don't have it all working yet, so we haven't really been in a position to branch out into more esoteric questions.

Yeah, so, these nodes are just running straight Red Hat or Ubuntu, whatever your preference is. We are Ubuntu folks, so we're developing on Ubuntu, but we don't place any restrictions on the software you install on top of it, how it works, or what you can do with it.

Check my Twitter: rbtcollins on Twitter, and I'll put the slides up after the talk. Yeah, rbtcollins at hp.com is my email, and the first part is my Twitter handle, so I'll put the slides up there as soon as I get online.

So, probably we should have everything... if we don't have everything working in Havana, then there's going to be a problem, for me anyway. The amount of work needed to get to a minimal functional set is fairly small, so I'm confident that Havana will see this being usable.

I think folk from the next session are starting to percolate in, so you can grab me outside if you've got further questions.