So welcome to this talk, which is titled "Broken Stack: OpenStack Failure Stories." We're here to talk about why OpenStack deployments fail and what you can do to avoid failure, or at least increase your chances of success with OpenStack deployments.

I'll do the super fast, boring introduction. I'm John Kelly. This is Jason Grimm. This is Chris Rivera. We're all solutions architects with Cisco Metapod now, but I used to work at Rackspace, he used to work at Rackspace and Mirantis, and he used to work at Piston. So we've been doing the OpenStack thing for a long time and have witnessed many OpenStack failures firsthand. I used to have hair before I started working on OpenStack.

So what is "broken stack"? I think most of us here, if you've worked in an operational role on OpenStack, have experienced the state of broken stack. We're defining it here as the failed state or result of an attempt to launch an OpenStack-based cloud. Also as an expletive used to describe an OpenStack deployment or service not working as anticipated, typically accompanied by profanity — a lot of profanity. I know my Rackspace friends back there can vouch for this. "Something is wrong with my private cloud."

So you've probably all seen the news articles saying Gartner found that 95% of all private clouds fail, which is not exactly correct — and if there are any Gartner people in the audience, I'm not actually saying Gartner said that. What Gartner did find was that 95% of private cloud deployments have issues: when people are asked, "Are you satisfied with your private cloud?", 95% report issues they're not satisfied with. So look at that statistic. If I told you that in trying to create your own private cloud you have a one-in-twenty chance of actually being satisfied with the result, would you still proceed? Those seem like pretty terrible odds to me, right?

So what we're going to do here is walk through three case studies of companies that have successfully or unsuccessfully deployed private clouds. We went to customers and people we've worked with in the past, we interviewed them, and we asked: What were you trying to do? What didn't work? What could you have done differently that would have allowed you to be successful? We're going to take those case studies, extrapolate some lessons learned, and from those give you some keys to success — the things you need to do to minimize the chances of failure and maximize the chances of success in deploying OpenStack.

For our first case study — these are all anonymous; the names have been changed to protect the innocent. Obviously, it was difficult to find someone who wanted to come forward and say, "Yeah, we had a terrible failed OpenStack deployment, and we want to put our name on that publicly in front of hundreds of people at the OpenStack Summit." So I hope you'll forgive that. This is a financial services company, and the gentleman I interviewed for this actually did OpenStack deployments for two separate financial services companies — one of which was eventually successful, and the other of which failed miserably. In this instance, the thing that brought about the need — or the perceived need — for a private cloud was the need to reduce time to market and time to value. And that's a common driver you'll see for migration from VMware, or Mode 1 IT, to Mode 2 IT, right?
They had issues with really slow release cycles — six months to a year per release. It would take weeks to get firewall rule changes or load balancer rule changes; VM provisioning took days. They had these heavyweight manual approval processes with the security team. I'm sure all of you have experienced this, because these are the pains of traditional IT, and a lot of the reason we have cloud in the first place.

After putting a huge amount of money and effort into this project, the primary issues they ran into can be broken down into three groups. The first was staffing. They attempted to do this with four full-time heads, with the thought that they'd be able to take some VMware admins and cross-train them to help support the platform. What they found was that, realistically, for a 24/7 production workload they would need a bare minimum of eight full-time heads focused entirely on OpenStack. And it wasn't just operations these four were doing. This was earlier in the days of OpenStack, and they wanted bleeding edge — current projects and everything — so they had to manage their own packages and do their own upgrades, which is quite a bit of work. So these four people were doing that, plus managing operations and keeping the cloud running. The other thing they hadn't quite anticipated was the amount of enablement and support they had to provide for end users, because operationally OpenStack is very different from VMware. When you have a user base accustomed to interacting with VMware via service desk tickets or a nice UI, there's end-user education required to move them to Horizon and APIs.

The other big problem they found was that they couldn't just cross-train VMware admins to be OpenStack admins. To be an effective OpenStack admin, you basically need to be a full-stack engineer: admin skills, at least light development skills, the ability to read and troubleshoot Python, plus security, monitoring, and storage knowledge — a broad skill set that is not generally widely present within an organization. So they found they needed to bring in outside talent, and that outside talent expected salaries basically twice what they were paying their existing admins. So they had major staffing issues, which leads us to the second group: inadequate budget. When this project was launched, the budget was allocated between hardware, manpower, and so on, and they bought a lot of hardware thinking they'd get by with four full-time heads. In reality, that was not the case. I'm not sure how many of you have run into a similar circumstance, but we all know operating OpenStack in production is hard. It's not VMware. It's not a black-box solution that just kind of works. It requires a lot more care and feeding, and that care and feeding requires a level of technical expertise significantly greater than what a VMware engineer, or even a VMware architect, requires.

Finally, we get to institutional hurdles: resistance to change. You can build this platform, but there will be groups for whom change equals risk: "We've always done it this way. We've got this process that works for us. Why should we change the way we do things?"
"It's taken us years to get to our three-day manual VM provisioning process — why can't we just keep doing that for the next 20 years?" You also have people who try to force old operational or architectural models: "If this server dies, I want my application to automatically fail over at the infrastructure level without me having to do anything. I don't want to implement monitoring or anything like that. I just want it to happen magically. That's what VMware is for, right?" And on the security side — in my experience, that's been the hardest part — you have companies with a very traditional approach to security. They're used to manual approval processes where you put in a ticket, et cetera, and that completely defeats the purpose of an API-driven cloud. Last but not least, fear of automation, or inadequate staff to implement automation. People saying, "What if the script breaks something?" — because we all know that well-written automation fails more often than people doing things manually, right? Or teams saying, "Yeah, we'd love to automate this, but we're basically all working 60 hours a week right now, we have no bandwidth and no additional headcount. How are we going to automate this stuff?"

In a nutshell, the lessons learned: OpenStack is not a VMware replacement. People expecting a VMware-like experience — VMware-like, hardware-mediated HA — will be sorely disappointed. Cutting edge comes at a cost: if you want to run cutting-edge software, maintaining especially your own packaging, patches, et cetera, staffing is hard and it's expensive. Third, before committing a multimillion-dollar budget to launching a private cloud, build a small cloud and test fit for purpose. Make sure your teams will actually be able to consume it in a useful manner, and that they'll be able to adapt their operational processes in a way that lets them use it meaningfully. Finally — and I think this was the best point the gentleman I interviewed made — consult with those who've already been successful. He said, "I wish I had gone to other people in the industry who we know have had successful private clouds and talked to them, because in retrospect, having since done so, they ran into the exact same problems we did. I think we could have saved a lot of time and a lot of money if we had just done that in the first place."

Mr. Grimm.

All right. So, do I use the clicker? Oh, the arrows. I can deploy a multi-region OpenStack, but the mute button and PowerPoint escape me a lot. Thanks for the big turnout. They say you never want to present between people and their lunch — well, we're between you and the end of the day and beer. So thank you for coming out and listening to us.

Case study number two is a major telecom service provider in the U.S., specifically in the Southeast. Their goals were to reduce cost and increase capacity. They were coming from a legacy VMware and bare-metal environment, with multi-discipline workloads but no easy way to share compute, network, and storage capacity across them. Through this, they wanted to drive higher utilization and higher consolidation.
They also wanted to reduce time to market, much like the first case study. They were typical of waterfall development processes, typical of ITSM and service catalogs; even with basic orchestration and automation, the environment was still a very slow process without a self-service API, web UI, and CLI, and all the automation that comes with self-service. So they wanted to improve time to value — going from taking months to deploy a platform down to maybe days or hours. A pretty typical use case, or desire. They wanted an agile-friendly development environment: they were shifting from traditional waterfall development and deployment processes to more of a CI/CD- and agile-based environment. Again, they wanted self-service UI, CLI, and API for their cloud admins and operators, their developers, and their end users.

The environment was two regions and four availability zones — two for dev/test and two for prod per region — around 100 hypervisors. Pointing back to the first case study, and I don't know if we call this out directly in the lessons learned, but going with a big-bang approach is probably not the best methodology. If you're going to fail, fail fast, fail often, and iterate quickly — and you can do that with a 16- or 32-node environment much more easily than with a 100-node environment. The primary service for this new platform was SIP and VoIP services. Lastly, they went with a vendor-managed OpenStack consumption model; we'll talk about that a little later.

So, everything looked good — it was very nice. They had executive sponsorship and budget. They had a good level of adoption and a good culture fit. They had good dev and platform tools: they were coming from a lot of roll-your-own Perl and Bash and one-off scripts, and they decided on Ansible as a standard for provisioning and configuration management. It was a big environment they put in, but they took a very conservative migration approach, starting with less than 1% of the workload and moving up from there. Lastly, they went with a managed service. I'm not saying DIY is bad, or a distro-based solution is bad, or that a managed service is the panacea and the absolute best way to go, but for your first foray it's not a bad option — because of the staffing issues, because of the investment in time to value and time to market, and because you don't get your staff poached, things like that. I'm putting that in the check column as a good approach.

So what happened? Everything looked good here, so the major issue, again, was a lift-and-shift mentality. Ninety percent of that 95% failure rate is around culture, adoption, leadership, tools, all those things — and everything there was lined up well. The only thing they skipped on was that they did a lift-and-shift of how they ran their workload. They automated it and added orchestration, but in the end it was 128-gig instances, vertically scaled, running a Java application. I won't call out the third-party SIP ISV — and it didn't really matter — but I can say who the managed service provider was: it was us. I can't mention the customer. Cisco and the third-party ISV said, you know, you can run large vertical instances — we do it with Hadoop, we do it with HPC.
I mean, it wasn't that the vertical instance was the problem; it was the way the application was interacting, and it came down to some Java issues. But they decided not to re-architect: "We'll just take what we have and move it over." So while the load was between 1 and 49% — say, for the first 12 to 15 months it was up and running — everything was great: performance was great, consolidation was great, utilization was higher, everything was running fine. At 50% — and I can't remember specifically what the tipping point was, the last fat kid in the pool, but it was something like 4,000 active sessions per hypervisor — all of a sudden response times degraded something like 20x, which caused jitter and call drops. It essentially tanked the entire platform. And they're in production with this, so now it's a crisis, now it's an issue. Excuse me.

It took thousands of man-hours, literally — between host operating system support, the ISV, the operating system within the instance, the VoIP application, the managed service provider, the hardware OEM, all the on-site engineers — literally six weeks of all of us digging into this. What it turned out to be was a Java garbage collection process that was causing all the latency, and it came down to NUMA alignment. But it took weeks to get down to that, because you've got literally hundreds or even thousands of attributes to work through when debugging an issue like this. The other challenge was that they weren't on Kilo or later yet — I'll keep going, I'll speed up — so they didn't have the NUMA scheduler. They were on Juno; they're on Liberty now. The short-term solution was a very low-tech thing, which was to pull one of the processors out of the box and force NUMA alignment. We were trying all these more sophisticated things, but anyway.
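[Editor's note: for context on the longer-term fix — on a NUMA-aware release (Kilo and later), guest NUMA topology can be expressed through Nova flavor extra specs instead of physically pulling CPUs. The sketch below is illustrative only, not the remediation from this case study; the flavor name and sizes are hypothetical, and hw:cpu_policy=dedicated additionally requires matching host configuration.]

```bash
# Hypothetical flavor for a large, latency-sensitive Java workload
# (128 GB of RAM, as in the case study).
openstack flavor create --vcpus 16 --ram 131072 --disk 100 voip.xlarge

# hw:numa_nodes asks Nova to give the guest a fixed NUMA topology and
# place each virtual node on a host node, keeping memory local to the
# cores that access it; hw:cpu_policy=dedicated pins vCPUs to host cores.
openstack flavor set voip.xlarge \
  --property hw:numa_nodes=2 \
  --property hw:cpu_policy=dedicated
```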
So the lessons learned here: again, OpenStack is not VMware — and neither is it for the organizations that say, "I want a better, cheaper, faster VMware." Yes, it's virtualization; yes, you have virtual networks and virtual IPs and things like that. But take the time to do things right: re-architect your application so that it's also a good fit. In the end, it would have been multiple times easier, cheaper, and faster to take the time to re-architect the application, versus getting 15 months down the road and literally taking down the whole platform. Now, this isn't technically a broken stack — I'll call it a near-broken stack, a black eye — because C-level management was ready to scrap the entire project. They had set a strategic direction, spent millions of dollars and thousands of hours to move to this, everything was fine — and they very nearly pulled the OpenStack initiative altogether.

Chris.

Thanks, Jason. This last case study focuses on a very large media company that's had numerous acquisitions through the years; we were mostly speaking with the CIO and a vice president of data centers. What was happening was that all the product leads from these acquired companies — the DevOps leads — were demanding a more agile environment. They wanted more speed, and, very importantly, they wanted root access, which typically wasn't provided to the end users spinning up virtual machines and doing development. So they evaluated different solutions and settled on OpenStack. Sounds great. And they actually took what I thought was a great approach: they were going to do it themselves.

What I mean by that is they took a very conservative approach. They weren't installing the latest release with all the bells and whistles. They planned for the specific features they needed — the storage tiers, the networking — all well thought out in advance, and they were considering the upgradeability of the cloud. So things look pretty good so far. But because the company was so large, a lot of the issues were — I think it's analogous to states' rights versus federal government — a bunch of different teams almost competing with corporate IT. All these little teams wanted to do things their own way, and corporate IT was pushing for something else. What we actually found when they rolled out this private cloud was that the dev groups didn't fully grasp the implications of having raw access to the infrastructure. What do I mean by that? They didn't realize that once they got raw access with root, they now had to worry about security. And there were organizational questions: who would be responsible if there were some sort of security compromise? Corporate IT, or the individual people running virtual machines with root access? Who would take care of patching? Who would take care of third-party integration with repositories, et cetera?

So the biggest gap in this case was that once this functionality was provided to the end users, they weren't quite ready to embrace it. They had to worry about security, patching, configuration, et cetera. Looking further into that, what we found was that corporate IT probably could have done a better job marketing those services and outlining the burden their team currently carries. They could have gotten together with all the teams, consolidated the feedback, and said, "Look, we're going to provide root access. Here are the implications. These are the tools we use for patching and security. We can help your team get those up and running, but your team needs to be prepared to take care of this themselves." One of the other things they said was that a quick-start guide would have been immensely helpful, because you're taking people who are used to working in the existing infrastructure — they're demanding cloud and agility, but they weren't quite ready to ramp up and start using it right away. So we talked about having some sort of quick-start guide: here are the common workflows people are going to run, and here's how you execute them on the new platform.

Okay, I talked too long on the last one, so we're going to speed up here. On lessons learned, as we get through the second half of this, we're going to break things out into the five leading causes of broken stacks — of OpenStack private cloud failures. In our minds and our experience, it comes down to: staffing issues or challenges, operational hurdles, time to value, choosing the right consumption model, and failure to adapt operational models.

So, staffing challenges: the cost and availability of talent, the time to value in finding staff, acquiring them, or training them internally, and pay disparity versus existing staff.
This is a huge issue with, I would say, 100% of the customers I've worked with in the last five years — even small, born-in-the-cloud shops repatriating a workload to a net-new, greenfield OpenStack cloud. The usual steps of doing this — probably the wrong way — are: step one, you acquire the talent. It's hard to find, and it's expensive to hire. If you look at it like a supply-and-demand chart, OpenStack is not commoditized, and staffing for OpenStack is even less commoditized. (The slides are advancing by themselves — I think they're clicking me through, making me go fast.) Very high demand, very low supply; ergo, very high cost, and a lot of movement of engineers between companies.

But you try to mitigate. You address pay disparity in the staff. Once you get the talent — there it goes by itself again — once you get the talent, retaining the talent to prevent poaching is difficult to do. You've got to continue to make it not just fiscally interesting for them, but give them good workloads and good challenges so they're interested in staying around. It's not all about the money all the time; it's about working on a good project. But typically you lose the talent and have to return to step one.

The second part of this — again, we won't mention company names, but this really happens, and I actually couldn't fit all the people on the slide. There were about 18 people over a 90-day period going from a major telecom to a major retailer, and that was around 18 of their 20-ish top OpenStack talent. Those of us in the community talk about this stuff, and everyone knows somebody when we talk about it, but we could not find any story published on it — this is based on a LinkedIn analysis. (Whoops.) Another batch went from an OpenStack provider to a major networking company. If you've built a product or service — whatever you're delivering, Ferraris or Tupperware or anything — and someone comes and takes 90% of your talent in a 90- to 100-day period, that's totally disruptive. More disruptive than losing hardware, more disruptive than losing a site, because those are the people doing the care and feeding of what you're working on, and they're ultimately responsible for the success or failure of your cloud.

Okay, so I'll cover the next two causes of failure. Failure to adapt operational models: this, according to Gartner, is the most common cause of failure in private cloud deployments — 31% is the number they quote for deployments where this was the primary factor. And when we say VMware is not OpenStack, this is really what we're talking about: traditional methodologies, waterfall versus agile, pets versus cattle. One of the big issues we see time and time again is: if your application relies on the infrastructure for HA and you just lift and shift into OpenStack — don't spoil my great next slide — yeah, anyway, you're gonna have a bad time. If you haven't accounted for refactoring your application and building it in a more cloudy manner, you're gonna have a bad time. If you're expecting cheap VMware out of OpenStack, you're definitely gonna have a bad time. And I can't tell you the number of people who've come into it with that expectation.
"It's like open-source VMware, for free!" No — it is nothing like that whatsoever. I had a great experience with a customer where I had to explain this. They were under the impression — they said, "Well, it's not like AWS, where if one of their physical servers goes down, all the instances on it go down." And I said, yes, that is exactly what it's like. And they were completely confused: "Doesn't that mean everyone running on AWS would constantly be having outages?" You need to take about a hundred steps back and explain exactly what cloud is and why you should be thinking about it this way.

Time to value is another really big one. You can succeed and still fail if it takes you too long — or if, after standing up your cloud, it takes you too long to patch, to upgrade, to stay up to date with OpenStack. If you're still running Essex today — and I know there are people out there still running Essex today — that's probably not success as any of us would define it. There are a number of factors here. Technical difficulties: is your app actually suited to running in the cloud? How do you manage VMs? VM lifecycle management in the cloud is very different from VMware, et cetera. Political difficulties: who is the sponsor for this cloud project? Is it someone who has a vested interest in preventing change and maintaining the status quo, because change is risk? Or is it someone trying to make a name for themselves — "Hey, I stood up this giant private cloud within my company" — not really concerned with the long-term success of the project, just looking for a feather in their cap so they can jump to a higher-level position somewhere else? In addition, are people actively obstructing the project internally? These are things we run into as solutions architects on a daily basis. This is just ubiquitous, and I'm sure many of you have encountered it as well.

Security is another big thing. Traditional security teams do not like cloud. They don't like the concept of self-service, of people using APIs to provision VMs. Developers having root on a VM? Whoa. A lot of times, even if you can get a cloud stood up, the corporate security team says, "No, no, no — before you can provision a VM, you need to put in a ServiceNow ticket, and it needs to go to these seven people, and a month later you'll actually be provisioned." That can kill a cloud more than anything else. And finally, legal — especially applicable if you're looking at a hosted or managed service solution. If the legal team's not on board and able to sign off on that, not really understanding open source or anything like that, those are common issues that can cause major problems.

Chris?

All right, so a little more on the consumption models. I like this slide because it does a great job comparing the different options. Let's say you're sold on OpenStack; there are a few different options. First and foremost, we've been talking about do-it-yourself OpenStack, and this is great for some people: you evaluate your needs, and if you determine that you need cutting edge — access to the latest projects — then that's fine.
You can roll OpenStack yourselves, but you need to be prepared to maintain the OpenStack skill set and the expertise it requires, handle the operational complexity, and accept that support is pretty much all handled by the team doing the deployment — and your SLAs along with it. On the other side, you have fully managed OpenStack, which is a bit different. A lot of the time those aren't the absolute latest release — you won't have the latest cutting-edge features — but it tends to be more robust and highly available, so the SLA can be as high as four nines or so, depending on the vendor. You're also often going to get full-stack support, and it won't require all the OpenStack skill sets in-house: the idea is that you just want the capabilities of cloud, you have someone else make sure it's always up and running, and you focus on the business running on top of that stack. And in between, you have the various OpenStack distros.

Coming from Piston Cloud Computing, as we mentioned earlier — we had a number of instances where we'd partner with an SDN vendor, we worked across numerous hardware platforms, and sometimes that could get really difficult when there was an issue. How do you troubleshoot the root cause? The customer would open a ticket with us, because we were the OpenStack guys. We'd start digging in and find it was some network misconfiguration — it could be the SDN. You open a ticket with the other vendor, they look into it, and maybe they determine it's a misconfiguration on the switch, which kicks the ticket back to the customer in the end. So the customer gets frustrated, because they're coordinating all these different parties. That's just something else to consider among the operational hurdles.

And as we've been saying about VMware: everyone's very familiar with it, it's understood within the organization, it's already widely deployed and heavily used, et cetera. So let's walk through a quick example — because, again, OpenStack is not VMware. Let's look at launching an instance. If you're launching an instance in VMware, a lot of the time it's: select the image, choose the storage type, select the network, and click start — or "next, next, next, paycheck," as I think someone was telling me earlier. And it's designed in such a way that you're limited to the specific options; it's kind of fail-proof in terms of launching a VM. How would you do the same thing in OpenStack? Well: select the flavor, select a boot source, pick the key pair, choose a security group, pick a network, and click launch. Right? Easy enough. You do all that, and — great, I launched my first OpenStack instance! But as I'm sure many people here can relate to, you probably saw something more like this: you tried to launch an instance and you got this error. Maybe not that one — maybe you got another one, where the image you were trying to launch was too big for the flavor, or vice versa. Or maybe this one — this one's particularly useful: the instance is just in "no state." And you're left feeling like this. I think this is quite interesting, because it took me a long time just to understand how all this stuff worked.
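[Editor's note: to make that click path concrete, here is roughly the same launch from the unified OpenStack CLI. This is a generic sketch rather than anything from the talk's slides; the flavor, image, network, and key names are all hypothetical and would need to exist in your cloud.]

```bash
# Register a key pair from an existing public key (name is hypothetical).
openstack keypair create --public-key ~/.ssh/id_rsa.pub demo-key

# Launch: flavor, boot source (image), key pair, security group, network.
# Every one of these must already exist and be compatible, or the request
# fails -- often with the kind of opaque error the slide showed.
openstack server create \
  --flavor m1.small \
  --image ubuntu-16.04 \
  --key-name demo-key \
  --security-group default \
  --network private \
  demo-instance

# Watch it go from BUILD to ACTIVE (or ERROR).
openstack server show demo-instance -c status
```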
Once you've been working with it for a certain period of time, it gets a little more comfortable. So now I can get in front of OpenStack and I know: okay, I have to find or create an OpenStack-friendly image. What does that mean? If it's a Linux-based instance, that can be quite easy — just download one. If it's Windows, you might have to create your own. Then, once you have that image, you need to upload it into your environment. Well, a lot of times images are rather large; you can't just click "Upload Image" — you have to do it via the CLI. Which means you need the OpenStack CLI, and depending on what cloud you're connecting to, it may have SSL endpoints, it could be a different version of OpenStack, you may need different versions of the OpenStack CLI tools, and you need to resolve those dependencies just to get them installed. Then you need to consider the image format: depending on the type of cloud you're running on, you may need to convert that image — from qcow2 to raw, for example. So now you have to convert the image. Then, once you go to launch it: generate the key pair, ensure the flavor specs match the size of the image. The networks: if you're using Neutron, you can't put a VM directly on the public network — it'll fail to launch — so you have to put it on a private network. Check the security groups: whatever ports your services will be running on, you have to make sure those are open in the security groups. And if you want to actually access that instance — which you probably do — you may have to assign a floating IP address to it. And then there are a few other options: what do you want to do for disk partitions, customization scripts, et cetera?

I think this is just an example — a lot of us sitting in this room take it for granted; it's pretty easy for us — but you have to remember, if you're coming from the VMware mindset, you're used to click, click, click. You're not going to be able to do that exact same thing, so there is a ramp-up in learning this new environment.
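[Editor's note: strung together, that image-preparation path looks something like the following sketch. Again, this is illustrative rather than from the slides: the file names, image name, and external network name are hypothetical, and whether you need the raw conversion depends on your storage backend — Ceph, for instance, generally prefers raw.]

```bash
# Convert the downloaded cloud image if your backend wants raw.
qemu-img convert -f qcow2 -O raw xenial-server-cloudimg-amd64.qcow2 xenial.raw

# Upload via the CLI -- large images often can't go through Horizon.
openstack image create --disk-format raw --container-format bare \
  --file xenial.raw ubuntu-16.04

# Open the ports your service needs (SSH here) in the security group,
# then attach a floating IP so you can actually reach the instance.
openstack security group rule create --protocol tcp --dst-port 22 default
openstack floating ip create public
# Use the address the previous command actually returned:
openstack server add floating ip demo-instance 203.0.113.10
```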
So when we start talking about ensuring success — what can you do to ensure success? These are some of the best practices we've gathered, and we'll go into a little more detail on each. A lot of it is really planning for the future. What we mean by that is: when you decide on a cloud solution, you want to look at the different options up front as much as possible. Do you want to be responsible for managing it yourself? Do you want to rely on another vendor to handle all that for you? You need to really consider your storage options, your networking requirements, your tiered storage options, because those aren't things you can easily add on once you've built the cloud. Also plan for upgrades, expansion, success: if it's a wildly successful cloud and everyone starts consuming it, and you're doing it yourself — do you have enough power in your data center? Can you procure hardware fast enough as it grows? Can you incorporate those new hypervisors into the cloud? Is finance on board? Is security okay with a self-service model that everyone's accessing? And also understand the scope of operational change. As we mentioned, OpenStack is a much better fit for your more agile Mode 2 applications than your Mode 1 applications. Is your application resilient enough to sustain hardware failures? Go through those tick boxes, and make sure the other groups within the company are already on board — as I was just mentioning: security, finance, the IT operations team, et cetera.

Got it. I realize we're running short on time, so I'm going to speak very quickly — those of you who know me will know that this is not a problem. So: OpenStack is not VMware. (Did we already...? We're good. Yes.) What does this mean — what actions can you take to keep this issue from causing you problems? Set expectations with ops and with leadership. Have the operations people let the devs know you're not going to have hardware-mediated HA: if a physical host goes down, you may lose all the data on any attached ephemeral storage. You need to plan for that and build your application accordingly. Let leadership know this is not cheap VMware — this is a transformational project that is going to change the way your IT operations function, for the better. Also understand that trying to force these legacy models onto your cloud means failure, guaranteed.

Plan for adequate staffing. Hire experts. Don't try to cross-train or re-task existing staff — unless you have some really awesome full-stack engineers, in which case that may work out. But your general mid-level sysadmin or VMware admin is not going to be able to hack it. Production equals eight full-time heads, at a minimum.

Ensure executive sponsorship. You need to be able to express the value of OpenStack — of private cloud — in terms of increased revenue, reduced cost, and reduced risk. That's what leadership cares about, so explain it to them in those terms. And ensure there's a firm commitment from leadership before starting the project, both financially and in terms of executive sponsorship.

Finally, learn from the successes and failures of others. Participate in the community. Come to OpenStack Summits. Participate in local meetups. Talk to other companies; use your network — use LinkedIn if you need to find people. Talk to people who've completed similar projects, preferably successfully, and learn what problems they encountered and how they resolved them.

To recap real quick: we walked you through a few case studies, we talked about the lessons those people learned, and we talked about how you can learn from their mistakes — what actions you can take to prevent those problems from occurring in the future. We went through the five leading causes of OpenStack failure: staffing, operational hurdles, time to value, consumption model, and failure to adapt operational models. Finally, we went through the best practices: plan for the future, understand the scope of operational change, understand that OpenStack is not VMware, plan for adequate staffing, ensure executive sponsorship, and learn from the successes and failures of others.

There are a couple of related talks here, if you're interested: I'm doing one on OpenStack consumption models tomorrow evening, and they're doing one on Metacloud, which is our managed OpenStack offering at Cisco. If we have time, I'll open it up for questions. Our contact info is up there; if you want to email any of us, feel free. Any questions from the audience? I'm sorry? Yes — take it with a pinch of salt.
[Audience] Your recommendation about staffing — when you say don't retrain, get new people: right there I see a conflict of interest with the existing team, bringing in OpenStack to replace VMware, if I know that that's going to be...

That's a very good point — we encounter a lot of resistance from people because they're afraid it's going to take their jobs. I don't think you necessarily need to eliminate all your IT staff, but you can't train an OpenStack expert from zero. You need other OpenStack experts to train OpenStack experts. I was one of the founding members of Rackspace's private cloud engineering team. We started with five people, and we trained — he's back there, if you know him; he's walking out now. You jerk! — we trained someone who started off as a Windows developer and got trained up as an OpenStack admin. But you can't really do that on your own. You need other experts to help you get to that point.

[Audience] On the consumption model as well, from practical experience: what I have seen is that as a first step you have to draw a parallel and say, "What you do today, this is how you can do it here." And then go on to say, "Okay, this is how you're used to consuming it, but here's another way of doing it." Telling them that what they've been doing is wrong and this is the new model of consumption is, again, met with a lot of resistance.

Yeah, and I think we'll cover that in a lot more detail in my talk tomorrow, where we examine DIY versus distro versus managed offerings — why you would choose one over the other, and the pros and cons of each.

[Audience] So I actually participated in one of the failures — not one of these, another very big one that you probably know about. And one of the big reasons was the immaturity of OpenStack in terms of fitness for purpose for the product they were planning to build with it. I assume that's still true.

That's good feedback. Yeah, I mean, it's different from what most people are used to using, right? If you're coming from AWS, no big deal; but if you're coming from VMware — fair enough. Yes?

[Audience] I wanted to mention that one of the messages you showed — "the root is too small" — can be avoided by a competent admin setting a proper size on it.

Yeah, absolutely — all of those things. The point is more that if you've got a staff of VMware admins trying to transition to OpenStack, there's a learning curve. That's a problem everywhere in IT.

Are they cutting us off? Okay, are we done? All right, thank you, guys.

[APPLAUSE]